{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,17]],"date-time":"2025-11-17T03:03:28Z","timestamp":1763348608890,"version":"3.37.3"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"11","license":[{"start":{"date-parts":[[2024,1,16]],"date-time":"2024-01-16T00:00:00Z","timestamp":1705363200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,1,16]],"date-time":"2024-01-16T00:00:00Z","timestamp":1705363200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001636","name":"University College Cork","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001636","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2024,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Real-world sound signals exhibit various aspects of grouping and profiling behaviors, such as being recorded from identical sources, having similar environmental settings, or encountering related background noises. In this work, we propose novel neural profiling networks (NeuProNet) capable of learning and extracting high-level unique profile representations from sounds. An end-to-end framework is developed so that any backbone architectures can be plugged in and trained, achieving better performance in any downstream sound classification tasks. We introduce an in-batch profile grouping mechanism based on profile awareness and attention pooling to produce reliable and robust features with contrastive learning. 
Furthermore, extensive experiments are conducted on multiple benchmark datasets and tasks to show that neural computing models under the guidance of our framework achieve significant performance gains across all evaluation tasks. In particular, the integration of NeuProNet surpasses recent state-of-the-art (SoTA) approaches on UrbanSound8K and VocalSound datasets with statistically significant improvements in benchmarking metrics, up to 5.92% in accuracy compared to the previous SoTA method and up to 20.19% compared to baselines. Our work provides a strong foundation for utilizing neural profiling for machine learning tasks.<\/jats:p>","DOI":"10.1007\/s00521-023-09361-8","type":"journal-article","created":{"date-parts":[[2024,1,16]],"date-time":"2024-01-16T11:02:59Z","timestamp":1705402979000},"page":"5873-5887","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["NeuProNet: neural profiling networks for sound classification"],"prefix":"10.1007","volume":"36","author":[{"given":"Khanh-Tung","family":"Tran","sequence":"first","affiliation":[]},{"given":"Xuan-Son","family":"Vu","sequence":"additional","affiliation":[]},{"given":"Khuong","family":"Nguyen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2541-3269","authenticated-orcid":false,"given":"Hoang D.","family":"Nguyen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,1,16]]},"reference":[{"issue":"4","key":"9361_CR1","doi-asserted-by":"publisher","first-page":"913","DOI":"10.1007\/s00521-019-04166-0","volume":"32","author":"D Herremans","year":"2019","unstructured":"Herremans D, Chuan CH (2019) The emergence of deep learning: new opportunities for music and audio technologies. 
Neural Comput Appl 32(4):913\u2013914","journal-title":"Neural Comput Appl"},{"issue":"22","key":"9361_CR2","doi-asserted-by":"publisher","first-page":"19485","DOI":"10.1007\/s00521-022-07375-2","volume":"34","author":"G Coelho","year":"2022","unstructured":"Coelho G, Matos LM, Pereira PJ, Ferreira A, Pilastri A, Cortez P (2022) Deep autoencoders for acoustic anomaly detection: experiments with working machine and in-vehicle audio. Neural Comput Appl 34(22):19485\u201319499","journal-title":"Neural Comput Appl"},{"issue":"31","key":"9361_CR3","doi-asserted-by":"publisher","first-page":"22935","DOI":"10.1007\/s00521-022-06913-2","volume":"35","author":"A Sharma","year":"2022","unstructured":"Sharma A, Sharma K, Kumar A (2022) Real-time emotional health detection using fine-tuned transfer networks with multimodal fusion. Neural Comput Appl 35(31):22935\u201322948","journal-title":"Neural Comput Appl"},{"key":"9361_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.imu.2020.100378","volume":"20","author":"A Imran","year":"2020","unstructured":"Imran A, Posokhova I, Qureshi HN, Masood U, Riaz MS, Ali K et al (2020) AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app. Inform Med Unlocked 20:100378","journal-title":"Inform Med Unlocked"},{"issue":"10","key":"9361_CR5","first-page":"586","volume":"01","author":"J Earis","year":"2000","unstructured":"Earis J, Cheetham B (2000) Current methods used for computerized respiratory sound analysis. Eur Respir Rev 01(10):586\u2013590","journal-title":"Eur Respir Rev"},{"key":"9361_CR6","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1007\/978-981-10-7419-6_6","volume-title":"Precision medicine powered by pHealth and connected health","author":"BM Rocha","year":"2018","unstructured":"Rocha BM, Filos D, Mendes L, Vogiatzis I, Perantoni E, Kaimakamis E et al (2018) A respiratory sound database for the development of automated classification. 
In: Maglaveras N, Chouvarda I, de Carvalho P (eds) Precision medicine powered by pHealth and connected health. Springer Singapore, Singapore, pp 33\u201337"},{"key":"9361_CR7","doi-asserted-by":"crossref","unstructured":"Bukhsh Z (2022) Contrastive sensor transformer for predictive maintenance of industrial assets. In: ICASSP 2022\u20142022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 3558\u20133562","DOI":"10.1109\/ICASSP43922.2022.9746728"},{"key":"9361_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.ecolind.2022.108986","volume":"140","author":"B Williams","year":"2022","unstructured":"Williams B, Lamont TAC, Chapuis L, Harding HR, May EB, Prasetya ME et al (2022) Enhancing automated analysis of marine soundscapes using ecoacoustic indices and machine learning. Ecol Ind 140:108986","journal-title":"Ecol Ind"},{"issue":"5","key":"9361_CR9","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1016\/j.cities.2005.05.003","volume":"22","author":"M Raimbault","year":"2005","unstructured":"Raimbault M, Dubois D (2005) Urban soundscapes: experiences and knowledge. Cities 22(5):339\u2013350","journal-title":"Cities"},{"key":"9361_CR10","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1109\/TAFFC.2020.3032373","volume":"14","author":"R Panda","year":"2020","unstructured":"Panda R, Malheiro RM, Paiva RP (2020) Audio features for music emotion recognition: a survey. IEEE Trans Affect Comput 14:68\u201388","journal-title":"IEEE Trans Affect Comput"},{"issue":"3","key":"9361_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3322240","volume":"52","author":"S Chandrakala","year":"2019","unstructured":"Chandrakala S, Jayalakshmi SL (2019) Environmental audio scene and sound event recognition for autonomous surveillance: a survey and comparative studies. 
ACM Comput Surv 52(3):1\u201334","journal-title":"ACM Comput Surv"},{"key":"9361_CR12","doi-asserted-by":"crossref","unstructured":"Gong Y, Yu J, Glass J (2022) Vocalsound: a dataset for improving human vocal sounds recognition. In: ICASSP 2022\u20142022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 151\u2013155","DOI":"10.1109\/ICASSP43922.2022.9746828"},{"key":"9361_CR13","doi-asserted-by":"crossref","unstructured":"Gairola S, Tom F, Kwatra N, Jain M (2021) Respirenet: a deep neural network for accurately detecting abnormal lung sounds in limited data setting. In: 2021 43rd annual international conference of the IEEE engineering in medicine & biology society (EMBC). IEEE, pp 527\u2013530","DOI":"10.1109\/EMBC46164.2021.9630091"},{"issue":"1","key":"9361_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41746-021-00553-x","volume":"5","author":"J Han","year":"2022","unstructured":"Han J, Xia T, Spathis D, Bondareva E, Brown C, Chauhan J et al (2022) Sounds of COVID-19: exploring realistic performance of audio-based digital testing. NPJ Digit Med 5(1):1\u20139","journal-title":"NPJ Digit Med"},{"issue":"3","key":"9361_CR15","first-page":"535","volume":"14","author":"J Acharya","year":"2020","unstructured":"Acharya J, Basu A (2020) Deep neural network for respiratory sound classification in wearable devices enabled by patient specific model tuning. IEEE Trans Biomed Circuits Syst 14(3):535\u2013544","journal-title":"IEEE Trans Biomed Circuits Syst"},{"key":"9361_CR16","doi-asserted-by":"crossref","unstructured":"Kathan A, Amiriparian S, Christ L, Triantafyllopoulos A, M\u00fcller N, K\u00f6nig A, et\u00a0al (2022) A personalised approach to audiovisual humour recognition and its individual-level fairness. In: Proceedings of the 3rd international on multimodal sentiment analysis workshop and challenge. MuSe\u2019 22. 
Association for Computing Machinery, New York, NY, USA, pp 29\u201336","DOI":"10.1145\/3551876.3554800"},{"key":"9361_CR17","doi-asserted-by":"publisher","DOI":"10.3389\/fdgth.2022.964582","volume":"4","author":"A Kathan","year":"2022","unstructured":"Kathan A, Harrer M, K\u00fcster L, Triantafyllopoulos A, He X, Milling M et al (2022) Personalised depression forecasting using mobile sensor data and ecological momentary assessment. Front Digit Health 4:964582. https:\/\/doi.org\/10.3389\/fdgth.2022.964582","journal-title":"Front Digit Health"},{"issue":"6","key":"9361_CR18","doi-asserted-by":"publisher","first-page":"1593","DOI":"10.1007\/s00521-019-04182-0","volume":"32","author":"P Wei","year":"2019","unstructured":"Wei P, He F, Li L, Li J (2019) Research on sound classification based on SVM. Neural Comput Appl 32(6):1593\u20131607","journal-title":"Neural Comput Appl"},{"key":"9361_CR19","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1016\/j.patrec.2022.07.012","volume":"161","author":"S Verbitskiy","year":"2022","unstructured":"Verbitskiy S, Berikov V, Vyshegorodtsev V (2022) ERANNs: efficient residual audio neural networks for audio pattern recognition. Pattern Recogn Lett 161:38\u201344","journal-title":"Pattern Recogn Lett"},{"key":"9361_CR20","doi-asserted-by":"crossref","unstructured":"Pham L, Ngo D, Tran K, Hoang T, Schindler A, McLoughlin I (2022) An ensemble of deep learning frameworks for predicting respiratory anomalies. In: 2022 44th annual international conference of the IEEE engineering in medicine & biology society (EMBC), pp 4595\u20134598","DOI":"10.1109\/EMBC48229.2022.9871440"},{"issue":"9","key":"9361_CR21","doi-asserted-by":"publisher","first-page":"2872","DOI":"10.1109\/TBME.2022.3156293","volume":"69","author":"T Nguyen","year":"2022","unstructured":"Nguyen T, Pernkopf F (2022) Lung sound classification using co-tuning and stochastic normalization. 
IEEE Trans Biomed Eng 69(9):2872\u20132882","journal-title":"IEEE Trans Biomed Eng"},{"key":"9361_CR22","doi-asserted-by":"crossref","unstructured":"Li J, Dai W, Metze F, Qu S, Das S (2017) A comparison of deep learning methods for environmental sound detection. In: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 126\u2013130","DOI":"10.1109\/ICASSP.2017.7952131"},{"key":"9361_CR23","doi-asserted-by":"crossref","unstructured":"Gong Y, Chung YA, Glass J (2021) AST: audio spectrogram transformer. In: Proceedings of Interspeech 2021, pp 571\u2013575","DOI":"10.21437\/Interspeech.2021-698"},{"key":"9361_CR24","doi-asserted-by":"crossref","unstructured":"Chen K, Du X, Zhu B, Ma Z, Berg-Kirkpatrick T, Dubnov S (2022) HTS-AT: a hierarchical token-semantic audio transformer for sound classification and detection. In: ICASSP 2022\u20142022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 646\u2013650","DOI":"10.1109\/ICASSP43922.2022.9746312"},{"key":"9361_CR25","doi-asserted-by":"publisher","first-page":"3292","DOI":"10.1109\/TASLP.2021.3120633","volume":"29","author":"Y Gong","year":"2021","unstructured":"Gong Y, Chung YA, Glass J (2021) PSLA: improving audio tagging with pretraining, sampling, labeling, and aggregation. IEEE\/ACM Trans Audio Speech Lang Process 29:3292\u20133306","journal-title":"IEEE\/ACM Trans Audio Speech Lang Process"},{"key":"9361_CR26","doi-asserted-by":"crossref","unstructured":"Wang Z, Wang Z (2022) A domain transfer based data augmentation method for automated respiratory classification. In: ICASSP 2022\u20142022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 9017\u20139021","DOI":"10.1109\/ICASSP43922.2022.9746941"},{"key":"9361_CR27","doi-asserted-by":"crossref","unstructured":"Zhou Y, Dou Z, Zhu Y, Rong Wen J (2021) PSSL: self-supervised learning for personalized search with contrastive sampling. 
In: Proceedings of the 30th ACM international conference on information & knowledge management, pp 2749\u20132758","DOI":"10.1145\/3459637.3482379"},{"issue":"4","key":"9361_CR28","first-page":"33","volume":"33","author":"JC Weiss","year":"2012","unstructured":"Weiss JC, Natarajan S, Peissig PL, McCarty CA, Page D (2012) Machine learning for personalized medicine: predicting primary myocardial infarction from electronic health records. AI Mag 33(4):33","journal-title":"AI Mag"},{"key":"9361_CR29","doi-asserted-by":"crossref","unstructured":"Triantafyllopoulos A, Liu S, Schuller BW (2021) Deep speaker conditioning for speech emotion recognition. In: 2021 IEEE international conference on multimedia and expo (ICME), pp 1\u20136","DOI":"10.1109\/ICME51207.2021.9428217"},{"key":"9361_CR30","doi-asserted-by":"crossref","unstructured":"Eskimez SE, Yoshioka T, Wang H, Wang X, Chen Z, Huang X (2022) Personalized speech enhancement: new models and comprehensive evaluation. In: 2022 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 356\u2013360","DOI":"10.1109\/ICASSP43922.2022.9746962"},{"key":"9361_CR31","doi-asserted-by":"crossref","unstructured":"Sivaraman A, Kim S, Kim M (2021) Personalized speech enhancement through self-supervised data augmentation and purification. In: Proceedings of the Interspeech 2021","DOI":"10.21437\/Interspeech.2021-1868"},{"key":"9361_CR32","first-page":"24","volume":"02","author":"T Dang","year":"2022","unstructured":"Dang T, Han J, Xia T, Spathis D, Bondareva E, Brown C et al (2022) Exploring longitudinal cough, breath, and voice data for COVID-19 disease progression prediction via sequential deep learning: model development and validation (preprint). J Med Internet Res 02:24","journal-title":"J Med Internet Res"},{"key":"9361_CR33","doi-asserted-by":"crossref","unstructured":"Hazarika D, Zimmermann R, Poria S (2020) Misa: modality-invariant and-specific representations for multimodal sentiment analysis. 
In: Proceedings of the 28th ACM international conference on multimedia, pp 1122\u20131131","DOI":"10.1145\/3394171.3413678"},{"key":"9361_CR34","first-page":"18661","volume":"33","author":"P Khosla","year":"2020","unstructured":"Khosla P, Teterwak P, Wang C, Sarna A, Tian Y, Isola P et al (2020) Supervised contrastive learning. Adv Neural Inf Process Syst 33:18661\u201318673","journal-title":"Adv Neural Inf Process Syst"},{"key":"9361_CR35","doi-asserted-by":"crossref","unstructured":"Salamon J, Jacoby C, Bello JP (2014) A dataset and taxonomy for urban sound research. In: 22nd ACM international conference on multimedia (ACM-MM\u201914). Orlando, FL, USA, pp 1041\u20131044","DOI":"10.1145\/2647868.2655045"},{"key":"9361_CR36","doi-asserted-by":"crossref","unstructured":"Guzhov A, Raue F, Hees J, Dengel A (2021) ESResNet: environmental sound classification based on visual domain models. In: 2020 25th international conference on pattern recognition (ICPR). IEEE Computer Society, Los Alamitos, CA, USA, pp 4933\u20134940","DOI":"10.1109\/ICPR48806.2021.9413035"},{"issue":"21","key":"9361_CR37","doi-asserted-by":"publisher","first-page":"14495","DOI":"10.1007\/s00521-021-06091-7","volume":"33","author":"YA Al-Hattab","year":"2021","unstructured":"Al-Hattab YA, Zaki HF, Shafie AA (2021) Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction. Neural Comput Appl 33(21):14495\u201314506","journal-title":"Neural Comput Appl"},{"key":"9361_CR38","unstructured":"Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR, pp 6105\u20136114"},{"key":"9361_CR39","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"9361_CR40","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1007\/978-3-030-05716-9_13","volume-title":"MultiMedia modeling","author":"D Chong","year":"2019","unstructured":"Chong D, Zou Y, Wang W (2019) Multi-channel convolutional neural networks with multi-level feature fusion for environmental sound classification. In: Kompatsiaris I, Huet B, Mezaris V, Gurrin C, Cheng WH, Vrochidis S (eds) MultiMedia modeling. Springer International Publishing, Cham, pp 157\u2013168"},{"issue":"7","key":"9361_CR41","doi-asserted-by":"publisher","first-page":"1733","DOI":"10.3390\/s19071733","volume":"19","author":"Y Su","year":"2019","unstructured":"Su Y, Zhang K, Wang J, Madani K (2019) Environment sound classification using a two-stream CNN based on decision-level fusion. Sensors 19(7):1733","journal-title":"Sensors"},{"key":"9361_CR42","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.108656","volume":"127","author":"V Dentamaro","year":"2022","unstructured":"Dentamaro V, Giglio P, Impedovo D, Moretti L, Pirlo G (2022) AUCO ResNet: an end-to-end network for Covid-19 pre-screening from cough and breath. Pattern Recogn 127:108656","journal-title":"Pattern Recogn"},{"key":"9361_CR43","doi-asserted-by":"crossref","unstructured":"Park DS, Chan W, Zhang Y, Chiu CC, Zoph B, Cubuk ED, et\u00a0al (2019) SpecAugment: a simple data augmentation method for automatic speech recognition. Interspeech 2019. 
Sep","DOI":"10.21437\/Interspeech.2019-2680"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-023-09361-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-023-09361-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-023-09361-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,16]],"date-time":"2024-03-16T09:09:29Z","timestamp":1710580169000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-023-09361-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,16]]},"references-count":43,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,4]]}},"alternative-id":["9361"],"URL":"https:\/\/doi.org\/10.1007\/s00521-023-09361-8","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"type":"print","value":"0941-0643"},{"type":"electronic","value":"1433-3058"}],"subject":[],"published":{"date-parts":[[2024,1,16]]},"assertion":[{"value":"2 March 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 December 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 January 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"There is no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}