{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T10:17:04Z","timestamp":1772101024592,"version":"3.50.1"},"reference-count":63,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,2,11]],"date-time":"2021-02-11T00:00:00Z","timestamp":1613001600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Acoustic event detection and analysis have been widely developed in recent years for their valuable applications in monitoring elderly or dependent people, surveillance, multimedia retrieval, and even biodiversity metrics in natural environments. For this purpose, sound source identification is key to providing a smart technological answer to all of these applications. Diverse types of sounds and varied environments, together with a number of application-specific challenges, widen the choice of artificial intelligence algorithms. This paper presents a comparative study on combining several feature extraction algorithms (Mel Frequency Cepstrum Coefficients (MFCC), Gammatone Cepstrum Coefficients (GTCC), and Narrow Band (NB)) with a group of machine learning algorithms (k-Nearest Neighbor (kNN), Neural Networks (NN), and Gaussian Mixture Model (GMM)), tested over five different acoustic environments. This work aims to detail a best-practice method and to evaluate the reliability of this general-purpose algorithm for all the classes. Preliminary results show that most combinations of feature extraction and machine learning yield acceptable results in most of the described corpora. 
Nevertheless, there is a combination that outperforms the others: the use of GTCC together with kNN, and its results are further analyzed for all the corpora.<\/jats:p>","DOI":"10.3390\/s21041274","type":"journal-article","created":{"date-parts":[[2021,2,12]],"date-time":"2021-02-12T16:12:10Z","timestamp":1613146330000},"page":"1274","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":56,"title":["A Comparative Survey of Feature Extraction and Machine Learning Methods in Diverse Acoustic Environments"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7362-199X","authenticated-orcid":false,"given":"Daniel","family":"Bonet-Sol\u00e0","sequence":"first","affiliation":[{"name":"Grup de Recerca en Tecnologies M\u00e8dia (GTM), La Salle\u2014URL, c\/Quatre Camins, 30, 08022 Barcelona, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2261-5471","authenticated-orcid":false,"given":"Rosa Ma","family":"Alsina-Pag\u00e8s","sequence":"additional","affiliation":[{"name":"Grup de Recerca en Tecnologies M\u00e8dia (GTM), La Salle\u2014URL, c\/Quatre Camins, 30, 08022 Barcelona, Spain"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Davies, A.C., and Velastin, S.A. (2005). A progress review of intelligent CCTV surveillance systems. Proc. IEEE IDAACS, 417\u2013423.","DOI":"10.1109\/IDAACS.2005.283015"},{"key":"ref_2","first-page":"9","article-title":"Chicago\u2019s video surveillance cameras: A pervasive and poorly regulated threat to our privacy","volume":"11","author":"Schwartz","year":"2012","journal-title":"Northwest. J. Technol. Intell. Prop."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Al\u00edas, F., and Alsina-Pag\u00e8s, R.M. (2019). Review of Wireless Acoustic Sensor Networks for Environmental Noise Monitoring in Smart Cities. J. 
Sens., 2019.","DOI":"10.1155\/2019\/7634860"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, W., Seraj, F., Meratnia, N., and Havinga, P. (2019, January 5\u20137). Privacy-aware environmental sound classification for indoor human activity recognition. Proceedings of the PETRA \u201919: 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Island of Rhodes, Greece.","DOI":"10.1145\/3316782.3321521"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Vafeiadis, A., Votis, K., Giakoumis, D., Tzovaras, D., Chen, L., and Hamzaoui, R. (2020). Audio content analysis for unobtrusive event detection in smart homes. Eng. Appl. Artif. Intell., 89.","DOI":"10.1016\/j.engappai.2019.08.020"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1109\/TMM.2011.2122247","article-title":"Probabilistic Novelty Detection for Acoustic Surveillance Under Real-World Conditions","volume":"13","author":"Ntalampiras","year":"2011","journal-title":"IEEE Trans. Multimed."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Vacher, M., Portet, F., Fleury, A., and Noury, N. (2010, January 1\u20133). Challenges in the processing of audio channels for ambient assisted living. Proceedings of the 12th IEEE International Conference on e-Health Networking, Applications and Services, Lyon, France.","DOI":"10.1109\/HEALTH.2010.5556546"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"579","DOI":"10.1109\/JBHI.2012.2234129","article-title":"A survey on ambient-assisted living tools for older adults","volume":"17","author":"Rashidi","year":"2012","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Bouakaz, S., Vacher, M., Bobillier Chaumon, M., Aman, F., Bekkadja, S., Portet, F., Guillou, E., Rossato, S., Desser\u00e9e, E., and Traineau, P. (2014). CIRDO: Smart companion for helping elderly to live at home for longer. 
IRBM, 35.","DOI":"10.1016\/j.irbm.2014.02.011"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Alsina-Pag\u00e8s, R., Navarro, J., Al\u00edas, F., and Herv\u00e1s, M. (2017). HomeSound: Real-Time Audio Event Detection Based on High Performance Computing for Behaviour and Surveillance Remote Monitoring. Sensors, 17.","DOI":"10.3390\/s17040854"},{"key":"ref_11","unstructured":"Socor\u00f3, J., Ribera, G., Sevillano, X., and Al\u00edas, F. (2015, January 12\u201316). Development of an Anomalous Noise Event Detection Algorithm for dynamic road traffic noise mapping. Proceedings of the 22nd International Congress on Sound and Vibration (ICSV22), Florence, Italy."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.landurbplan.2015.05.005","article-title":"Classification of urban park soundscapes through perceptions of the acoustical environments","volume":"141","author":"Jeon","year":"2015","journal-title":"Landsc. Urban Plan."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chaudhary, M., Prakash, V., and Kumari, N. (2018, January 23\u201324). Identification Vehicle Movement Detection in Forest Area using MFCC and KNN. Proceedings of the 2018 International Conference on System Modeling & Advancement in Research Trends (SMART), Moradabad, India.","DOI":"10.1109\/SYSMART.2018.8746936"},{"key":"ref_14","first-page":"172","article-title":"DYNAMAP\u2014Development of low cost sensors networks for real time noise mapping","volume":"3","author":"Sevillano","year":"2016","journal-title":"Noise Mapp."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1016\/j.apacoust.2016.06.010","article-title":"The implementation of low-cost urban acoustic monitoring devices","volume":"117","author":"Mydlarz","year":"2017","journal-title":"Appl. Acoust."},{"key":"ref_16","unstructured":"Jati, A., Nadarajan, A., Mundnich, K., and Narayanan, S. (2020, January 4\u20138). 
Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chu, S., Narayanan, S., Kuo, C., and Mataric, M. (2006, January 9\u201312). Where am I? Scene recognition for mobile robots using audio features. Proceedings of the IEEE International Conference on Multimedia and Expo, ICME, Toronto, ON, Canada.","DOI":"10.1109\/ICME.2006.262661"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ozkan, Y., and Barkana, B. (2019, January 5\u20136). Forensic Audio Analysis and Event Recognition for Smart Surveillance Systems. Proceedings of the 2019 IEEE International Symposium on Technologies for Homeland Security (HST), Woburn, MA, USA.","DOI":"10.1109\/HST47167.2019.9032996"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1525\/bio.2009.59.5.6","article-title":"New eyes on the world: Advanced sensors for ecology","volume":"59","author":"Porter","year":"2009","journal-title":"BioScience"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Stowell, D., Wood, M., Stylianou, Y., and Glotin, H. (2016, January 13\u201316). Bird detection in audio: A survey and a challenge. Proceedings of the 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), Salerno, Italy.","DOI":"10.1109\/MLSP.2016.7738875"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Herv\u00e1s, M., Alsina-Pag\u00e8s, R., Al\u00edas, F., and Salvador, M. (2017). An FPGA-Based WASN for Remote Real-Time Monitoring of Endangered Species: A Case Study on the Birdsong Recognition of Botaurus stellaris. 
Sensors, 17.","DOI":"10.3390\/s17061331"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2252","DOI":"10.1109\/TASL.2006.872624","article-title":"Parametric representations of bird sounds for automatic species recognition","volume":"14","author":"Somervuo","year":"2006","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_23","unstructured":"Chen, C.H. (1976). Distance measures for speech recognition, psychological and instrumental. Pattern Recognition and Artificial Intelligence, Academic Press."},{"key":"ref_24","unstructured":"Agrawal, D., Sailor, H., Soni, M., and Patil, H. (September, January 28). Novel TEO-based Gammatone features for environmental sound classification. Proceedings of the European Signal Processing Conf. (EUSIPCO), Kos, Greece."},{"key":"ref_25","unstructured":"Valero, X., and Al\u00edas, F. (2012, January 27\u201331). Classification of audio scenes using Narrow-Band Autocorrelation features. Proceedings of the 20th European Signal Processing Conference (EUSIPCO), Bucharest, Romania."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Socor\u00f3, J., Al\u00edas, F., and Alsina-Pag\u00e8s, R. (2017). An Anomalous Noise Events Detector for Dynamic Road Traffic Noise Mapping in Real-Life Urban and Suburban Environments. Sensors, 17.","DOI":"10.3390\/s17102323"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1007\/s10772-016-9354-4","article-title":"Robust acoustic bird recognition for habitat monitoring with wireless sensor networks","volume":"19","author":"Boulmaiz","year":"2016","journal-title":"Int. J. Speech Technol."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Al\u00edas, F., Socor\u00f3, J.C., Orga, F., and Alsina-Pag\u00e8s, R.M. (2019, January 15\u201330). Characterization of A WASN-Based Urban Acoustic Dataset for the Dynamic Mapping of Road Traffic Noise. 
Proceedings of the 6th ECSA\u2014Electronic Conference on Sensors and Applications.","DOI":"10.3390\/ecsa-6-06637"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Alsina-Pag\u00e8s, R.M., Orga, F., Al\u00edas, F., and Socor\u00f3, J.C. (2019). A WASN-Based Suburban Dataset for Anomalous Noise Event Detection on Dynamic Road-Traffic Noise Mapping. Sensors, 19.","DOI":"10.3390\/s19112480"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1109\/TASSP.1980.1163420","article-title":"Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences","volume":"28","author":"Davis","year":"1980","journal-title":"IEEE Trans. Acoust. Speech Signal. Process."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Aurino, F., Folla, M., Gargiulo, F., Moscato, V., Picariello, A., and Sansone, C. (2014, January 10\u201312). One-Class SVM Based Approach for Detecting Anomalous Audio Events. Proceedings of the 2014 International Conference on Intelligent Networking and Collaborative Systems, Salerno, Italy.","DOI":"10.1109\/INCoS.2014.59"},{"key":"ref_32","unstructured":"Mesaros, A., Heittola, T., Eronen, A., and Virtanen, T. (2010, January 23\u201327). Acoustic event detection in real life recordings. Proceedings of the 18th European Signal Processing Conference, Aalborg, Denmark."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Salamon, J., Jacoby, C., and Bello, J. (2014, January 3\u20137). A Dataset and Taxonomy for Urban Sound Research. Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, FL, USA.","DOI":"10.1145\/2647868.2655045"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1016\/j.dsp.2014.05.003","article-title":"Universal background modeling for acoustic surveillance of urban traffic","volume":"31","author":"Ntalampiras","year":"2014","journal-title":"Digit. 
Signal Process."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2096","DOI":"10.1109\/TASLP.2016.2592698","article-title":"Automatic environmental sound recognition: Performance versus computational cost","volume":"24","author":"Sigtia","year":"2016","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Stattner, E., Hunel, P., Vidot, N., and Collard, M. (2011, January 20\u201324). Acoustic scheme to count bird songs with wireless sensor networks. Proceedings of the 2011 IEEE International Symposium on World of Wireless, Mobile and Multimedia Networks (WoWMoM), Lucca, Italy.","DOI":"10.1109\/WoWMoM.2011.5986215"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"8463","DOI":"10.1016\/j.eswa.2015.07.002","article-title":"Audio parameterization with robust frame selection for improved bird identification","volume":"42","author":"Ventura","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Vida\u00f1a-Vila, E., Navarro, J., Alsina-Pag\u00e8s, R., and Ram\u00edrez, \u00c1. (2020). A two-stage approach to automatically detect and classify woodpecker (Fam. Picidae) sounds. Appl. Acoust., 166.","DOI":"10.1016\/j.apacoust.2020.107312"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Mulimani, M., and Koolagudi, S. (2019, January 15\u201319). Locality-constrained Linear Coding based Fused Visual Features for Robust Acoustic Event Classification. Proceedings of the Interspeech 2019, Graz, Austria.","DOI":"10.21437\/Interspeech.2019-1421"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Aguilar-Ortega, M., Moh\u00edno-Erranz, I., Utrilla-Manso, M., Garc\u00eda-G\u00f3mez, J., Gil-Pita, R., and Rosa-Zurera, M. (2019, January 22\u201325). Multi-microphone acoustic events detection and classification for indoor monitoring. 
Proceedings of the 2019 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland.","DOI":"10.23919\/SPA.2019.8936807"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1109\/TSMCC.2013.2257752","article-title":"Review of automatic fault diagnosis systems using audio and vibration signals","volume":"44","author":"Henriquez","year":"2014","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"6098","DOI":"10.1016\/j.eswa.2015.03.036","article-title":"Automated acoustic detection of Vanellus chilensis lampronotus","volume":"42","author":"Ganchev","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Jan\u010dovi\u010d, P., and K\u00f6k\u00fcer, M. (2011). Automatic detection and recognition of tonal bird sounds in noisy environments. EURASIP J. Adv. Signal Process., 2011.","DOI":"10.1155\/2011\/982936"},{"key":"ref_44","unstructured":"Casals, E. (2016). Programaci\u00f3 Paral.lela en Processadors Gr\u00e0fics Per a La Separaci\u00f3 de Fonts Sonores en L`Entorn de La Llar. La Salle. [Master\u2019s Thesis, Ramon Llull University]."},{"key":"ref_45","unstructured":"Collaborative (2021, February 10). The Freesound Project. Available online: https:\/\/freesound.org\/."},{"key":"ref_46","unstructured":"BBC (2021, February 10). The BBC Sound Effects Library: Original Series. Available online: https:\/\/www.sound-ideas.com\/Product\/152\/BBC-Sound-Effects-Library-Original-Series."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1684","DOI":"10.1109\/TMM.2012.2199972","article-title":"Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification","volume":"14","author":"Valero","year":"2012","journal-title":"IEEE Trans. Multimed."},{"key":"ref_48","unstructured":"Valero, X., and Al\u00edas, F. (2012, January 12\u201319). 
An\u00e1lisis de la se\u00f1al ac\u00fastica mediante coeficientes cepstrales bio-inspirados y su aplicaci\u00f3n al reconocimiento de paisajes sonoros (Spanish). Proceedings of the ACUSTICA, Lisbon, Portugal."},{"key":"ref_49","unstructured":"Valero, X., and Al\u00edas, F. (2012, January 27\u201331). Gammatone Wavelet features for Sound Classification in Surveillance Applications. Proceedings of the 20th European Signal Processing Conference (EUSIPCO), Bucharest, Romania."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Vida\u00f1a-Vila, E., Navarro, J., and Alsina-Pag\u00e8s, R. (2017). Towards Automatic Bird Detection: An Annotated and Segmented Acoustic Dataset of Seven Picidae species. Data, 2.","DOI":"10.3390\/data2020018"},{"key":"ref_51","unstructured":"Foundation, X.C. (2017, April 15). Xeno-Canto: Sharing Bird Sounds from around the World. Available online: https:\/\/www.xeno-canto.org\/."},{"key":"ref_52","unstructured":"Patterson, R., and Moore, B. (1986). Auditory filters and excitation patterns as representations of frequency resolution. Frequency Selectivity in Hearing, Academic Press."},{"key":"ref_53","unstructured":"Patterson, R., Nimmo-Smith, I., Holdsworth, J., and Rice, P. (1987, January 14\u201315). An Efficient Auditory Filterbank Based on the Gammatone Function. Proceedings of the IOC Speech Group on Auditory Modelling, Malvern, UK."},{"key":"ref_54","first-page":"554","article-title":"A functional model of neural activity patterns and auditory images","volume":"Volume 3","author":"Ainsworth","year":"1996","journal-title":"Advances in Speech, Hearing and Language Processing"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1121\/1.4807807","article-title":"Narrow-band autocorrelation function features for the automatic recognition of acoustic environments","volume":"134","author":"Valero","year":"2013","journal-title":"J. Acoust. Soc. 
Am."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","article-title":"Nearest neighbor pattern classification","volume":"13","author":"Cover","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_57","unstructured":"Haykin, S. (1993). Neural Networks and Learning Machines, Pearson-Prentice Hall."},{"key":"ref_58","unstructured":"Jaakkola, T., Singh, R., and Mohammad, A. (2021, February 10). 6.867 Machine Learning. Fall 2006. Massachusetts Institute of Technology: MIT OpenCourseWare. Available online: https:\/\/ocw.mit.edu."},{"key":"ref_59","unstructured":"Bilmes, J. (1998). A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models, International Computer Science Institute. Report."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1006\/jsvi.2000.3278","article-title":"Acoustical properties of aircraft noise measured by temporal and spatial factors","volume":"241","author":"Fuiji","year":"2001","journal-title":"J. Sound Vib."},{"key":"ref_61","unstructured":"Valero, X., Al\u00edas, F., Kephalopoulos, S., and Paviotti, M. (2009, January 26\u201328). Pattern recognition and separation of road noise sources by means of ACF, MFCC and probability density estimation. Proceedings of the Euronoise Conference, Edinburgh, UK."},{"key":"ref_62","first-page":"335","article-title":"A revision of Zwicker\u2019s loudness model","volume":"82","author":"Moore","year":"1996","journal-title":"Acta Acust."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Navarro, J., Vida\u00f1a-Vila, E., Alsina-Pag\u00e8s, R.M., and Herv\u00e1s, M. (2018). Real-Time Distributed architecture for remote acoustic elderly monitoring in Residential-Scale ambient assisted living scenarios. 
Sensors, 18.","DOI":"10.3390\/s18082492"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1274\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:22:44Z","timestamp":1760160164000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1274"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,11]]},"references-count":63,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["s21041274"],"URL":"https:\/\/doi.org\/10.3390\/s21041274","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,11]]}}}