{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,18]],"date-time":"2026-06-18T19:31:17Z","timestamp":1781811077011,"version":"3.54.5"},"reference-count":135,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T00:00:00Z","timestamp":1760400000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union - Next Generation EU under the Italian National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.3","award":["C49J24000240004"],"award-info":[{"award-number":["C49J24000240004"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Speech emotion recognition (SER) has become increasingly important in areas such as healthcare, customer service, robotics, and human\u2013computer interaction. The progress of this field depends not only on advances in algorithms but also on the databases that provide the training material for SER systems. These resources set the boundaries for how well models can generalize across speakers, contexts, and cultures. In this paper, we present a narrative review and comparative analysis of emotional speech corpora released up to mid-2025, bringing together both psychological and technical perspectives. Rather than following a systematic review protocol, our approach focuses on providing a critical synthesis of more than fifty corpora covering acted, elicited, and natural speech. We examine how these databases were collected, how emotions were annotated, their demographic diversity, and their ecological validity, while also acknowledging the limits of available documentation. 
Beyond description, we identify recurring strengths and weaknesses, highlight emerging gaps, and discuss recent usage patterns to offer researchers both a practical guide for dataset selection and a critical perspective on how corpus design continues to shape the development of robust and generalizable SER systems.<\/jats:p>","DOI":"10.3390\/data10100164","type":"journal-article","created":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T06:06:32Z","timestamp":1760508392000},"page":"164","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Review and Comparative Analysis of Databases for Speech Emotion Recognition"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0507-5186","authenticated-orcid":false,"given":"Salvatore","family":"Serrano","sequence":"first","affiliation":[{"name":"Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, Italy"},{"name":"National Inter-University Consortium for Telecommunications, Research Unit at University of Messina, 98122 Messina, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-4404-9074","authenticated-orcid":false,"given":"Omar","family":"Serghini","sequence":"additional","affiliation":[{"name":"Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, Italy"},{"name":"National Inter-University Consortium for Telecommunications, Research Unit at University of Messina, 98122 Messina, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-8024-8707","authenticated-orcid":false,"given":"Giulia","family":"Esposito","sequence":"additional","affiliation":[{"name":"Laboratory of Digital Signal Processing, Department of Engineering, University of Messina, 98122 Messina, 
Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2951-9119","authenticated-orcid":false,"given":"Silvia","family":"Carbone","sequence":"additional","affiliation":[{"name":"Dipartimento di Scienze Politiche e Giuridiche, University of Messina, 98122 Messina, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4611-3740","authenticated-orcid":false,"given":"Carmela","family":"Mento","sequence":"additional","affiliation":[{"name":"Department of Biomedical and Dental Sciences and Morphofunctional Imaging, University of Messina, Via Consolare Valeria, 1, 98125 Messina, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8745-1327","authenticated-orcid":false,"given":"Alessandro","family":"Floris","sequence":"additional","affiliation":[{"name":"Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy"},{"name":"National Inter-University Consortium for Telecommunications, Research Unit at University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0792-1200","authenticated-orcid":false,"given":"Simone","family":"Porcu","sequence":"additional","affiliation":[{"name":"Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy"},{"name":"National Inter-University Consortium for Telecommunications, Research Unit at University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1350-3574","authenticated-orcid":false,"given":"Luigi","family":"Atzori","sequence":"additional","affiliation":[{"name":"Department of Electrical and Electronic Engineering, University of Cagliari, Via Marengo, 2, 09123 Cagliari, 
Italy"},{"name":"National Inter-University Consortium for Telecommunications, Research Unit at University of Cagliari, Via Marengo, 2, 09123 Cagliari, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Munot, R., and Nenkova, A. (2019, January 3\u20135). Emotion impacts speech recognition performance. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, Minneapolis, Minnesota.","DOI":"10.18653\/v1\/N19-3003"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"6005446","DOI":"10.1155\/2022\/6005446","article-title":"Human-computer interaction for recognizing speech emotions using multilayer perceptron classifier","volume":"2022","author":"Alnuaim","year":"2022","journal-title":"J. Healthc. Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lakomkin, E., Zamani, M.A., Weber, C., Magg, S., and Wermter, S. (2018, January 1\u20135). On the robustness of speech emotion recognition for human-robot interaction with deep neural networks. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593571"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Alshamsi, H., Kepuska, V., Alshamsi, H., and Meng, H. (2018, January 8\u201310). Automated speech emotion recognition on smart phones. Proceedings of the 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA.","DOI":"10.1109\/UEMCON.2018.8796594"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bojani\u0107, M., Deli\u0107, V., and Karpov, A. (2020, January 24\u201325). Effect of Emotion Distribution on a Call Processing for an Emergency Call Center. 
Proceedings of the 2020 28th Telecommunications Forum (TELFOR), Belgrade, Serbia.","DOI":"10.1109\/TELFOR51502.2020.9306564"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Abbaschian, B.J., Sierra-Sosa, D., and Elmaghraby, A. (2021). Deep learning techniques for speech emotion recognition, from databases to models. Sensors, 21.","DOI":"10.3390\/s21041249"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"123723","DOI":"10.1016\/j.eswa.2024.123723","article-title":"Enhancing emotion recognition using multimodal fusion of physiological, environmental, personal data","volume":"249","author":"Kim","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wu, X., and Zhang, Q. (2022). Intelligent aging home control method and system for internet of things emotion recognition. Front. Psychol., 13.","DOI":"10.3389\/fpsyg.2022.882699"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1111\/acps.13388","article-title":"A generalizable speech emotion recognition model reveals depression and remission","volume":"145","author":"Hansen","year":"2022","journal-title":"Acta Psychiatr. Scand."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1472","DOI":"10.1109\/TAFFC.2021.3135152","article-title":"Emonet: A transfer learning framework for multi-corpus speech emotion recognition","volume":"14","author":"Gerczuk","year":"2021","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Livingstone, S.R., and Russo, F.A. (2018). The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. 
PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0196391"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"29307","DOI":"10.1007\/s11042-023-14656-y","article-title":"Trends in speech emotion recognition: A comprehensive survey","volume":"82","author":"Kaur","year":"2023","journal-title":"Multimed. Tools Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"151122","DOI":"10.1109\/ACCESS.2024.3476960","article-title":"Speech Databases, Speech Features, and Classifiers in Speech Emotion Recognition: A Review","volume":"12","author":"Delhibabu","year":"2024","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"n71","DOI":"10.1136\/bmj.n71","article-title":"The PRISMA 2020 statement: An updated guideline for reporting systematic reviews","volume":"372","author":"Page","year":"2021","journal-title":"BMJ"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"535","DOI":"10.4103\/indianjpsychiatry.indianjpsychiatry_373_25","article-title":"Understanding different types of review articles: A primer for early career researchers","volume":"67","author":"Ghosh","year":"2025","journal-title":"Indian J. Psychiatry"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1179\/2047480615Z.000000000329","article-title":"Writing narrative style literature reviews","volume":"24","author":"Ferrari","year":"2015","journal-title":"Med. Write"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Matveev, Y., Matveev, A., Frolova, O., Lyakso, E., and Ruban, N. (2022). Automatic speech emotion recognition of younger school age children. Mathematics, 10.","DOI":"10.3390\/math10142373"},{"key":"ref_18","first-page":"4752","article-title":"Creation of speech corpus for emotion analysis in Gujarati language and its evaluation by various speech parameters","volume":"10","author":"Tank","year":"2020","journal-title":"Int. J. Electr. Comput. 
Eng."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4459","DOI":"10.1007\/s00034-020-01377-y","article-title":"Excitation features of speech for emotion recognition using neutral speech as reference","volume":"39","author":"Kadiri","year":"2020","journal-title":"Circuits Syst. Signal Process."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Baek, J.Y., and Lee, S.P. (2023). Enhanced speech emotion recognition using dcgan-based data augmentation. Electronics, 12.","DOI":"10.3390\/electronics12183966"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"103010","DOI":"10.1016\/j.specom.2023.103010","article-title":"Multiscale-multichannel feature extraction and classification through one-dimensional convolutional neural network for Speech emotion recognition","volume":"156","author":"Liu","year":"2024","journal-title":"Speech Commun."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Alluhaidan, A.S., Saidani, O., Jahangir, R., Nauman, M.A., and Neffati, O.S. (2023). Speech emotion recognition through hybrid features and convolutional neural network. Appl. Sci., 13.","DOI":"10.3390\/app13084750"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Saumard, M. (2023). Enhancing Speech Emotions Recognition Using Multivariate Functional Data Analysis. Big Data Cogn. Comput., 7.","DOI":"10.3390\/bdcc7030146"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"462","DOI":"10.4218\/etrij.2020-0458","article-title":"Speech emotion recognition based on genetic algorithm\u2013decision tree fusion of deep and acoustic features","volume":"44","author":"Sun","year":"2022","journal-title":"ETRI J."},{"key":"ref_25","unstructured":"Welivita, A., Xie, Y., and Pu, P. (2020). Fine-grained emotion and intent learning in movie dialogues. 
arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"49265","DOI":"10.1109\/ACCESS.2022.3172954","article-title":"Robust speech emotion recognition using CNN+ LSTM based on stochastic fractal search optimization algorithm","volume":"10","author":"Abdelhamid","year":"2022","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"112460","DOI":"10.1109\/ACCESS.2022.3217226","article-title":"3d convolutional neural network for speech emotion recognition with its realization on intel cpu and nvidia gpu","volume":"10","author":"Falahzadeh","year":"2022","journal-title":"IEEE Access"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"9910","DOI":"10.1109\/TCSVT.2024.3405406","article-title":"Multimodal Decoupled Distillation Graph Neural Network for Emotion Recognition in Conversation","volume":"34","author":"Dai","year":"2024","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"22759","DOI":"10.1007\/s11042-023-14680-y","article-title":"End-to-end emotional speech recognition using acoustic model adaptation based on knowledge distillation","volume":"82","author":"Yun","year":"2023","journal-title":"Multimed. Tools Appl."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kerkeni, L., Serrestou, Y., Mbarki, M., Raoof, K., and Mahjoub, M.A. (2017, January 22\u201324). A review on speech emotion recognition: Case of pedagogical interaction in classroom. Proceedings of the 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Fez, Morocco.","DOI":"10.1109\/ATSIP.2017.8075575"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1037\/h0030377","article-title":"Constants across cultures in the face and emotion","volume":"17","author":"Ekman","year":"1971","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Ekman, P. (1999). 
Basic Emotions. Handbook of Cognition and Emotion, John Wiley & Sons, Ltd.. Chapter 3.","DOI":"10.1002\/0470013494.ch3"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1037\/amp0000488","article-title":"What the face displays: Mapping 28 emotions conveyed by naturalistic expression","volume":"75","author":"Cowen","year":"2020","journal-title":"Am. Psychol."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1177\/1754073911410745","article-title":"Don\u2019t give up on basic emotions","volume":"3","author":"Scarantino","year":"2011","journal-title":"Emot. Rev."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/1754073919897295","article-title":"Cross-cultural emotion recognition and in-group advantage in vocal expression: A meta-analysis","volume":"13","author":"Laukka","year":"2021","journal-title":"Emot. Rev."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Dirzyte, A., Antanaitis, F., and Patapas, A. (2022). Law enforcement officers\u2019 ability to recognize emotions: The role of personality traits and Basic needs\u2019 satisfaction. Behav. Sci., 12.","DOI":"10.3390\/bs12100351"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"7241","DOI":"10.1073\/pnas.1200155109","article-title":"Facial expressions of emotion are not culturally universal","volume":"109","author":"Jack","year":"2012","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1037\/h0077714","article-title":"A circumplex model of affect","volume":"39","author":"Russell","year":"1980","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"102019","DOI":"10.1016\/j.inffus.2023.102019","article-title":"Emotion recognition and artificial intelligence: A systematic review (2014\u20132023) and research recommendations","volume":"102","author":"Khare","year":"2024","journal-title":"Inf. 
Fusion"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1109\/TAFFC.2017.2772882","article-title":"Continuous, real-time emotion annotation: A novel joystick-based analysis framework","volume":"11","author":"Sharma","year":"2017","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/T-AFFC.2010.1","article-title":"Affect detection: An interdisciplinary review of models, methods, and their applications","volume":"1","author":"Calvo","year":"2010","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Guo, R., Guo, H., Wang, L., Chen, M., Yang, D., and Li, B. (2024). Development and application of emotion recognition technology\u2014A systematic literature review. BMC Psychol., 12.","DOI":"10.1186\/s40359-024-01581-4"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"102218","DOI":"10.1016\/j.inffus.2023.102218","article-title":"Multimodal Emotion Recognition with deep learning: Advancements, challenges, and future directions","volume":"105","author":"Geetha","year":"2024","journal-title":"Inf. Fusion"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"8901","DOI":"10.1007\/s00521-024-09426-2","article-title":"Machine learning for human emotion recognition: A comprehensive review","volume":"36","author":"Younis","year":"2024","journal-title":"Neural Comput. Appl."},{"key":"ref_45","first-page":"1","article-title":"Deep emotion recognition in textual conversations: A survey","volume":"58","author":"Pereira","year":"2025","journal-title":"Artif. Intell. Rev."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W.F., and Weiss, B. (2005, January 4\u20138). A database of German emotional speech. 
Proceedings of the Interspeech, Lisbon, Portugal.","DOI":"10.21437\/Interspeech.2005-446"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Schr\u00f6der, M. (2001, January 3\u20137). Emotional speech synthesis: A review. Proceedings of the Seventh European Conference on Speech Communication and Technology, Aalborg, Denmark.","DOI":"10.21437\/Eurospeech.2001-150"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/79.911197","article-title":"Emotion recognition in human-computer interaction","volume":"18","author":"Cowie","year":"2001","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/S0167-6393(02)00070-5","article-title":"Emotional speech: Towards a new generation of databases","volume":"40","author":"Campbell","year":"2003","journal-title":"Speech Commun."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1007\/s10579-008-9076-6","article-title":"IEMOCAP: Interactive emotional dyadic motion capture database","volume":"42","author":"Busso","year":"2008","journal-title":"Lang. Resour. Eval."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"1162","DOI":"10.1016\/j.specom.2006.04.003","article-title":"Emotional speech recognition: Resources, features, and methods","volume":"48","author":"Ververidis","year":"2006","journal-title":"Speech Commun."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/S0167-6393(02)00084-5","article-title":"Vocal communication of emotion: A review of research paradigms","volume":"40","author":"Scherer","year":"2003","journal-title":"Speech Commun."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1080\/02699939508408966","article-title":"Emotion elicitation using films","volume":"9","author":"Gross","year":"1995","journal-title":"Cogn. 
Emot."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"1153","DOI":"10.1080\/02699930903274322","article-title":"Assessing the effectiveness of a large database of emotion-eliciting films: A new tool for emotion researchers","volume":"24","author":"Schaefer","year":"2010","journal-title":"Cogn. Emot."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Parsons, T.D. (2015). Virtual reality for enhanced ecological validity and experimental control in the clinical, affective and social neurosciences. Front. Hum. Neurosci., 9.","DOI":"10.3389\/fnhum.2015.00660"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1062","DOI":"10.1016\/j.specom.2011.01.011","article-title":"Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge","volume":"53","author":"Schuller","year":"2011","journal-title":"Speech Commun."},{"key":"ref_57","unstructured":"Douglas-Cowie, E., Cowie, R., Sneddon, I., Cox, C., Lowry, O., Mcrorie, M., Martin, J.C., Devillers, L., Abrilian, S., and Batliner, A. (2007, January 12\u201314). The HUMAINE database: Addressing the collection and annotation of naturalistic and induced emotional data. Proceedings of the Affective Computing and Intelligent Interaction: Second International Conference, ACII 2007, Lisbon, Portugal. Proceedings 2."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Schuller, B., and Batliner, A. (2014). Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing, John Wiley & Sons Ltd.","DOI":"10.1002\/9781118706664"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"102334","DOI":"10.1016\/j.telpol.2022.102334","article-title":"Privacy in AI and the IoT: The privacy concerns of smart speaker users and the Personal Information Protection Law in China","volume":"46","author":"Liu","year":"2022","journal-title":"Telecommun. 
Policy"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Schuller, B., Steidl, S., and Batliner, A. (2009, January 6\u201310). The interspeech 2009 emotion challenge 2009. In Proceedings of Interspeech 2009, Brighton, UK.","DOI":"10.21437\/Interspeech.2009-103"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Engberg, I.S., Hansen, A.V., Andersen, O., and Dalsgaard, P. (1997, January 22\u201325). Design, recording and verification of a Danish emotional speech database. Proceedings of the Fifth European Conference on Speech Communication and Technology, Rhodes, Greece.","DOI":"10.21437\/Eurospeech.1997-482"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Hansen, J.H., Bou-Ghazale, S.E., Sarikaya, R., and Pellom, B. (1997, January 22\u201325). Getting started with SUSAS: A speech under simulated and actual stress database. Proceedings of the Eurospeech, Rhodes, Greece.","DOI":"10.21437\/Eurospeech.1997-494"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Campbell, N. (2001, January 3\u20137). Building a Corpus of Natural Speech\u2013and Tools for the Processing of Expressive Speech\u2013the JST CREST ESP Project. Proceedings of the 7th European Conference on Speech Communication and Technology, Aalborg, Denmark.","DOI":"10.21437\/Eurospeech.2001-377"},{"key":"ref_64","unstructured":"Hozjan, V., Kacic, Z., Moreno, A., Bonafonte, A., and Nogueiras, A. (June, January 27). Interface Databases: Design and Collection of a Multilingual Emotional Speech Database. Proceedings of the Third International Conference on Language Resources and Evaluation (LREC\u201902), Las Palmas, Spain."},{"key":"ref_65","unstructured":"Schiel, F., Steininger, S., and T\u00fcrk, U. (June, January 27). The SmartKom Multimodal Corpus at BAS. 
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC\u201902), Las Palmas, Spain."},{"key":"ref_66","unstructured":"Batliner, A., Hacker, C., Steidl, S., N\u00f6th, E., and Haas, J. (2003, January 28\u201331). User states, user strategies, and system performance: How to match the one with the other. Proceedings of the ITRW on Error Handling in Spoken Dialogue Systems, Chateau d\u2019Oex, Vaud, Switzerland."},{"key":"ref_67","unstructured":"Batliner, A., Hacker, C., Steidl, S., N\u00f6th, E., D\u2019Arcy, S., Russell, M., and Wong, M. (2004, January 26\u201330). \u201cYou Stupid Tin Box\u201d-Children Interacting with the AIBO Robot: A Cross-linguistic Emotional Speech Corpus. Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC\u201904), Lisbon, Portugal."},{"key":"ref_68","unstructured":"Devillers, L., and Vasilescu, I. (2004, January 26\u201330). Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora. Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC\u201904), Lisbon, Portugal."},{"key":"ref_69","unstructured":"Zovato, E., Pacchiotti, A., Quazza, S., and Sandri, S. (2004, January 14\u201316). Towards emotional speech synthesis: A rule based approach. Proceedings of the Fifth ISCA Workshop on Speech Synthesis, Pittsburgh, PA, USA."},{"key":"ref_70","unstructured":"Abrilian, S., Devillers, L., Buisine, S., and Martin, J.C. (2005, January 22\u201327). EmoTV1: Annotation of real-life emotions for the specification of multimodal affective interfaces. Proceedings of the HCI International, Las Vegas, NV, USA."},{"key":"ref_71","unstructured":"Vidrascu, L., and Devillers, L. (2006, January 22\u201328). Real-life emotions in naturalistic data recorded in a medical call center. 
Proceedings of the First International Workshop on Emotion: Corpora for Research on Emotion and Affect (International conference on Language Resources and Evaluation (LREC 2006)), Genoa, Italy."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Martin, O., Kotsia, I., Macq, B., and Pitas, I. (2006, January 3\u20137). The eNTERFACE\u201905 audio-visual emotion database. Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW\u201906), Atlanta, GA, USA.","DOI":"10.1109\/ICDEW.2006.145"},{"key":"ref_73","unstructured":"Clavel, C., Vasilescu, I., Devillers, L., Richard, G., Ehrette, T., and Sedogbo, C. (2006, January 22\u201328). The SAFE Corpus: Illustrating extreme emotions in dynamic situations. Proceedings of the First International Workshop on Emotion: Corpora for Research on Emotion and Affect (International conference on Language Resources and Evaluation (LREC 2006)), Genoa, Italy."},{"key":"ref_74","unstructured":"Zara, A., Maffiolo, V., Martin, J.C., and Devillers, L. (2007, January 12\u201314). Collection and annotation of a corpus of human-human multimodal interactions: Emotion and others anthropomorphic characteristics. Proceedings of the Affective Computing and Intelligent Interaction: Second International Conference, ACII 2007, Lisbon, Portugal. Proceedings 2."},{"key":"ref_75","unstructured":"Tao, J., Liu, F., Zhang, M., and Jia, H. (2025, October 09). Design of Speech Corpus for Mandarin Text to Speech. Available online: https:\/\/api.semanticscholar.org\/CorpusID:15860480."},{"key":"ref_76","unstructured":"Archetti, F., Arosio, G., Fersini, E., and Messina, E. (2008, January 24). Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain. Proceedings of the 1st International Conference on ICT Solutions for Justice (ICT4Justice \u201908), Thessaloniki, Greece."},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Grimm, M., Kroschel, K., and Narayanan, S. 
(2008, January 23\u201326). The Vera am Mittag German audio-visual emotional speech database. Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, Hannover, Germany.","DOI":"10.1109\/ICME.2008.4607572"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Wang, W. (2011). Multimodal Emotion Recognition; Machine Audition: Principles, Algorithms and Systems, IGI Global. Chapter 17.","DOI":"10.4018\/978-1-61520-919-4"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Koolagudi, S.G., Reddy, R., Yadav, J., and Rao, K.S. (2011, January 24\u201325). IITKGP-SEHSC: Hindi speech corpus for emotion analysis. Proceedings of the 2011 International Conference on Devices and Communications (ICDeCom), Mesra, Ranchi, India.","DOI":"10.1109\/ICDECOM.2011.5738540"},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.specom.2011.06.001","article-title":"Emotional states in judicial courtrooms: An experimental investigation","volume":"54","author":"Fersini","year":"2012","journal-title":"Speech Commun."},{"key":"ref_81","first-page":"182","article-title":"Recognition of emotional speech for younger and older talkers: Behavioural findings from the toronto emotional speech set","volume":"39","author":"Dupuis","year":"2011","journal-title":"Can. Acoust."},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1109\/T-AFFC.2011.20","article-title":"The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent","volume":"3","author":"McKeown","year":"2012","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1109\/TAFFC.2014.2336244","article-title":"Crema-d: Crowd-sourced emotional multimodal actors dataset","volume":"5","author":"Cao","year":"2014","journal-title":"IEEE Trans. Affect. 
Comput."},{"key":"ref_84","unstructured":"Costantini, G., Iaderola, I., Paoloni, A., and Todisco, M. (2014, January 26\u201331). EMOVO corpus: An Italian emotional speech database. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC\u201914), Reykjavik, Iceland."},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"913","DOI":"10.1007\/s12652-016-0406-z","article-title":"CHEAVD: A Chinese natural emotional audio\u2013visual database","volume":"8","author":"Li","year":"2017","journal-title":"J. Ambient. Intell. Humaniz. Comput."},{"key":"ref_86","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1109\/TAFFC.2016.2515617","article-title":"MSP-IMPROV: An acted corpus of dyadic interactions to study emotion perception","volume":"8","author":"Busso","year":"2016","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Chou, H.C., Lin, W.C., Chang, L.C., Li, C.C., Ma, H.P., and Lee, C.C. (2017, January 23\u201326). NNIME: The NTHU-NTUA Chinese interactive multimodal emotion corpus. Proceedings of the 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), San Antonio, TX, USA.","DOI":"10.1109\/ACII.2017.8273615"},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"457","DOI":"10.17743\/jaes.2018.0036","article-title":"Speech emotion recognition for performance interaction","volume":"66","author":"Vryzas","year":"2018","journal-title":"J. Audio Eng. Soc."},{"key":"ref_89","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1007\/s10470-018-1142-4","article-title":"Emotion recognition in Arabic speech","volume":"96","author":"Klaylat","year":"2018","journal-title":"Analog. Integr. Circuits Signal Process."},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Gournay, P., Lahaie, O., and Lefebvre, R. (2018, January 12\u201315). A canadian french emotional speech dataset. 
Proceedings of the 9th ACM Multimedia Systems Conference (MMSys \u201918), Amsterdam, The Netherlands.","DOI":"10.1145\/3204949.3208121"},{"key":"ref_91","unstructured":"Zadeh, A.B., Liang, P.P., Poria, S., Cambria, E., and Morency, L.P. (2018, January 15\u201320). Multimodal language analysis in the wild: Cmu-mosei dataset and interpretable dynamic fusion graph. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10579-018-9427-x","article-title":"ShEMO: A large-scale validated database for Persian speech emotion detection","volume":"53","author":"Karami","year":"2019","journal-title":"Lang. Resour. Eval."},{"key":"ref_93","unstructured":"Aouf, A. (2025, August 03). Basic Arabic Vocal Emotions Database (BAVED). Available online: https:\/\/github.com\/40uf411\/Basic-Arabic-Vocal-Emotions-Dataset."},{"key":"ref_94","unstructured":"Poria, S., Hazarika, D., Majumder, N., Naik, G., Cambria, E., and Mihalcea, R. (August, January 28). MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_95","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1109\/TAFFC.2017.2736999","article-title":"Building Naturalistic Emotionally Balanced Speech Corpus by Retrieving Emotional Speech from Existing Podcast Recordings","volume":"10","author":"Lotfian","year":"2019","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_96","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1007\/s10579-019-09450-y","article-title":"DEMoS: An Italian emotional speech corpus: Elicitation methods, machine learning, and perception","volume":"54","author":"Costantini","year":"2020","journal-title":"Lang. Resour. 
Eval."},{"key":"ref_97","doi-asserted-by":"crossref","unstructured":"Wang, K., Wu, Q., Song, L., Yang, Z., Wu, W., Qian, C., He, R., Qiao, Y., and Loy, C.C. (2020, January 23\u201328). MEAD: A Large-Scale Audio-Visual Dataset for Emotional Talking-Face Generation. Proceedings of the Computer Vision\u2013ECCV 2020, Glasgow, UK.","DOI":"10.1007\/978-3-030-58589-1_42"},{"key":"ref_98","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0250173","article-title":"SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla","volume":"16","author":"Sultana","year":"2021","journal-title":"PLoS ONE"},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Cui, C., Ren, Y., Liu, J., Chen, F., Huang, R., Lei, M., and Zhao, Z. (September, January 30). EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model. Proceedings of the Interspeech 2021, Brno, Czechia.","DOI":"10.21437\/Interspeech.2021-1148"},{"key":"ref_100","doi-asserted-by":"crossref","first-page":"1441","DOI":"10.3758\/s13428-022-01868-7","article-title":"The Mandarin Chinese auditory emotions stimulus database: A validated set of Chinese pseudo-sentences","volume":"55","author":"Gong","year":"2023","journal-title":"Behav. Res. Methods"},{"key":"ref_101","doi-asserted-by":"crossref","first-page":"52","DOI":"10.17762\/ijritcc.v10i10.5734","article-title":"PEMO: A new validated dataset for Punjabi speech emotion detection","volume":"10","author":"Singla","year":"2022","journal-title":"Int. J. Recent Innov. Trends Comput. Commun"},{"key":"ref_102","doi-asserted-by":"crossref","first-page":"108091","DOI":"10.1016\/j.dib.2022.108091","article-title":"BanglaSER: A speech emotion recognition dataset for the Bangla language","volume":"42","author":"Das","year":"2022","journal-title":"Data Brief"},{"key":"ref_103","doi-asserted-by":"crossref","unstructured":"Chauhan, K., Sharma, K.K., and Varma, T. (2023, January 26\u201328). 
MNITJ-SEHSD: A Hindi Emotional Speech Database. Proceedings of the 2023 International Conference on Communication, Circuits, and Systems (IC3S), Bhubaneswar, India.","DOI":"10.1109\/IC3S57698.2023.10169497"},{"key":"ref_104","doi-asserted-by":"crossref","first-page":"23055","DOI":"10.1007\/s11042-023-14577-w","article-title":"A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora","volume":"82","author":"Singh","year":"2023","journal-title":"Multimed. Tools Appl."},{"key":"ref_105","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3529759","article-title":"A New Amharic Speech Emotion Dataset and Classification Benchmark","volume":"22","author":"Retta","year":"2023","journal-title":"ACM Trans. Asian Low-Resour. Lang. Inf. Process."},{"key":"ref_106","first-page":"13093","article-title":"EmoMatchSpanishDB: Study of speech emotion recognition machine learning models in a new Spanish elicited database","volume":"83","author":"Salvador","year":"2024","journal-title":"Multimed. Tools Appl."},{"key":"ref_107","unstructured":"Christop, I. (2024, January 22\u201324). nEMO: Dataset of Emotional Speech in Polish. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy."},{"key":"ref_108","doi-asserted-by":"crossref","first-page":"5264","DOI":"10.3758\/s13428-023-02270-7","article-title":"A Cantonese Audio-Visual Emotional Speech (CAVES) dataset","volume":"56","author":"Chong","year":"2024","journal-title":"Behav. Res. Methods"},{"key":"ref_109","doi-asserted-by":"crossref","first-page":"1142","DOI":"10.1109\/TASLPRO.2025.3540662","article-title":"Emozionalmente: A Crowdsourced Corpus of Simulated Emotional Speech in Italian","volume":"33","author":"Catania","year":"2025","journal-title":"IEEE Trans. Audio, Speech Lang. 
Process."},{"key":"ref_110","first-page":"213","article-title":"A Novel Multi-Task and Ensembled Optimized Parallel Convolutional Autoencoder and Transformer for Speech Emotion Recognition","volume":"56","author":"Seyedin","year":"2024","journal-title":"AUT J. Electr. Eng."},{"key":"ref_111","doi-asserted-by":"crossref","first-page":"110070","DOI":"10.1016\/j.apacoust.2024.110070","article-title":"Enhancing speech emotion recognition through deep learning and handcrafted feature fusion","volume":"222","author":"Akbal","year":"2024","journal-title":"Appl. Acoust."},{"key":"ref_112","doi-asserted-by":"crossref","first-page":"110046","DOI":"10.1016\/j.apacoust.2024.110046","article-title":"Speech emotion recognition using a combination of variational mode decomposition and Hilbert transform","volume":"222","author":"Mishra","year":"2024","journal-title":"Appl. Acoust."},{"key":"ref_113","doi-asserted-by":"crossref","first-page":"112123","DOI":"10.1016\/j.knosys.2024.112123","article-title":"Speech emotion recognition based on bi-directional acoustic-articulatory conversion","volume":"299","author":"Li","year":"2024","journal-title":"Knowl.-Based Syst."},{"key":"ref_114","doi-asserted-by":"crossref","unstructured":"Li, L., Glackin, C., Cannings, N., Veneziano, V., Barker, J., Oduola, O., Woodruff, C., Laird, T., Laird, J., and Sun, Y. (2024, January 16\u201320). Investigating HuBERT-based Speech Emotion Recognition Generalisation Capability. Proceedings of the 23rd International Conference on Artificial Intelligence and Soft Computing 2024, Zakopane, Poland.","DOI":"10.1007\/978-3-031-84353-2_16"},{"key":"ref_115","doi-asserted-by":"crossref","first-page":"0088","DOI":"10.34133\/icomputing.0088","article-title":"A Systematic Evaluation of Adversarial Attacks against Speech Emotion Recognition Models","volume":"3","author":"Facchinetti","year":"2024","journal-title":"Intell. 
Comput."},{"key":"ref_116","doi-asserted-by":"crossref","first-page":"36397","DOI":"10.1109\/JIOT.2024.3406771","article-title":"An HASM-Assisted Voice Disguise Scheme for Emotion Recognition of IoT-enabled Voice Interface","volume":"11","author":"Chen","year":"2024","journal-title":"IEEE Internet Things J."},{"key":"ref_117","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1109\/TAFFC.2024.3411290","article-title":"Minority Views Matter: Evaluating Speech Emotion Classifiers with Human Subjective Annotations by an All-Inclusive Aggregation Rule","volume":"16","author":"Chou","year":"2024","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_118","doi-asserted-by":"crossref","first-page":"10623","DOI":"10.1109\/TMM.2024.3410133","article-title":"Improving Pre-trained Model-based Speech Emotion Recognition from a Low-level Speech Feature Perspective","volume":"26","author":"Liu","year":"2024","journal-title":"IEEE Trans. Multimed."},{"key":"ref_119","doi-asserted-by":"crossref","first-page":"426","DOI":"10.33411\/ijist\/202462426433","article-title":"Alex Net-Based Speech Emotion Recognition Using 3D Mel-Spectrograms","volume":"6","author":"Ali","year":"2024","journal-title":"Int. J. Innov. Sci. Technol."},{"key":"ref_120","doi-asserted-by":"crossref","unstructured":"Yue, L., Hu, P., and Zhu, J. (2024). Gender-Driven English Speech Emotion Recognition with Genetic Algorithm. Biomimetics, 9.","DOI":"10.3390\/biomimetics9060360"},{"key":"ref_121","doi-asserted-by":"crossref","unstructured":"Yan, J., Li, H., Xu, F., Zhou, X., Liu, Y., and Yang, Y. (2024). Speech Emotion Recognition Based on Temporal-Spatial Learnable Graph Convolutional Neural Network. Electronics, 13.","DOI":"10.3390\/electronics13112010"},{"key":"ref_122","doi-asserted-by":"crossref","unstructured":"Yu, S., Meng, J., Fan, W., Chen, Y., Zhu, B., Yu, H., Xie, Y., and Sun, Q. (2024). Speech Emotion Recognition Using Dual-Stream Representation and Cross-Attention Fusion. 
Electronics, 13.","DOI":"10.3390\/electronics13112191"},{"key":"ref_123","first-page":"4","article-title":"Odyssey 2024-Speech Emotion Recognition Challenge: Dataset, Baseline Framework, and Results","volume":"10","author":"Goncalves","year":"2024","journal-title":"Development"},{"key":"ref_124","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10489-024-05536-5","article-title":"Unveiling hidden factors: Explainable AI for feature boosting in speech emotion recognition","volume":"54","author":"Nfissi","year":"2024","journal-title":"Appl. Intell."},{"key":"ref_125","doi-asserted-by":"crossref","first-page":"5090","DOI":"10.1007\/s00034-024-02687-1","article-title":"Transfer Accent Identification Learning for Enhancing Speech Emotion Recognition","volume":"43","year":"2024","journal-title":"Circuits Syst. Signal Process."},{"key":"ref_126","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1007\/s10772-024-10109-5","article-title":"Speech emotion recognition with transfer learning and multi-condition training for noisy environments","volume":"27","author":"Haque","year":"2024","journal-title":"Int. J. Speech Technol."},{"key":"ref_127","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1007\/s11518-024-5607-y","article-title":"Self-supervised Learning for Speech Emotion Recognition Task Using Audio-visual Features and Distil Hubert Model on BAVED and RAVDESS Databases","volume":"33","author":"Dabbabi","year":"2024","journal-title":"J. Syst. Sci. Syst. 
Eng."},{"key":"ref_128","doi-asserted-by":"crossref","first-page":"111969","DOI":"10.1016\/j.knosys.2024.111969","article-title":"Speaker-aware cognitive network with cross-modal attention for multimodal emotion recognition in conversation","volume":"296","author":"Guo","year":"2024","journal-title":"Knowl.-Based Syst."},{"key":"ref_129","doi-asserted-by":"crossref","first-page":"14029","DOI":"10.1007\/s11042-024-19590-1","article-title":"Hierarchical speech emotion recognition using the valence-arousal model","volume":"84","author":"Haque","year":"2024","journal-title":"Multimed. Tools Appl."},{"key":"ref_130","doi-asserted-by":"crossref","first-page":"10155","DOI":"10.1007\/s11042-024-19321-6","article-title":"ADAM optimised human speech emotion recogniser based on statistical information distribution of chroma, MFCC, and MBSE features","volume":"84","author":"Khurana","year":"2024","journal-title":"Multimed. Tools Appl."},{"key":"ref_131","doi-asserted-by":"crossref","unstructured":"Tyagi, S., and Sz\u00e9n\u00e1si, S. (2024, January 5\u20137). Revolutionizing Speech Emotion Recognition: A Novel Hilbert Curve Approach for Two-Dimensional Representation and Convolutional Neural Network Classification. Proceedings of the International Conference on Robotics in Alpe-Adria Danube Region, Cluj-Napoca, Romania.","DOI":"10.1007\/978-3-031-59257-7_8"},{"key":"ref_132","unstructured":"Hama, K., Otsuka, A., and Ishii, R. (July, January 29). Emotion Recognition in Conversation with Multi-step Prompting Using Large Language Model. Proceedings of the 26th International Conference on Human-Computer Interaction, Washington Hilton Hotel, Washington DC, USA."},{"key":"ref_133","doi-asserted-by":"crossref","unstructured":"Akinpelu, S., Viriri, S., and Adegun, A. (2024). An enhanced speech emotion recognition using vision transformer. Sci. 
Rep., 14.","DOI":"10.1038\/s41598-024-63776-4"},{"key":"ref_134","doi-asserted-by":"crossref","unstructured":"Singla, C., Singh, S., Sharma, P., Mittal, N., and Gared, F. (2024). Emotion recognition for human\u2013computer interaction using high-level descriptors. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-59294-y"},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"606","DOI":"10.12928\/telkomnika.v22i3.25708","article-title":"Advancements in accurate speech emotion recognition through the integration of CNN-AM model","volume":"22","author":"Adebiyi","year":"2024","journal-title":"TELKOMNIKA (Telecommun. Comput. Electron. Control.)"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/10\/164\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T04:57:02Z","timestamp":1760590622000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/10\/164"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,14]]},"references-count":135,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["data10100164"],"URL":"https:\/\/doi.org\/10.3390\/data10100164","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,14]]}}}