{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T20:25:15Z","timestamp":1776889515370,"version":"3.51.2"},"reference-count":24,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,1,7]],"date-time":"2023-01-07T00:00:00Z","timestamp":1673049600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61975053"],"award-info":[{"award-number":["61975053"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["22A510013"],"award-info":[{"award-number":["22A510013"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["22A520004"],"award-info":[{"award-number":["22A520004"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["22A510001"],"award-info":[{"award-number":["22A510001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2018BS037"],"award-info":[{"award-number":["2018BS037"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Project of Henan Education Department","award":["61975053"],"award-info":[{"award-number":["61975053"]}]},{"name":"Natural Science Project of Henan Education Department","award":["22A510013"],"award-info":[{"award-number":["22A510013"]}]},{"name":"Natural Science Project of Henan Education Department","award":["22A520004"],"award-info":[{"award-number":["22A520004"]}]},{"name":"Natural Science Project of Henan Education Department","award":["22A510001"],"award-info":[{"award-number":["22A510001"]}]},{"name":"Natural Science Project of Henan Education Department","award":["2018BS037"],"award-info":[{"award-number":["2018BS037"]}]},{"name":"Start-up Fund for High-level Talents of Henan University of Technology","award":["61975053"],"award-info":[{"award-number":["61975053"]}]},{"name":"Start-up Fund for High-level Talents of Henan University of Technology","award":["22A510013"],"award-info":[{"award-number":["22A510013"]}]},{"name":"Start-up Fund for High-level Talents of Henan University of Technology","award":["22A520004"],"award-info":[{"award-number":["22A520004"]}]},{"name":"Start-up Fund for High-level Talents of Henan University of Technology","award":["22A510001"],"award-info":[{"award-number":["22A510001"]}]},{"name":"Start-up Fund for High-level Talents of Henan University of Technology","award":["2018BS037"],"award-info":[{"award-number":["2018BS037"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>To solve the problem of feature distribution discrepancy in cross-corpus speech emotion recognition tasks, this paper proposed an emotion recognition model based on multi-task learning and subdomain adaptation, which alleviates the impact on emotion recognition. Existing methods have shortcomings in speech feature representation and cross-corpus feature distribution alignment. The proposed model uses a deep denoising auto-encoder as a shared feature extraction network for multi-task learning, and the fully connected layer and softmax layer are added before each recognition task as task-specific layers. Subsequently, the subdomain adaptation algorithm of emotion and gender features is added to the shared network to obtain the shared emotion features and gender features of the source domain and target domain, respectively. Multi-task learning effectively enhances the representation ability of features, a subdomain adaptive algorithm promotes the migrating ability of features and effectively alleviates the impact of feature distribution differences in emotional features. The average results of six cross-corpus speech emotion recognition experiments show that, compared with other models, the weighted average recall rate is increased by 1.89~10.07%, the experimental results verify the validity of the proposed model.<\/jats:p>","DOI":"10.3390\/e25010124","type":"journal-article","created":{"date-parts":[[2023,1,9]],"date-time":"2023-01-09T02:09:39Z","timestamp":1673230179000},"page":"124","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation"],"prefix":"10.3390","volume":"25","author":[{"given":"Hongliang","family":"Fu","sequence":"first","affiliation":[{"name":"College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Key Laboratory of Food Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhihao","family":"Zhuang","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yang","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen","family":"Huang","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenzhuo","family":"Duan","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China"},{"name":"Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/MSP.2021.3106890","article-title":"On the Evolution of Speech Representations for Affective Computing: A brief history and critical overview","volume":"38","author":"Alisamir","year":"2021","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"9411","DOI":"10.1007\/s11042-020-10073-7","article-title":"Automatic speech recognition: A survey","volume":"80","author":"Malik","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1853","DOI":"10.1109\/TASLP.2022.3178225","article-title":"Neonatal Bowel Sound Detection Using Convolutional Neural Network and Laplace Hidden Semi-Markov Model","volume":"30","author":"Sitaula","year":"2022","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Burne, L., Sitaula, C., Priyadarshi, A., Tracy, M., Kavehei, O., Hinder, M., Withana, A., McEwan, A., and Marzbanrad, F. Ensemble Approach on Deep and Handcrafted Features for Neonatal Bowel Sound Detection. IEEE J. Biomed. Health Inform., 2022.","DOI":"10.1109\/JBHI.2022.3217559"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lee, S. (2021, January 19\u201322). Domain Generalization with Triplet Network for Cross-Corpus Speech Emotion Recognition. Proceedings of the IEEE Spoken Language Technology Workshop, Shenzhen, China.","DOI":"10.1109\/SLT48900.2021.9383534"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Antoniadis, P., Filntisis, P.P., and Maragos, P. (2021, January 15\u201318). Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition. Proceedings of the 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), Jodhpur, India.","DOI":"10.1109\/FG52635.2021.9667014"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1016\/j.neucom.2022.10.013","article-title":"In search of a robust facial expressions recognition model: A large-scale visual cross-corpus study","volume":"514","author":"Ryumina","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2132","DOI":"10.1109\/TAFFC.2022.3188390","article-title":"Classifying Emotions and Engagement in Online Learning Based on a Single Facial Expression Recognition Neural Network","volume":"13","author":"Savchenko","year":"2022","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3204314","DOI":"10.1109\/TIM.2022.3204314","article-title":"A Multi-Dimensional Graph Convolution Network for EEG Emotion Recognition","volume":"71","author":"Du","year":"2022","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"5321","DOI":"10.1109\/JBHI.2021.3083525","article-title":"3DCANN: A spatio-temporal convolution attention neural network for EEG emotion recognition","volume":"26","author":"Liu","year":"2021","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1068","DOI":"10.1109\/LSP.2014.2324759","article-title":"Autoencoder-based unsupervised domain adaptation for speech emotion recognition","volume":"21","author":"Deng","year":"2014","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"6785","DOI":"10.1007\/s11042-016-3354-x","article-title":"Unsupervised domain adaptation for speech emotion recognition using PCANet","volume":"76","author":"Huang","year":"2017","journal-title":"Multimed. Tools Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1109\/LSP.2016.2537926","article-title":"Cross-corpus speech emotion recognition based on domain-adaptive least-squares regression","volume":"23","author":"Zong","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Liu, N., Zong, Y., Zhang, B., Liu, L., Chen, J., Zhao, G., and Zhu, J. (2018, January 15\u201320). Unsupervised cross-corpus speech emotion recognition using domain-adaptive subspace learning. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada.","DOI":"10.1109\/ICASSP.2018.8461848"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1109\/TAFFC.2017.2705696","article-title":"Transfer linear subspace learning for cross-corpus speech emotion recognition","volume":"10","author":"Song","year":"2019","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2047","DOI":"10.1109\/TASLP.2020.3006331","article-title":"Nonnegative matrix factorization based transfer subspace learning for cross-corpus speech emotion recognition","volume":"28","author":"Luo","year":"2020","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1587\/transinf.2019EDL8136","article-title":"Cross-corpus speech emotion recognition based on deep domain-adaptive convolutional neural network","volume":"103","author":"Liu","year":"2020","journal-title":"IEICE Trans. Inf. Syst."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1713","DOI":"10.1109\/TNNLS.2020.2988928","article-title":"Deep subdomain adaptation network for image classification","volume":"32","author":"Zhu","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W.F., and Weiss, B. (2005, January 4\u20138). A-corpus of German emotional speech. Proceedings of the Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal.","DOI":"10.21437\/Interspeech.2005-446"},{"key":"ref_20","unstructured":"Martin, O., Kotsia, I., Macq, B., and Pitas, I. (2006, January 3\u20137). The eNTERFACE\u201905 audio-visual emotion-corpus. Proceedings of the 22nd International Conference on Data Engineering Workshops, Atlanta, GA, USA."},{"key":"ref_21","unstructured":"Tao, J., Liu, F., Zhang, M., and Jia, H. (2008, January 20). Design of speech corpus for mandarin text to speech. Proceedings of the Blizzard Challenge 2008 Workshop, Brisbane Australia."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1109\/TASLP.2019.2955252","article-title":"Transfer sparse discriminant subspace learning for cross-corpus speech emotion recognition","volume":"28","author":"Zhang","year":"2019","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Eyben, F., W\u00f6llmer, M., and Schuller, B. (2010, January 25\u201329). Opensmile: The munich versatile and fast open-source audio feature extractor. Proceedings of the 18th ACM International Conference on Multimedia, Firenze Italy.","DOI":"10.1145\/1873951.1874246"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Latif, S., Rana, R., Younis, S., Qadir, J., and Epps, J. (2018). Transfer learning for improving speech emotion classification accuracy. arXiv.","DOI":"10.21437\/Interspeech.2018-1625"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/1\/124\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:02:36Z","timestamp":1760119356000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/1\/124"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,7]]},"references-count":24,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,1]]}},"alternative-id":["e25010124"],"URL":"https:\/\/doi.org\/10.3390\/e25010124","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,7]]}}}