{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,24]],"date-time":"2025-08-24T01:11:40Z","timestamp":1755997900231,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,20]],"date-time":"2021-10-20T00:00:00Z","timestamp":1634688000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"https:\/\/www.snf.ch"},{"name":"https:\/\/www.innosuisse.ch"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,24]]},"DOI":"10.1145\/3475957.3484448","type":"proceedings-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T23:34:16Z","timestamp":1634340856000},"page":"51-59","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Fusion of Acoustic and Linguistic Information using Supervised Autoencoder for Improved Emotion Recognition"],"prefix":"10.1145","author":[{"given":"Bogdan","family":"Vlasenko","sequence":"first","affiliation":[{"name":"Idiap Research Institute, Martigny, Switzerland"}]},{"given":"RaviShankar","family":"Prasad","sequence":"additional","affiliation":[{"name":"Idiap Research Institute, Martigny, Switzerland"}]},{"given":"Mathew","family":"Magimai.-Doss","sequence":"additional","affiliation":[{"name":"Idiap Research Institute, Martigny, 
Switzerland"}]}],"member":"320","published-online":{"date-parts":[[2021,10,20]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2017.2764438"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414866"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3129340"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3423327.3423673"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2020.01.011"},{"key":"e_1_3_2_1_6_1","first-page":"369","volume-title":"Hongtao Song. WISE: Word-Level Interaction-Based Multimodal Fusion for Speech Emotion Recognition. In Proc. Interspeech 2020","author":"Shen Guang","year":"2020"},{"key":"e_1_3_2_1_7_1","first-page":"384","volume-title":"Shiva Sundaram. Multi-Modal Embeddings Using Multi-Task Learning for Emotion Recognition. In Proc. Interspeech 2020","author":"Khare Aparna","year":"2020"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2018.8639583"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Agnes Moors Jan De Houwer Dirk Hermans Sabine Wanmaker Kevin Van Schie Anne-Laura Van Harmelen Maarten De Schryver Jeffrey De Winne and Marc Brysbaert. Norms of valence arousal dominance and age of acquisition for 4 300 dutch words. 
Behavior research methods 45 (1): 169--177 2013.","DOI":"10.3758\/s13428-012-0243-8"},{"key":"e_1_3_2_1_10_1","article-title":"Real-time video emotion recognition based on reinforcement learning and domain knowledge","author":"Zhang Ke","year":"2021","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2021.3062200"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT48900.2021.9383542"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2017.8019533"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2010.8"},{"key":"e_1_3_2_1_15_1","first-page":"6525","volume-title":"Proceedings ICASSP 2019","author":"Pavan Kumar D S","year":"2019"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2021-1288"},{"volume-title":"The multimodal sentiment analysis in car reviews (muse-car) dataset: Collection, insights and improvements","year":"2021","author":"Stappen Lukas","key":"e_1_3_2_1_17_1"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1874246"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Florian Eyben Klaus R Scherer Bj\u00f6rn W Schuller Johan Sundberg Elisabeth Andr\u00e9 Carlos Busso Laurence Y Devillers Julien Epps Petri Laukka Shrikanth S Narayanan et al. The geneva minimalistic acoustic parameter set (gemaps) for voice research and affective computing. 
IEEE transactions on affective computing 7 (2): 190--202 2015.","DOI":"10.1109\/TAFFC.2015.2457417"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMSP.2019.8901779"},{"volume-title":"Casper Kaandorp. The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates. In Proceedings INTERSPEECH 2021","year":"2021","author":"Schuller W.","key":"e_1_3_2_1_21_1"},{"volume-title":"wav2vec 2.0: A framework for self-supervised learning of speech representations. arXiv preprint arXiv:2006.11477","year":"2020","author":"Baevski Alexei","key":"e_1_3_2_1_22_1"},{"volume-title":"Emotion recognition from speech using wav2vec 2.0 embeddings. arXiv preprint arXiv:2104.03502","year":"2021","author":"Pepino Leonardo","key":"e_1_3_2_1_23_1"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3423327.3423672"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2015-670"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-1124"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/3122009.3176840"},{"volume-title":"Proceedings of the 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS), number CONF. CEUR Workshop Proceedings","year":"2020","author":"Parida Shantipriya","key":"e_1_3_2_1_29_1"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/3326943.3326954"},{"volume-title":"A classification supervised auto-encoder based on predefined evenly-distributed class centroids. 
arXiv preprint arXiv:1902.00220","year":"2019","author":"Zhu Qiuyu","key":"e_1_3_2_1_31_1"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999325.2999464"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/2503308.2188395"}],"event":{"name":"MM '21: ACM Multimedia Conference","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Virtual Event China","acronym":"MM '21"},"container-title":["Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3475957.3484448","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3475957.3484448","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:33Z","timestamp":1750195713000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3475957.3484448"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,20]]},"references-count":33,"alternative-id":["10.1145\/3475957.3484448","10.1145\/3475957"],"URL":"https:\/\/doi.org\/10.1145\/3475957.3484448","relation":{},"subject":[],"published":{"date-parts":[[2021,10,20]]},"assertion":[{"value":"2021-10-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}