{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:12:19Z","timestamp":1750219939714,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":38,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T00:00:00Z","timestamp":1667779200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,11,7]]},"DOI":"10.1145\/3536221.3557033","type":"proceedings-article","created":{"date-parts":[[2022,11,4]],"date-time":"2022-11-04T15:54:14Z","timestamp":1667577254000},"page":"724-729","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Multimodal Representations and Assessments of Emotional Fluctuations of Speakers in Call Centers Conversations"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6904-6005","authenticated-orcid":false,"given":"Yajing","family":"Feng","sequence":"first","affiliation":[{"name":"CNRS-LISN\/Paris-Saclay University, France and Axys Consultants, France"}]}],"member":"320","published-online":{"date-parts":[[2022,11,7]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers. Speech Communication 116 (Jan","author":"Ak\u00e7ay Mehmet\u00a0Berkehan","year":"2020","unstructured":"Mehmet\u00a0Berkehan Ak\u00e7ay and Kaya O\u011fuz . 2020. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers. Speech Communication 116 (Jan . 2020 ), 56\u201376. https:\/\/doi.org\/10.1016\/j.specom.2019.12.001 10.1016\/j.specom.2019.12.001 Mehmet\u00a0Berkehan Ak\u00e7ay and Kaya O\u011fuz. 2020. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers. Speech Communication 116 (Jan. 2020), 56\u201376. https:\/\/doi.org\/10.1016\/j.specom.2019.12.001"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN48605.2020.9207250"},{"key":"e_1_3_2_1_3_1","volume-title":"Multimodal fusion for multimedia analysis: A survey. Multimedia Systems 16 (Nov","author":"Atrey K.","year":"2010","unstructured":"Pradeep\u00a0 K. Atrey , M.\u00a0 Anwar Hossain , Abdulmotaleb\u00a0El Saddik , and Mohan\u00a0 S. Kankanhalli . 2010. Multimodal fusion for multimedia analysis: A survey. Multimedia Systems 16 (Nov . 2010 ), 345\u2013379. https:\/\/doi.org\/10.1007\/s00530-010-0182-0 10.1007\/s00530-010-0182-0 Pradeep\u00a0K. Atrey, M.\u00a0Anwar Hossain, Abdulmotaleb\u00a0El Saddik, and Mohan\u00a0S. Kankanhalli. 2010. Multimodal fusion for multimedia analysis: A survey. Multimedia Systems 16 (Nov. 2010), 345\u2013379. https:\/\/doi.org\/10.1007\/s00530-010-0182-0"},{"key":"e_1_3_2_1_4_1","volume-title":"LREC 2020 Proceedings of the 12th Language Resources and Evaluation Conference (May 2020","author":"Chen Y.","year":"2020","unstructured":"Eric\u00a0 Y. Chen , Zhiyun Lu , Hao Xu , Liangliang Cao , Yu Zhang , and James Fan . 2020 . A Large Scale Speech Sentiment Corpus . LREC 2020 Proceedings of the 12th Language Resources and Evaluation Conference (May 2020 ), 6549\u20136555. https:\/\/aclanthology.org\/2020.lrec-1.806 Eric\u00a0Y. Chen, Zhiyun Lu, Hao Xu, Liangliang Cao, Yu Zhang, and James Fan. 2020. A Large Scale Speech Sentiment Corpus. LREC 2020 Proceedings of the 12th Language Resources and Evaluation Conference (May 2020), 6549\u20136555. https:\/\/aclanthology.org\/2020.lrec-1.806"},{"key":"e_1_3_2_1_5_1","first-page":"6","article-title":"Development and validation of brief measures of positive and negative affect: the PANAS scales","volume":"54","author":"Watson D, Clark","year":"1988","unstructured":"Watson D, Clark LA, and Tellegen A. 1988 . Development and validation of brief measures of positive and negative affect: the PANAS scales . J Pers Soc Psychol 54 , 6 (June 1988), 1063\u201370. https:\/\/doi.org\/10.1037\/\/0022-3514.54.6.1063 Watson D, Clark LA, and Tellegen A.1988. Development and validation of brief measures of positive and negative affect: the PANAS scales. J Pers Soc Psychol 54, 6 (June 1988), 1063\u201370. https:\/\/doi.org\/10.1037\/\/0022-3514.54.6.1063","journal-title":"J Pers Soc Psychol"},{"key":"e_1_3_2_1_6_1","volume-title":"INTERSPEECH 2010 (Jan.","author":"Devillers Laurence","year":"2020","unstructured":"Laurence Devillers , Christophe Vaudable , and Cl\u00e9ment Chastagnol . 2020 . Real-life emotion-related states detection in call centers: a cross-corpora study . INTERSPEECH 2010 (Jan. 2020), 2350\u20132353. Laurence Devillers, Christophe Vaudable, and Cl\u00e9ment Chastagnol. 2020. Real-life emotion-related states detection in call centers: a cross-corpora study. INTERSPEECH 2010 (Jan. 2020), 2350\u20132353."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-275"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1002\/0470013494.ch3"},{"volume-title":"Emotion in the Human Face: Guidelines for Research and an Integration of Findings","author":"Ekman Paul","key":"e_1_3_2_1_9_1","unstructured":"Paul Ekman , Wallace\u00a0 V Friesen , and Phoebe Ellsworth . 2013. Emotion in the Human Face: Guidelines for Research and an Integration of Findings . Elsevier Science . Paul Ekman, Wallace\u00a0V Friesen, and Phoebe Ellsworth. 2013. Emotion in the Human Face: Guidelines for Research and an Integration of Findings. Elsevier Science."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2015.2457417"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2133366.2133372"},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the ACM Multimedia 2010 International Conference (Jan. 2010","author":"Eyben Florian","year":"2010","unstructured":"Florian Eyben , Martin W\u00f6llmer , and Bj\u00f6rn Schuller . 2010 . openSMILE \u2013 The Munich Versatile and Fast Open-Source Audio Feature Extractor. MM\u201910 - Proceedings of the ACM Multimedia 2010 International Conference (Jan. 2010 ), 1459\u20131462. https:\/\/doi.org\/10.1145\/1873951.1874246 10.1145\/1873951.1874246 Florian Eyben, Martin W\u00f6llmer, and Bj\u00f6rn Schuller. 2010. openSMILE \u2013 The Munich Versatile and Fast Open-Source Audio Feature Extractor. MM\u201910 - Proceedings of the ACM Multimedia 2010 International Conference (Jan. 2010), 1459\u20131462. https:\/\/doi.org\/10.1145\/1873951.1874246"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9280.2007.02024.x"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.3758\/s13428-017-0915-5"},{"key":"e_1_3_2_1_15_1","volume-title":"ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing 1 (March 1992","volume":"1","author":"Godfrey J.J.","year":"1992","unstructured":"J.J. Godfrey , E.C. Holliman , and J. McDaniel . 1992. SWITCHBOARD: telephone speech corpus for research and development . ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing 1 (March 1992 ), 517\u2013520 vol. 1 . https:\/\/doi.org\/10.1109\/ICASSP. 1992 .225858 10.1109\/ICASSP.1992.225858 J.J. Godfrey, E.C. Holliman, and J. McDaniel. 1992. SWITCHBOARD: telephone speech corpus for research and development. ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing 1 (March 1992), 517\u2013520 vol.1. https:\/\/doi.org\/10.1109\/ICASSP.1992.225858"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, Online, 3072\u20133079","author":"Prakash\u00a0Reddy Guda Bhanu","year":"2021","unstructured":"Bhanu Prakash\u00a0Reddy Guda , Aparna Garimella , and Niyati Chhaya . 2021 . EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction . In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, Online, 3072\u20133079 . https:\/\/doi.org\/10.18653\/v1\/2021.eacl-main.268 10.18653\/v1 Bhanu Prakash\u00a0Reddy Guda, Aparna Garimella, and Niyati Chhaya. 2021. EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, Online, 3072\u20133079. https:\/\/doi.org\/10.18653\/v1\/2021.eacl-main.268"},{"key":"e_1_3_2_1_17_1","volume-title":"EmoBed: Strengthening Monomodal Emotion Recognition via Training with Crossmodal Emotion Embeddings","author":"Han Jing","year":"2019","unstructured":"Jing Han , Zixing Zhang , Zhao Ren , and Bj\u00f6rn Schuller . 2019. EmoBed: Strengthening Monomodal Emotion Recognition via Training with Crossmodal Emotion Embeddings . IEEE Transactions on Affective Computing PP ( 07 2019 ), 1\u20131. https:\/\/doi.org\/10.1109\/TAFFC.2019.2928297 10.1109\/TAFFC.2019.2928297 Jing Han, Zixing Zhang, Zhao Ren, and Bj\u00f6rn Schuller. 2019. EmoBed: Strengthening Monomodal Emotion Recognition via Training with Crossmodal Emotion Embeddings. IEEE Transactions on Affective Computing PP (07 2019), 1\u20131. https:\/\/doi.org\/10.1109\/TAFFC.2019.2928297"},{"key":"#cr-split#-e_1_3_2_1_18_1.1","doi-asserted-by":"crossref","unstructured":"Jing Han Zixing Zhang Fabien Ringeval and Bj\u00f6rn Schuller. 2017. Prediction-based learning for continuous emotion recognition in speech. 5005-5009. https:\/\/doi.org\/10.1109\/ICASSP.2017.7953109 10.1109\/ICASSP.2017.7953109","DOI":"10.1109\/ICASSP.2017.7953109"},{"key":"#cr-split#-e_1_3_2_1_18_1.2","doi-asserted-by":"crossref","unstructured":"Jing Han Zixing Zhang Fabien Ringeval and Bj\u00f6rn Schuller. 2017. Prediction-based learning for continuous emotion recognition in speech. 5005-5009. https:\/\/doi.org\/10.1109\/ICASSP.2017.7953109","DOI":"10.1109\/ICASSP.2017.7953109"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.3390\/electronics10101163"},{"key":"e_1_3_2_1_20_1","volume-title":"AlloSat: A New Call Center French Corpus for Satisfaction and Frustration Analysis. Language Resources and Evaluation Conference, LREC 2020 (May","author":"Macary Manon","year":"2020","unstructured":"Manon Macary , Marie Tahon , Yannick Est\u00e8ve , and Anthony Rousseau . 2020 . AlloSat: A New Call Center French Corpus for Satisfaction and Frustration Analysis. Language Resources and Evaluation Conference, LREC 2020 (May 2020). https:\/\/hal.archives-ouvertes.fr\/hal-02506086 Manon Macary, Marie Tahon, Yannick Est\u00e8ve, and Anthony Rousseau. 2020. AlloSat: A New Call Center French Corpus for Satisfaction and Frustration Analysis. Language Resources and Evaluation Conference, LREC 2020 (May 2020). https:\/\/hal.archives-ouvertes.fr\/hal-02506086"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT48900.2021.9383456"},{"key":"e_1_3_2_1_22_1","volume-title":"Phonetics of Emotion in Russian Speech. (01","author":"Makarova Veronika","year":"2003","unstructured":"Veronika Makarova and Valery Petrushin . 2003. Phonetics of Emotion in Russian Speech. (01 2003 ). Veronika Makarova and Valery Petrushin. 2003. Phonetics of Emotion in Russian Speech. (01 2003)."},{"key":"e_1_3_2_1_23_1","volume-title":"Yoann Dupont, Laurent Romary, \u00c9ric Villemonte de\u00a0la Clergerie, Djam\u00e9 Seddah, and Beno\u00eet Sagot.","author":"Martin Louis","year":"2020","unstructured":"Louis Martin , Benjamin Muller , Pedro Javier\u00a0Ortiz Su\u00e1rez , Yoann Dupont, Laurent Romary, \u00c9ric Villemonte de\u00a0la Clergerie, Djam\u00e9 Seddah, and Beno\u00eet Sagot. 2020 . CamemBERT: a Tasty French Language Model. Association for Computational Linguistics (July 2020), 7203\u20137219. https:\/\/www.aclweb.org\/anthology\/2020.acl-main.645 Louis Martin, Benjamin Muller, Pedro Javier\u00a0Ortiz Su\u00e1rez, Yoann Dupont, Laurent Romary, \u00c9ric Villemonte de\u00a0la Clergerie, Djam\u00e9 Seddah, and Beno\u00eet Sagot. 2020. CamemBERT: a Tasty French Language Model. Association for Computational Linguistics (July 2020), 7203\u20137219. https:\/\/www.aclweb.org\/anthology\/2020.acl-main.645"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/T-AFFC.2011.9"},{"volume-title":"a Psychoevolutionary Synthesis","author":"Plutchik R.","key":"e_1_3_2_1_25_1","unstructured":"R. Plutchik . 1980. Emotion , a Psychoevolutionary Synthesis . Harper & Row . https:\/\/books.google.fr\/books?id=G5t9AAAAMAAJ R. Plutchik. 1980. Emotion, a Psychoevolutionary Synthesis. Harper & Row. https:\/\/books.google.fr\/books?id=G5t9AAAAMAAJ"},{"key":"e_1_3_2_1_26_1","volume-title":"In IEEE 2011 workshop.","author":"Povey Daniel","year":"2011","unstructured":"Daniel Povey , Arnab Ghoshal , Gilles Boulianne , Nagendra Goel , Mirko Hannemann , Yanmin Qian , Petr Schwarz , and Georg Stemmer . 2011 . The kaldi speech recognition toolkit . In In IEEE 2011 workshop. Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Nagendra Goel, Mirko Hannemann, Yanmin Qian, Petr Schwarz, and Georg Stemmer. 2011. The kaldi speech recognition toolkit. In In IEEE 2011 workshop."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3266302.3266316"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2013.6553805"},{"key":"e_1_3_2_1_29_1","volume-title":"Continuous Emotion Recognition in Speech \u2014 Do We Need Recurrence?Interspeech 2019 (Sept","author":"Schmitt Maximilian","year":"2019","unstructured":"Maximilian Schmitt , Nicholas Cummins , and Bj\u00f6rn Schuller . 2019. Continuous Emotion Recognition in Speech \u2014 Do We Need Recurrence?Interspeech 2019 (Sept . 2019 ), 2808\u20132812. https:\/\/doi.org\/10.21437\/Interspeech.2019-2710 10.21437\/Interspeech.2019-2710 Maximilian Schmitt, Nicholas Cummins, and Bj\u00f6rn Schuller. 2019. Continuous Emotion Recognition in Speech \u2014 Do We Need Recurrence?Interspeech 2019 (Sept. 2019), 2808\u20132812. https:\/\/doi.org\/10.21437\/Interspeech.2019-2710"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-1873"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3017462"},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019","author":"Javier\u00a0Ortiz Su\u00e1rez Pedro","year":"2019","unstructured":"Pedro Javier\u00a0Ortiz Su\u00e1rez , Beno\u00eet Sagot , and Laurent Romary . 2019 . Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures . Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019 . Cardiff, 22nd July 2019 (July 2019). https:\/\/doi.org\/10.14618\/ids-pub-9021 10.14618\/ids-pub-9021 Pedro Javier\u00a0Ortiz Su\u00e1rez, Beno\u00eet Sagot, and Laurent Romary. 2019. Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures. Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Cardiff, 22nd July 2019 (July 2019). https:\/\/doi.org\/10.14618\/ids-pub-9021"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472669"},{"key":"e_1_3_2_1_34_1","volume-title":"IEEE International Conference on Acoustics, Speech and Signal Processing (March 2012","author":"Vaudable Christophe","year":"2012","unstructured":"Christophe Vaudable and Laurence Devillers . 2012 . Negative emotions detection as an indicator of dialogs quality in call centers. ICASSP , IEEE International Conference on Acoustics, Speech and Signal Processing (March 2012 ). https:\/\/doi.org\/10.1109\/ICASSP.2012.6289070 10.1109\/ICASSP.2012.6289070 Christophe Vaudable and Laurence Devillers. 2012. Negative emotions detection as an indicator of dialogs quality in call centers. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing (March 2012). https:\/\/doi.org\/10.1109\/ICASSP.2012.6289070"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2005-582"},{"key":"e_1_3_2_1_36_1","volume-title":"2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 (02 2015","author":"Wei Jiamei","year":"2015","unstructured":"Jiamei Wei , Ercheng Pei , Dongmei Jiang , Hichem Sahli , Lei Xie , and Zhong-hua Fu. 2015 . Multimodal continuous affect recognition based on LSTM and multiple kernel learning . 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 (02 2015 ). https:\/\/doi.org\/10.1109\/APSIPA.2014.7041743 10.1109\/APSIPA.2014.7041743 Jiamei Wei, Ercheng Pei, Dongmei Jiang, Hichem Sahli, Lei Xie, and Zhong-hua Fu. 2015. Multimodal continuous affect recognition based on LSTM and multiple kernel learning. 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 (02 2015). https:\/\/doi.org\/10.1109\/APSIPA.2014.7041743"},{"key":"e_1_3_2_1_37_1","volume-title":"Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio. (01","author":"Weninger Felix","year":"2016","unstructured":"Felix Weninger , Fabien Ringeval , Erik Marchi , and Bj\u00f6rn Schuller . 2016. Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio. (01 2016 ). Felix Weninger, Fabien Ringeval, Erik Marchi, and Bj\u00f6rn Schuller. 2016. Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio. (01 2016)."}],"event":{"name":"ICMI '22: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"location":"Bengaluru India","acronym":"ICMI '22"},"container-title":["Proceedings of the 2022 International Conference on Multimodal Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3536221.3557033","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3536221.3557033","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:53Z","timestamp":1750182533000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3536221.3557033"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,7]]},"references-count":38,"alternative-id":["10.1145\/3536221.3557033","10.1145\/3536221"],"URL":"https:\/\/doi.org\/10.1145\/3536221.3557033","relation":{},"subject":[],"published":{"date-parts":[[2022,11,7]]},"assertion":[{"value":"2022-11-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}