{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T18:00:54Z","timestamp":1774720854522,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,20]],"date-time":"2021-10-20T00:00:00Z","timestamp":1634688000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,24]]},"DOI":"10.1145\/3475957.3484455","type":"proceedings-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T23:34:16Z","timestamp":1634340856000},"page":"21-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Multi-modal Fusion for Continuous Emotion Recognition by Using Auto-Encoders"],"prefix":"10.1145","author":[{"given":"Salam","family":"Hamieh","sequence":"first","affiliation":[{"name":"CEA, Grenoble, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vincent","family":"Heiries","sequence":"additional","affiliation":[{"name":"CEA, Grenoble, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hussein","family":"Al Osman","sequence":"additional","affiliation":[{"name":"University of Ottawa, Ottawa, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christelle","family":"Godin","sequence":"additional","affiliation":[{"name":"CEA, Grenoble , France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,20]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-434"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3347320.3357690"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133944.3133949"},{"key":"e_1_3_2_1_4_1","volume-title":"Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. ArXiv14061078 Cs Stat (September","author":"Cho Kyunghyun","year":"2014","unstructured":"Kyunghyun Cho , Bart van Merrienboer , Caglar Gulcehre , Dzmitry Bahdanau , Fethi Bougares , Holger Schwenk , and Yoshua Bengio . 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. ArXiv14061078 Cs Stat (September 2014 ). Retrieved August 6, 2021 from http:\/\/arxiv.org\/abs\/1406.1078 Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. ArXiv14061078 Cs Stat (September 2014). Retrieved August 6, 2021 from http:\/\/arxiv.org\/abs\/1406.1078"},{"key":"e_1_3_2_1_5_1","volume-title":"Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. ArXiv14123555 Cs (December","author":"Chung Junyoung","year":"2014","unstructured":"Junyoung Chung , Caglar Gulcehre , KyungHyun Cho , and Yoshua Bengio . 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. ArXiv14123555 Cs (December 2014 ). Retrieved August 6, 2021 from http:\/\/arxiv.org\/abs\/1412.3555 Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. ArXiv14123555 Cs (December 2014). Retrieved August 6, 2021 from http:\/\/arxiv.org\/abs\/1412.3555"},{"key":"e_1_3_2_1_6_1","volume-title":"Retrieved","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv181004805 Cs (May 2019) . Retrieved July 21, 2021 from http:\/\/arxiv.org\/abs\/1810.04805 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv181004805 Cs (May 2019). Retrieved July 21, 2021 from http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_2_1_7_1","volume-title":"Handbook of cognition and emotion","author":"Ekman Paul","unstructured":"Paul Ekman . 1999. Basic emotions . In Handbook of cognition and emotion . John Wiley & Sons Ltd , New York, NY , US, 45--60. DOI:https:\/\/doi.org\/10.1002\/0470013494.ch3 10.1002\/0470013494.ch3 Paul Ekman. 1999. Basic emotions. In Handbook of cognition and emotion. John Wiley & Sons Ltd, New York, NY, US, 45--60. DOI:https:\/\/doi.org\/10.1002\/0470013494.ch3"},{"key":"e_1_3_2_1_8_1","volume-title":"The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing","author":"Eyben Florian","unstructured":"Florian Eyben , Klaus Scherer , Laurence Devillers , Julien Epps , Petri Laukka , Shrikanth Narayanan , and Khiet Truong . The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing . IEEE Trans. Affect. Comput ., 14. Florian Eyben, Klaus Scherer, Laurence Devillers, Julien Epps, Petri Laukka, Shrikanth Narayanan, and Khiet Truong. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing. IEEE Trans. Affect. Comput., 14."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1874246"},{"key":"e_1_3_2_1_10_1","first-page":"26","article-title":"Theory of communication","volume":"93","author":"Gabor D.","year":"1946","unstructured":"D. Gabor . 1946 . Theory of communication . Part 1: The analysis of J. Inst. Electr. Eng. - Part III Radio Commun. Eng. 93 , 26 (November 1946), 429--441. DOI:https:\/\/doi.org\/10.1049\/ji-3--2.1946.0074 10.1049\/ji-3--2.1946.0074 D. Gabor. 1946. Theory of communication. Part 1: The analysis of information. J. Inst. Electr. Eng. - Part III Radio Commun. Eng. 93, 26 (November 1946), 429--441. DOI:https:\/\/doi.org\/10.1049\/ji-3--2.1946.0074","journal-title":"J. Inst. Electr. Eng. - Part III Radio Commun. Eng."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123383"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"e_1_3_2_1_13_1","volume-title":"Hockenbury","author":"Hockenbury Don H.","year":"2007","unstructured":"Don H. Hockenbury and Sandra E . Hockenbury . 2007 . Discovering psychology, 4 th ed. Worth Publishers , New York, NY, US. Don H. Hockenbury and Sandra E. Hockenbury. 2007. Discovering psychology, 4th ed. Worth Publishers, New York, NY, US.","edition":"4"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3266302.3266304"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1159\/000119004"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.2307\/2532051"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988267"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/4694860"},{"key":"e_1_3_2_1_19_1","volume-title":"Interspeech","author":"Niu Mingyue","year":"2019","unstructured":"Mingyue Niu , Jianhua Tao , Bin Liu , and Cunhang Fan . 2019. Automatic Depression Level Detection via \"p-Norm Pooling . In Interspeech 2019 , ISCA , 4559--4563. DOI:https:\/\/doi.org\/10.21437\/Interspeech.2019--1617 10.21437\/Interspeech.2019--1617 Mingyue Niu, Jianhua Tao, Bin Liu, and Cunhang Fan. 2019. Automatic Depression Level Detection via \"p-Norm Pooling. In Interspeech 2019, ISCA, 4559--4563. DOI:https:\/\/doi.org\/10.21437\/Interspeech.2019--1617"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2002.1017623"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2019.2905234"},{"key":"e_1_3_2_1_22_1","volume-title":"Retrieved","author":"Parkhi O. M.","year":"2021","unstructured":"O. M. Parkhi , A. Vedaldi , and A. Zisserman . 2015. Deep face recognition. (2015) . Retrieved July 21, 2021 from https:\/\/ora.ox.ac.uk\/objects\/uuid:a5f2e93f-2768--45bb-8508--74747f85cad1 O. M. Parkhi, A. Vedaldi, and A. Zisserman. 2015. Deep face recognition. (2015). Retrieved July 21, 2021 from https:\/\/ora.ox.ac.uk\/objects\/uuid:a5f2e93f-2768--45bb-8508--74747f85cad1"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2015.2446462"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TETCI.2017.2762739"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3347320.3357688"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133944.3133953"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2808196.2811642"},{"key":"e_1_3_2_1_28_1","volume-title":"Rosenberg and Paul Ekman","author":"Erika","year":"2020","unstructured":"Erika L. Rosenberg and Paul Ekman . 2020 . What the Face Reveals : Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press . Erika L. Rosenberg and Paul Ekman. 2020. What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0077714"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2512530.2512534"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/2062850.2062907"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2388676.2388758"},{"key":"e_1_3_2_1_33_1","volume-title":"Retrieved","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman . 2015 . Very Deep Convolutional Networks for Large-Scale Image Recognition. ArXiv14091556 Cs (April 2015) . Retrieved July 18, 2021 from http:\/\/arxiv.org\/abs\/1409.1556 Karen Simonyan and Andrew Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. ArXiv14091556 Cs (April 2015). Retrieved July 18, 2021 from http:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3475957.3484450"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3423327.3423673"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3423327.3423672"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472669"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2017.2764438"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988257.2988258"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661806.2661807"},{"key":"e_1_3_2_1_41_1","volume-title":"Mohamad Nizam Bin Ayub, Hannyzzura Binti Affal, and Nornazlita Binti Hussin.","author":"Yadegaridehkordi Elaheh","year":"2019","unstructured":"Elaheh Yadegaridehkordi , Nurul Fazmidar Binti Mohd Noor , Mohamad Nizam Bin Ayub, Hannyzzura Binti Affal, and Nornazlita Binti Hussin. 2019 . Affective computing in education: A systematic review and future research. Comput. Educ . 142, (December 2019), 103649. DOI:https:\/\/doi.org\/10.1016\/j.compedu.2019.103649 10.1016\/j.compedu.2019.103649 Elaheh Yadegaridehkordi, Nurul Fazmidar Binti Mohd Noor, Mohamad Nizam Bin Ayub, Hannyzzura Binti Affal, and Nornazlita Binti Hussin. 2019. Affective computing in education: A systematic review and future research. Comput. Educ. 142, (December 2019), 103649. DOI:https:\/\/doi.org\/10.1016\/j.compedu.2019.103649"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2018.2871949"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3266302.3266313"},{"key":"e_1_3_2_1_44_1","volume-title":"An Imperative Style","year":"2021","unstructured":"PyTorch : An Imperative Style , High-Performance Deep Learning Library . Retrieved August 7, 2021 from https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/bdbca288fee7f92f2bfa9f7012727740-Abstract.html PyTorch: An Imperative Style, High-Performance Deep Learning Library. Retrieved August 7, 2021 from https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/bdbca288fee7f92f2bfa9f7012727740-Abstract.html"}],"event":{"name":"MM '21: ACM Multimedia Conference","location":"Virtual Event China","acronym":"MM '21","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3475957.3484455","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3475957.3484455","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:33Z","timestamp":1750195713000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3475957.3484455"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,20]]},"references-count":44,"alternative-id":["10.1145\/3475957.3484455","10.1145\/3475957"],"URL":"https:\/\/doi.org\/10.1145\/3475957.3484455","relation":{},"subject":[],"published":{"date-parts":[[2021,10,20]]},"assertion":[{"value":"2021-10-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}