{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T11:51:16Z","timestamp":1770292276415,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T00:00:00Z","timestamp":1665360000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China (NSFC)","award":["No. 61831022? No.U21B2010? No.61901473? No.62101553"],"award-info":[{"award-number":["No. 61831022? No.U21B2010? No.61901473? No.62101553"]}]},{"name":"Open Research Projects of Zhejiang Lab","award":["NO. 2021KH0AB06"],"award-info":[{"award-number":["NO. 2021KH0AB06"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,10]]},"DOI":"10.1145\/3551876.3554811","type":"proceedings-article","created":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T22:17:21Z","timestamp":1664403441000},"page":"61-66","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":21,"title":["Multimodal Temporal Attention in Sentiment Analysis"],"prefix":"10.1145","author":[{"given":"Yu","family":"He","sequence":"first","affiliation":[{"name":"School of Artificial Intelligence, University of Chinese Academy of Sciences &amp; NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"given":"Licai","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence, University of Chinese Academy of Sciences &amp; NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"given":"Zheng","family":"Lian","sequence":"additional","affiliation":[{"name":"NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, 
China"}]},{"given":"Bin","family":"Liu","sequence":"additional","affiliation":[{"name":"NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"given":"Jianhua","family":"Tao","sequence":"additional","affiliation":[{"name":"NLPR, Institute of Automation, Chinese Academy of Sciences &amp; School of Artificial Intelligence, University of Chinese Academy of Sciences &amp; CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China"}]},{"given":"Meng","family":"Wang","sequence":"additional","affiliation":[{"name":"Ant Financial Services Group, Hangzhou, China"}]},{"given":"Yuan","family":"Cheng","sequence":"additional","affiliation":[{"name":"Ant Financial Services Group, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2022,10,10]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3551792"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"crossref","unstructured":"Shahin Amiriparian Maurice Gerczuk Sandra Ottl Nicholas Cummins Michael Freitag Sergey Pugachevskiy Alice Baird and Bj\u00f6rn Schuller. 2017. Snore sound classification using image-based deep spectrum features. (2017).  Shahin Amiriparian Maurice Gerczuk Sandra Ottl Nicholas Cummins Michael Freitag Sergey Pugachevskiy Alice Baird and Bj\u00f6rn Schuller. 2017. Snore sound classification using image-based deep spectrum features. (2017).","DOI":"10.21437\/Interspeech.2017-434"},{"key":"e_1_3_2_2_3_1","volume-title":"Jamie Ryan Kiros, and Geoffrey E Hinton","author":"Ba Jimmy Lei","year":"2016","unstructured":"Jimmy Lei Ba , Jamie Ryan Kiros, and Geoffrey E Hinton . 2016 . Layer normalization. arXiv preprint arXiv:1607.06450 (2016). Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normalization. arXiv preprint arXiv:1607.06450 (2016)."},{"key":"e_1_3_2_2_4_1","volume-title":"An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. 
arXiv preprint arXiv:1803.01271","author":"Bai Shaojie","year":"2018","unstructured":"Shaojie Bai , J Zico Kolter , and Vladlen Koltun . 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271 ( 2018 ). Shaojie Bai, J Zico Kolter, and Vladlen Koltun. 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271 (2018)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2993148.2993165"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3475957.3484454"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3037496"},{"key":"e_1_3_2_2_8_1","volume-title":"Proceedings of the 3rd Multimodal Sentiment Analysis Challenge. Association for Computing Machinery","author":"Christ Lukas","year":"2022","unstructured":"Lukas Christ , Shahin Amiriparian , Alice Baird , Panagiotis Tzirakis , Alexander Kathan , Niklas M\u00fcller , Lukas Stappen , Eva-Maria Me\u00dfner , Andreas K\u00f6nig , Alan Cowen , Erik Cambria , and Bj\u00f6rn W. Schuller . 2022. The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress . In Proceedings of the 3rd Multimodal Sentiment Analysis Challenge. Association for Computing Machinery , Lisbon, Portugal. Workshop held at ACM Multimedia 2022 , to appear. Lukas Christ, Shahin Amiriparian, Alice Baird, Panagiotis Tzirakis, Alexander Kathan, Niklas M\u00fcller, Lukas Stappen, Eva-Maria Me\u00dfner, Andreas K\u00f6nig, Alan Cowen, Erik Cambria, and Bj\u00f6rn W. Schuller. 2022. The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress. In Proceedings of the 3rd Multimodal Sentiment Analysis Challenge. Association for Computing Machinery, Lisbon, Portugal. 
Workshop held at ACM Multimedia 2022, to appear."},{"key":"e_1_3_2_2_9_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs\/1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs\/1810.04805 (2018). arXiv:1810.04805 http:\/\/arxiv.org\/abs\/1810.04805 Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR abs\/1810.04805 (2018). arXiv:1810.04805 http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2015.2457417"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1874246"},{"key":"e_1_3_2_2_12_1","volume-title":"Sharpness-aware minimization for efficiently improving generalization. arXiv preprint arXiv:2010.01412","author":"Foret Pierre","year":"2020","unstructured":"Pierre Foret , Ariel Kleiner , Hossein Mobahi , and Behnam Neyshabur . 2020. Sharpness-aware minimization for efficiently improving generalization. arXiv preprint arXiv:2010.01412 ( 2020 ). Pierre Foret, Ariel Kleiner, Hossein Mobahi, and Behnam Neyshabur. 2020. Sharpness-aware minimization for efficiently improving generalization. arXiv preprint arXiv:2010.01412 (2020)."},{"key":"e_1_3_2_2_13_1","volume-title":"Long short-term memory. Neural computation 9, 8","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber . 1997. Long short-term memory. Neural computation 9, 8 ( 1997 ), 1735--1780. Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735--1780."},{"key":"e_1_3_2_2_14_1","volume-title":"Multimodal Transformer Fusion for Continuous Emotion Recognition. 
In ICASSP 2020--2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 3507--3511","author":"Huang Jian","year":"2020","unstructured":"Jian Huang , Jianhua Tao , Bin Liu , Zheng Lian , and Mingyue Niu . 2020 . Multimodal Transformer Fusion for Continuous Emotion Recognition. In ICASSP 2020--2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 3507--3511 . Jian Huang, Jianhua Tao, Bin Liu, Zheng Lian, and Mingyue Niu. 2020. Multimodal Transformer Fusion for Continuous Emotion Recognition. In ICASSP 2020--2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 3507--3511."},{"key":"e_1_3_2_2_15_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba . 2014 . Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014). Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1159\/000119004"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2020.3030497"},{"key":"e_1_3_2_2_18_1","volume-title":"A concordance correlation coefficient to evaluate reproducibility. Biometrics","author":"Lawrence I","year":"1989","unstructured":"I Lawrence and Kuei Lin . 1989. A concordance correlation coefficient to evaluate reproducibility. Biometrics ( 1989 ), 255--268. I Lawrence and Kuei Lin. 1989. A concordance correlation coefficient to evaluate reproducibility. Biometrics (1989), 255--268."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2022.3160373"},{"key":"e_1_3_2_2_20_1","volume-title":"NeuroKit2: A Python toolbox for neurophysiological signal processing. 
Behavior research methods 53, 4","author":"Makowski Dominique","year":"2021","unstructured":"Dominique Makowski , Tam Pham , Zen J Lau , Jan C Brammer , Fran\u00e7ois Lespinasse , Hung Pham , Christopher Sch\u00f6lzel , and SH Chen . 2021. NeuroKit2: A Python toolbox for neurophysiological signal processing. Behavior research methods 53, 4 ( 2021 ), 1689--1696. Dominique Makowski, Tam Pham, Zen J Lau, Jan C Brammer, Fran\u00e7ois Lespinasse, Hung Pham, Christopher Sch\u00f6lzel, and SH Chen. 2021. NeuroKit2: A Python toolbox for neurophysiological signal processing. Behavior research methods 53, 4 (2021), 1689--1696."},{"key":"e_1_3_2_2_21_1","volume-title":"Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems. 8026--8037.","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke , Sam Gross , Francisco Massa , Adam Lerer , James Bradbury , Gregory Chanan , Trevor Killeen , Zeming Lin , Natalia Gimelshein , Luca Antiga , 2019 . Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems. 8026--8037. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems. 8026--8037."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00806"},{"key":"e_1_3_2_2_23_1","volume-title":"Julien Chappe, and Fan Yang.","author":"Sabour Rita Meziati","year":"2021","unstructured":"Rita Meziati Sabour , Yannick Benezeth , Pierre De Oliveira , Julien Chappe, and Fan Yang. 2021 . Ubfc-phys : A multimodal database for psychophysiological studies of social stress. IEEE Transactions on Affective Computing ( 2021). Rita Meziati Sabour, Yannick Benezeth, Pierre De Oliveira, Julien Chappe, and Fan Yang. 2021. 
Ubfc-phys: A multimodal database for psychophysiological studies of social stress. IEEE Transactions on Affective Computing (2021)."},{"key":"e_1_3_2_2_24_1","volume-title":"The INTERSPEECH 2010 paralinguistic challenge. In Proc. INTERSPEECH 2010","author":"Schuller Bj\u00f6rn","year":"2010","unstructured":"Bj\u00f6rn Schuller , Stefan Steidl , Anton Batliner , Felix Burkhardt , Laurence Devillers , Christian M\u00fcller , and Shrikanth Narayanan . 2010 . The INTERSPEECH 2010 paralinguistic challenge. In Proc. INTERSPEECH 2010 , Makuhari, Japan. 2794-- 2797. Bj\u00f6rn Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian M\u00fcller, and Shrikanth Narayanan. 2010. The INTERSPEECH 2010 paralinguistic challenge. In Proc. INTERSPEECH 2010, Makuhari, Japan. 2794-- 2797."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3475957.3484450"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3423327.3423672"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3475957.3484456"},{"key":"e_1_3_2_2_28_1","volume-title":"Proceedings of the conference. Association for Computational Linguistics. Meeting","volume":"2019","author":"Hubert Tsai Yao-Hung","year":"2019","unstructured":"Yao-Hung Hubert Tsai , Shaojie Bai , Paul Pu Liang , J Zico Kolter , Louis-Philippe Morency , and Ruslan Salakhutdinov . 2019 . Multimodal transformer for unaligned multimodal language sequences . In Proceedings of the conference. Association for Computational Linguistics. Meeting , Vol. 2019 . NIH Public Access, 6558. Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J Zico Kolter, Louis-Philippe Morency, and Ruslan Salakhutdinov. 2019. Multimodal transformer for unaligned multimodal language sequences. In Proceedings of the conference. Association for Computational Linguistics. Meeting, Vol. 2019. 
NIH Public Access, 6558."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01271"}],"event":{"name":"MM '22: The 30th ACM International Conference on Multimedia","location":"Lisboa Portugal","acronym":"MM '22","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 3rd International on Multimodal Sentiment Analysis Workshop and Challenge"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3551876.3554811","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3551876.3554811","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:17Z","timestamp":1750186817000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3551876.3554811"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,10]]},"references-count":29,"alternative-id":["10.1145\/3551876.3554811","10.1145\/3551876"],"URL":"https:\/\/doi.org\/10.1145\/3551876.3554811","relation":{},"subject":[],"published":{"date-parts":[[2022,10,10]]},"assertion":[{"value":"2022-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}