{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T10:23:31Z","timestamp":1779359011272,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":53,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["CNS-1629898"],"award-info":[{"award-number":["CNS-1629898"]}]},{"name":"Center of Imaging, Acoustics, and Perception Science (CIAPS) of Binghamton University."}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,12]]},"DOI":"10.1145\/3394171.3413538","type":"proceedings-article","created":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T13:10:18Z","timestamp":1602508218000},"page":"2982-2990","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Adaptive Multimodal Fusion for Facial Action Units Recognition"],"prefix":"10.1145","author":[{"given":"Huiyuan","family":"Yang","sequence":"first","affiliation":[{"name":"State University of New York at Binghamton, Binghamton, NY, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Taoyue","family":"Wang","sequence":"additional","affiliation":[{"name":"State University of New York at Binghamton, Binghamton, NY, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lijun","family":"Yin","sequence":"additional","affiliation":[{"name":"State University of New York at Binghamton, Binghamton, NY, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00126"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"crossref","unstructured":"Tadas Baltru\u0161aitis Chaitanya Ahuja and Louis-Philippe Morency. 2018. Multimodal machine learning: A survey and taxonomy. IEEE transactions on pattern analysis and machine intelligence Vol. 41 2 (2018) 423--443.","DOI":"10.1109\/TPAMI.2018.2798607"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2017.86"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-74161-1_41"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01079"},{"key":"e_1_3_2_2_6_1","unstructured":"Wen-Sheng Chu Fernando De la Torre Frade and Jeffrey Cohn. 2017. Learning Spatial and Temporal Cues for Multi-label Facial Action Unit Detection. In Automatic Face and Gesture Recognition (FG)."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01258-8_19"},{"key":"e_1_3_2_2_8_1","unstructured":"Terrance DeVries and Graham W Taylor. 2017. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)."},{"key":"e_1_3_2_2_9_1","unstructured":"Yaroslav Ganin and Victor Lempitsky. 2014. 
Unsupervised domain adaptation by backpropagation. arXiv preprint arXiv:1409.7495 (2014)."},{"key":"e_1_3_2_2_10_1","first-page":"6","article-title":"Real-time tracking via on-line boosting","volume":"1","author":"Grabner Helmut","year":"2006","journal-title":"BMVC"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2015.7284873"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2830661"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"crossref","unstructured":"Yipeng Hu Marc Modat Eli Gibson Wenqi Li Nooshin Ghavami Ester Bonmati Guotai Wang Steven Bandula Caroline M Moore Mark Emberton et al. 2018. Weakly-supervised convolutional neural networks for multimodal image registration. Medical image analysis Vol. 49 (2018) 1--13.","DOI":"10.1016\/j.media.2018.07.002"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2733373.2806293"},{"key":"e_1_3_2_2_16_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 88--95","author":"Irani Ramin","year":"2015"},{"key":"e_1_3_2_2_17_1","unstructured":"Eric Jang Shixiang Gu and Ben Poole. 2016. Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144 (2016)."},{"key":"e_1_3_2_2_18_1","unstructured":"Diederik P Kingma and Max Welling. 2013. 
Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2019.8756629"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018594"},{"key":"e_1_3_2_2_21_1","unstructured":"Huibin Li Huaxiong Ding Di Huang Yunhong Wang Xi Zhao Jean-Marie Morvan and Liming Chen. 2015. An efficient multimodal 2D"},{"key":"e_1_3_2_2_22_1","volume-title":"Computer Vision and Image Understanding","volume":"140","year":"2015"},{"key":"e_1_3_2_2_23_1","unstructured":"Huibin Li Jian Sun Zongben Xu and Liming Chen. 2017c. Multimodal 2D"},{"key":"e_1_3_2_2_24_1","volume-title":"IEEE Transactions on Multimedia","volume":"19","year":"2017"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.716"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"crossref","unstructured":"Wei Li Farnaz Abtahi Zhigang Zhu and Lijun Yin. 2017b. EAC-Net: A Region-based Deep Enhancing and Cropping Approach for Facial Action Unit Detection. In Automatic Face and Gesture Recognition (FG).","DOI":"10.1109\/FG.2017.136"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"crossref","unstructured":"Wei Li Farnaz Abtahi Zhigang Zhu and Lijun Yin. 2018. 
EAC-Net: Deep Nets with Enhancing and Cropping for Facial Action Unit Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2018).","DOI":"10.1109\/TPAMI.2018.2791608"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2019.00235"},{"key":"e_1_3_2_2_29_1","unstructured":"Chris J Maddison Andriy Mnih and Yee Whye Teh. 2016. The concrete distribution: A continuous relaxation of discrete random variables. arXiv preprint arXiv:1611.00712 (2016)."},{"key":"e_1_3_2_2_30_1","unstructured":"Jiquan Ngiam Aditya Khosla Mingyu Kim Juhan Nam Honglak Lee and Andrew Y Ng. 2011. Multimodal deep learning. ICML (2011)."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01219"},{"key":"e_1_3_2_2_32_1","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga et al. 2019. PyTorch: An imperative style high-performance deep learning library. In Advances in Neural Information Processing Systems. 8024--8035."},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_43"},{"key":"e_1_3_2_2_34_1","unstructured":"Zhiwen Shao Zhilei Liu Jianfei Cai Yunsheng Wu and Lizhuang Ma. 2019. Facial action unit detection using attention and relation learning. IEEE Transactions on Affective Computing (2019).
"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2872503"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910377533"},{"key":"e_1_3_2_2_37_1","unstructured":"Yao-Hung Hubert Tsai Paul Pu Liang Amir Zadeh Louis-Philippe Morency and Ruslan Salakhutdinov. 2019. Learning factorized multimodal representations. ICLR (2019)."},{"key":"e_1_3_2_2_38_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998--6008."},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2013.11.007"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3136755.3143011"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240508.3240613"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2967295"},{"key":"e_1_3_2_2_43_1","volume-title":"International conference on machine learning. 2048--2057","author":"Xu Kelvin","year":"2015"},{"key":"e_1_3_2_2_44_1","volume-title":"Learning Temporal Information From A Single Image For AU Detection. 
In 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG","author":"Yang Huiyuan","year":"2019"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00612"},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"crossref","unstructured":"Amir Zadeh Minghai Chen Soujanya Poria Erik Cambria and Louis-Philippe Morency. 2017. Tensor fusion network for multimodal sentiment analysis. EMNLP (2017).","DOI":"10.18653\/v1\/D17-1115"},{"key":"e_1_3_2_2_47_1","unstructured":"Hongyi Zhang Moustapha Cisse Yann N Dauphin and David Lopez-Paz. 2017. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)."},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2015.04.012"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00245"},{"key":"e_1_3_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2014.06.002"},{"key":"e_1_3_2_2_51_1","volume-title":"Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Zhang Zheng","year":"2016"},{"key":"e_1_3_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298833"},{"key":"e_1_3_2_2_53_1","unstructured":"Jun-Yan Zhu Richard Zhang Deepak Pathak Trevor Darrell Alexei A Efros Oliver Wang and Eli Shechtman. 2017. Toward multimodal image-to-image translation. 
In Advances in neural information processing systems. 465--476."}],"event":{"name":"MM '20: The 28th ACM International Conference on Multimedia","location":"Seattle WA USA","acronym":"MM '20","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 28th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3413538","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394171.3413538","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:13Z","timestamp":1750193233000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3413538"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":53,"alternative-id":["10.1145\/3394171.3413538","10.1145\/3394171"],"URL":"https:\/\/doi.org\/10.1145\/3394171.3413538","relation":{},"subject":[],"published":{"date-parts":[[2020,10,12]]},"assertion":[{"value":"2020-10-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}