{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:25:34Z","timestamp":1750220734194,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,21]],"date-time":"2020-10-21T00:00:00Z","timestamp":1603238400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,21]]},"DOI":"10.1145\/3382507.3418830","type":"proceedings-article","created":{"date-parts":[[2020,10,22]],"date-time":"2020-10-22T10:04:35Z","timestamp":1603361075000},"page":"343-350","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["LDNN: Linguistic Knowledge Injectable Deep Neural Network for Group Cohesiveness Understanding"],"prefix":"10.1145","author":[{"given":"Yanan","family":"Wang","sequence":"first","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianming","family":"Wu","sequence":"additional","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinfa","family":"Huang","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gen","family":"Hattori","sequence":"additional","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yasuhiro","family":"Takishima","sequence":"additional","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, 
Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shinya","family":"Wada","sequence":"additional","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rui","family":"Kimura","sequence":"additional","affiliation":[{"name":"KDDI Research, Inc., Fujimino-shi, Saitama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Chen","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Satoshi","family":"Kurihara","sequence":"additional","affiliation":[{"name":"Keio University, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,22]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"crossref","unstructured":"Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould and Lei Zhang. 2018. Bottom-up and top-down attention for image captioning and visual question answering. In CVPR.  Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould and Lei Zhang. 2018. Bottom-up and top-down attention for image captioning and visual question answering. In CVPR.","DOI":"10.1109\/CVPR.2018.00636"},{"volume-title":"Convolutional Image Captioning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Aneja Jyoti","key":"e_1_3_2_2_2_1","unstructured":"Jyoti Aneja , Aditya Deshpande , and Alexander G. Schwing . 2018 . Convolutional Image Captioning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Jyoti Aneja, Aditya Deshpande, and Alexander G. Schwing. 2018. Convolutional Image Captioning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_2_3_1","volume-title":"Vqa: Visual question answering. 
In ICCV.","author":"Antol Stanislaw","year":"2015","unstructured":"Stanislaw Antol , Aishwarya Agrawal , Jiasen Lu , Margaret Mitchell , Dhruv Batra , C Lawrence Zitnick , and Devi Parikh . 2015 . Vqa: Visual question answering. In ICCV. Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C Lawrence Zitnick, and Devi Parikh. 2015. Vqa: Visual question answering. In ICCV."},{"key":"e_1_3_2_2_4_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. In NAACL.","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . Bert: Pre-training of deep bidirectional transformers for language understanding. In NAACL. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. In NAACL."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3340555.3355710"},{"volume-title":"The more the merrier: Analysing the affect of a group of people in images","author":"Dhall Abhinav","key":"e_1_3_2_2_6_1","unstructured":"Abhinav Dhall , Jyoti Joshi , Karan Sikka , Roland Goecke , and Nicu Sebe . 2015. The more the merrier: Analysing the affect of a group of people in images . In FG. IEEE. Abhinav Dhall, Jyoti Joshi, Karan Sikka, Roland Goecke, and Nicu Sebe. 2015. The more the merrier: Analysing the affect of a group of people in images. In FG. IEEE."},{"key":"e_1_3_2_2_7_1","volume-title":"What do we perceive in a glance of a real-world scene? Journal of vision","author":"Fei-Fei Li","year":"2007","unstructured":"Li Fei-Fei , Asha Iyer , Christof Koch , and Pietro Perona . 2007. What do we perceive in a glance of a real-world scene? Journal of vision ( 2007 ). Li Fei-Fei, Asha Iyer, Christof Koch, and Pietro Perona. 2007. What do we perceive in a glance of a real-world scene? 
Journal of vision (2007)."},{"volume-title":"Robotics Research","author":"Fong Terrence","key":"e_1_3_2_2_8_1","unstructured":"Terrence Fong , Charles Thorpe , and Charles Baur . 2003. Collaboration , dialogue, human-robot interaction . In Robotics Research . Springer . Terrence Fong, Charles Thorpe, and Charles Baur. 2003. Collaboration, dialogue, human-robot interaction. In Robotics Research. Springer."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2019.8852184"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3340555.3355712"},{"key":"e_1_3_2_2_11_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.  Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR."},{"key":"e_1_3_2_2_12_1","unstructured":"Geoffrey Hinton Oriol Vinyals and Jeff Dean. 2015. Distilling the knowledge in a neural network. In NIPS.  Geoffrey Hinton Oriol Vinyals and Jeff Dean. 2015. Distilling the knowledge in a neural network. In NIPS."},{"key":"e_1_3_2_2_13_1","volume-title":"Laurens Van Der Maaten, and Kilian Q Weinberger","author":"Huang Gao","year":"2017","unstructured":"Gao Huang , Zhuang Liu , Laurens Van Der Maaten, and Kilian Q Weinberger . 2017 . Densely connected convolutional networks. In CVPR. Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. 2017. Densely connected convolutional networks. In CVPR."},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"crossref","unstructured":"Xin Huang Yuxin Peng and Mingkuan Yuan. 2017. Cross-modal common representation learning by hybrid transfer network. In IJCAI.  Xin Huang Yuxin Peng and Mingkuan Yuan. 2017. Cross-modal common representation learning by hybrid transfer network. 
In IJCAI.","DOI":"10.24963\/ijcai.2017\/263"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2010.2055233"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"crossref","unstructured":"Xiuyi Jia Xiang Zheng Weiwei Li Changqing Zhang and Zechao Li. 2019. Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally. In CVPR.  Xiuyi Jia Xiang Zheng Weiwei Li Changqing Zhang and Zechao Li. 2019. Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally. In CVPR.","DOI":"10.1109\/CVPR.2019.01007"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_2_18_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In NIPS.  Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In NIPS."},{"key":"e_1_3_2_2_19_1","volume-title":"The role of language in emotion: Predictions from psychological constructionism. Frontiers in Psychology","author":"Lindquist Kristen A","year":"2015","unstructured":"Kristen A Lindquist , Jennifer K MacCormack , and Holly Shablack . 2015. The role of language in emotion: Predictions from psychological constructionism. Frontiers in Psychology ( 2015 ). Kristen A Lindquist, Jennifer K MacCormack, and Holly Shablack. 2015. The role of language in emotion: Predictions from psychological constructionism. Frontiers in Psychology (2015)."},{"key":"e_1_3_2_2_20_1","volume-title":"Paul Pu Liang, AmirAli Bagher Zadeh, and Louis-Philippe Morency.","author":"Liu Zhun","year":"2018","unstructured":"Zhun Liu , Ying Shen , Varun Bharadhwaj Lakshminarasimhan , Paul Pu Liang, AmirAli Bagher Zadeh, and Louis-Philippe Morency. 2018 . Efficient Low-rank Multimodal Fusion With Modality-Specific Factors. In ACL. 
Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, AmirAli Bagher Zadeh, and Louis-Philippe Morency. 2018. Efficient Low-rank Multimodal Fusion With Modality-Specific Factors. In ACL."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Pranav Rajpurkar Robin Jia and Percy Liang. 2018. Know What You Don't Know: Unanswerable Questions for SQuAD. In ACL.  Pranav Rajpurkar Robin Jia and Percy Liang. 2018. Know What You Don't Know: Unanswerable Questions for SQuAD. In ACL.","DOI":"10.18653\/v1\/P18-2124"},{"key":"e_1_3_2_2_22_1","volume-title":"Model evaluation, model selection, and algorithm selection in machine learning. arXiv preprint arXiv:1811.12808","author":"Raschka Sebastian","year":"2018","unstructured":"Sebastian Raschka . 2018. Model evaluation, model selection, and algorithm selection in machine learning. arXiv preprint arXiv:1811.12808 ( 2018 ). Sebastian Raschka. 2018. Model evaluation, model selection, and algorithm selection in machine learning. arXiv preprint arXiv:1811.12808 (2018)."},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACIIW.2019.8925231"},{"key":"e_1_3_2_2_24_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR.  Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR."},{"key":"e_1_3_2_2_25_1","volume-title":"Vl-bert: Pre-training of generic visual-linguistic representations. In ICLR.","author":"Su Weijie","year":"2019","unstructured":"Weijie Su , Xizhou Zhu , Yue Cao , Bin Li , Lewei Lu , Furu Wei , and Jifeng Dai . 2019 . Vl-bert: Pre-training of generic visual-linguistic representations. In ICLR. Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, and Jifeng Dai. 2019. Vl-bert: Pre-training of generic visual-linguistic representations. 
In ICLR."},{"key":"e_1_3_2_2_26_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In NIPS.  Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In NIPS."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"crossref","unstructured":"Oriol Vinyals Alexander Toshev Samy Bengio and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In CVPR.  Oriol Vinyals Alexander Toshev Samy Bengio and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In CVPR.","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"crossref","unstructured":"Yanan Wang Jianming Wu and Keiichiro Hoashi. 2019. Multi-Attention Fusion Network for Video-Based Emotion Recognition. In ICMI.  Yanan Wang Jianming Wu and Keiichiro Hoashi. 2019. Multi-Attention Fusion Network for Video-Based Emotion Recognition. In ICMI.","DOI":"10.1145\/3340555.3355720"},{"key":"e_1_3_2_2_29_1","volume-title":"Group-Level Cohesion Prediction Using Deep Learning Models with A Multi-Stream Hybrid Network. In 2019 International Conference on Multimodal Interaction (ICMI '19)","author":"Dang Tien Xuan","year":"2019","unstructured":"Tien Xuan Dang , Soo-Hyung Kim , Hyung-Jeong Yang , Guee-Sang Lee , and Thanh-Hung Vo . 2019 . Group-Level Cohesion Prediction Using Deep Learning Models with A Multi-Stream Hybrid Network. In 2019 International Conference on Multimodal Interaction (ICMI '19) . Tien Xuan Dang, Soo-Hyung Kim, Hyung-Jeong Yang, Guee-Sang Lee, and Thanh-Hung Vo. 2019. Group-Level Cohesion Prediction Using Deep Learning Models with A Multi-Stream Hybrid Network. 
In 2019 International Conference on Multimodal Interaction (ICMI '19)."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"crossref","unstructured":"An Yang Quan Wang Jing Liu Kai Liu Yajuan Lyu Hua Wu Qiaoqiao She and Sujian Li. 2019. Enhancing pre-trained language representations with rich knowledge for machine reading comprehension. In ACL.  An Yang Quan Wang Jing Liu Kai Liu Yajuan Lyu Hua Wu Qiaoqiao She and Sujian Li. 2019. Enhancing pre-trained language representations with rich knowledge for machine reading comprehension. In ACL.","DOI":"10.18653\/v1\/P19-1226"},{"key":"e_1_3_2_2_31_1","volume-title":"Navonil Mazumder, Soujanya Poria, Erik Cambria, and Louis-Philippe Morency.","author":"Zadeh Amir","year":"2018","unstructured":"Amir Zadeh , Paul Pu Liang , Navonil Mazumder, Soujanya Poria, Erik Cambria, and Louis-Philippe Morency. 2018 . Memory fusion network for multi-view sequential learning. In AAAI. Amir Zadeh, Paul Pu Liang, Navonil Mazumder, Soujanya Poria, Erik Cambria, and Louis-Philippe Morency. 2018. Memory fusion network for multi-view sequential learning. In AAAI."},{"key":"e_1_3_2_2_32_1","volume-title":"Soujanya Poria, Erik Cambria, and Louis-Philippe Morency.","author":"Zadeh Amir","year":"2018","unstructured":"Amir Zadeh , Paul Pu Liang , Soujanya Poria, Erik Cambria, and Louis-Philippe Morency. 2018 . Multimodal language analysis in the wild: Cmu-mosei dataset and interpretable dynamic fusion graph. In ACL. Amir Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, and Louis-Philippe Morency. 2018. Multimodal language analysis in the wild: Cmu-mosei dataset and interpretable dynamic fusion graph. In ACL."},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"crossref","unstructured":"Yue Zheng Yali Li and Shengjin Wang. 2019. Intention Oriented Image Captions With Guiding Objects. In CVPR.  Yue Zheng Yali Li and Shengjin Wang. 2019. Intention Oriented Image Captions With Guiding Objects. 
In CVPR.","DOI":"10.1109\/CVPR.2019.00859"},{"key":"e_1_3_2_2_34_1","volume-title":"Automatic Group Cohesiveness Detection With Multi-Modal Features. In 2019 International Conference on Multimodal Interaction (ICMI '19)","author":"Zhu Bin","year":"2019","unstructured":"Bin Zhu , Xin Guo , Kenneth Barner , and Charles Boncelet . 2019 . Automatic Group Cohesiveness Detection With Multi-Modal Features. In 2019 International Conference on Multimodal Interaction (ICMI '19) . Bin Zhu, Xin Guo, Kenneth Barner, and Charles Boncelet. 2019. Automatic Group Cohesiveness Detection With Multi-Modal Features. In 2019 International Conference on Multimodal Interaction (ICMI '19)."}],"event":{"name":"ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"location":"Virtual Event Netherlands","acronym":"ICMI '20"},"container-title":["Proceedings of the 2020 International Conference on Multimodal Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3382507.3418830","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3382507.3418830","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:27Z","timestamp":1750199907000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3382507.3418830"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,21]]},"references-count":34,"alternative-id":["10.1145\/3382507.3418830","10.1145\/3382507"],"URL":"https:\/\/doi.org\/10.1145\/3382507.3418830","relation":{},"subject":[],"published":{"date-parts":[[2020,10,21]]},"assertion":[{"value":"2020-10-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}