{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T22:21:29Z","timestamp":1766269289574,"version":"3.41.0"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"9","license":[{"start":{"date-parts":[[2024,8,16]],"date-time":"2024-08-16T00:00:00Z","timestamp":1723766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"crossref","award":["19H05692"],"award-info":[{"award-number":["19H05692"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2024,9,30]]},"abstract":"<jats:p>When individuals communicate, they use different vocabularies, speaking speeds, facial expressions, and gestural languages, depending on those with whom they are speaking. This study focuses on the age of the speaker as a factor that affects the style of communication. We collected a multimodal dialogue corpus with various speaker ages. We used travel as the topic, as it interests people of all ages, and we set up a task based on a tourism consultation between an operator and a customer at a travel agency. This article presents the details of the dialogue task, collection procedures and annotations, and analysis of the characteristics of the dialogues and facial expressions, focusing on the age of the speakers. 
The results of the analysis suggest that the adult speakers have more independent opinions, the older speakers express their opinions more frequently than other age groups, and those in the operator role smile more frequently at minors.<\/jats:p>","DOI":"10.1145\/3675166","type":"journal-article","created":{"date-parts":[[2024,6,26]],"date-time":"2024-06-26T11:25:43Z","timestamp":1719401143000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Travel Agency Task Dialogue Corpus: A Multimodal Dataset with Age-Diverse Speakers"],"prefix":"10.1145","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3190-9044","authenticated-orcid":false,"given":"Michimasa","family":"Inaba","sequence":"first","affiliation":[{"name":"The University of Electro-Communications, Chofu, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1987-4368","authenticated-orcid":false,"given":"Yuya","family":"Chiba","sequence":"additional","affiliation":[{"name":"NTT Corporation, Chiyoda-ku, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-0007-555X","authenticated-orcid":false,"given":"Zhiyang","family":"Qi","sequence":"additional","affiliation":[{"name":"The University of Electro-Communications, Chofu, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6994-3977","authenticated-orcid":false,"given":"Ryuichiro","family":"Higashinaka","sequence":"additional","affiliation":[{"name":"Nagoya University, Nagoya, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6052-600X","authenticated-orcid":false,"given":"Kazunori","family":"Komatani","sequence":"additional","affiliation":[{"name":"Osaka University, Suita, 
Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7271-0253","authenticated-orcid":false,"given":"Yusuke","family":"Miyao","sequence":"additional","affiliation":[{"name":"The University of Tokyo, Bunkyo-ku, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2778-2180","authenticated-orcid":false,"given":"Takayuki","family":"Nagai","sequence":"additional","affiliation":[{"name":"Osaka University, Suita, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,8,16]]},"reference":[{"key":"e_1_3_4_2_2","first-page":"277","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Aubrey Andrew J.","year":"2013","unstructured":"Andrew J. Aubrey, David Marshall, Paul L. Rosin, Jason Vendeventer, Douglas W. Cunningham, and Christian Wallraven. 2013. Cardiff Conversation Database (CCDb): A database of natural dyadic conversations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 277\u2013282."},{"doi-asserted-by":"publisher","key":"e_1_3_4_3_2","DOI":"10.1109\/WACV.2016.7477553"},{"doi-asserted-by":"publisher","key":"e_1_3_4_4_2","DOI":"10.1016\/0004-3702(77)90018-2"},{"doi-asserted-by":"publisher","key":"e_1_3_4_5_2","DOI":"10.18653\/v1\/D18-1547"},{"key":"e_1_3_4_6_2","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1007\/978-3-319-42816-1_6","volume-title":"Multimodal Interaction with W3C Standards","author":"Bunt Harry","year":"2017","unstructured":"Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson. 2017. Dialogue act annotation with the ISO 24617-2 standard. In Multimodal Interaction with W3C Standards. 
Springer, 109\u2013135."},{"doi-asserted-by":"publisher","key":"e_1_3_4_7_2","DOI":"10.1007\/s10579-008-9076-6"},{"doi-asserted-by":"publisher","key":"e_1_3_4_8_2","DOI":"10.1145\/3136755.3136780"},{"doi-asserted-by":"publisher","key":"e_1_3_4_9_2","DOI":"10.1007\/s10579-007-9040-x"},{"key":"e_1_3_4_10_2","first-page":"7521","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Chen Lu","year":"2020","unstructured":"Lu Chen, Boer Lv, Chi Wang, Su Zhu, Bowen Tan, and Kai Yu. 2020. Schema-guided multi-domain dialogue state tracking with graph attention neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 7521\u20137528."},{"key":"e_1_3_4_11_2","first-page":"40","volume-title":"Proceedings of the 2005 International Workshop on Machine Learning for Multimodal Interaction","author":"Chen Lei","year":"2005","unstructured":"Lei Chen, R. Travis Rose, Ying Qiao, Irene Kimbara, Fey Parrill, Haleema Welji, Tony Xu Han, Jilin Tu, Zhongqiang Huang, Mary Harper, Francis Quek, Yingen Xiong, David McNeill, Ronald Tuttle, and Thomas Huang. 2005. VACE multimodal meeting corpus. In Proceedings of the 2005 International Workshop on Machine Learning for Multimodal Interaction. 40\u201351."},{"doi-asserted-by":"publisher","key":"e_1_3_4_12_2","DOI":"10.18653\/v1\/P19-1360"},{"doi-asserted-by":"publisher","key":"e_1_3_4_13_2","DOI":"10.1016\/j.neuroimage.2022.119734"},{"key":"e_1_3_4_14_2","first-page":"5759","volume-title":"Proceedings of the 13th Language Resources and Evaluation Conference","author":"Inaba Michimasa","year":"2022","unstructured":"Michimasa Inaba, Yuya Chiba, Ryuichiro Higashinaka, Kazunori Komatani, Yusuke Miyao, and Takayuki Nagai. 2022. Collection and analysis of travel agency task dialogues with age-diverse speakers. In Proceedings of the 13th Language Resources and Evaluation Conference. 
5759\u20135767."},{"key":"e_1_3_4_15_2","first-page":"I-364\u2013I-367","volume-title":"Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1","author":"Janin Adam","year":"2003","unstructured":"Adam Janin, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, Barbara Peskin, Thilo Pfau, Elizabeth Shriberg, Andreas Stolcke, and C. Wooters. 2003. The ICSI meeting corpus. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1. I-364\u2013I-367."},{"key":"e_1_3_4_16_2","volume-title":"Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH","author":"Kawano Seiya","year":"2022","unstructured":"Seiya Kawano, Muteki Arioka, Akishige Yuguchi, Kenta Yamamoto, Koji Inoue, Tatsuya Kawahara, Satoshi Nakamura, and Koichiro Yoshino. 2022. Multimodal persuasive dialogue corpus using a teleoperated Android. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH\u201922). 2308\u20132312."},{"doi-asserted-by":"publisher","key":"e_1_3_4_17_2","DOI":"10.1109\/ACII52823.2021.9597447"},{"key":"e_1_3_4_18_2","first-page":"166","volume-title":"Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction","author":"McKeown Gary","year":"2015","unstructured":"Gary McKeown, William Curran, Johannes Wagner, Florian Lingenfelser, and Elisabeth Andr\u00e9. 2015. The Belfast storytelling database: A spontaneous social interaction database with laughter focused annotation. In Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction. 
166\u2013172."},{"doi-asserted-by":"publisher","key":"e_1_3_4_19_2","DOI":"10.5555\/1708376.1708393"},{"doi-asserted-by":"publisher","key":"e_1_3_4_20_2","DOI":"10.18653\/v1\/P17-1163"},{"key":"e_1_3_4_21_2","first-page":"973","volume-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"P\u00e9rez-Rosas Ver\u00f3nica","year":"2013","unstructured":"Ver\u00f3nica P\u00e9rez-Rosas, Rada Mihalcea, and Louis-Philippe Morency. 2013. Utterance-level multimodal sentiment analysis. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 973\u2013982."},{"doi-asserted-by":"publisher","key":"e_1_3_4_22_2","DOI":"10.18653\/v1\/P19-1050"},{"doi-asserted-by":"publisher","key":"e_1_3_4_23_2","DOI":"10.1609\/aaai.v34i05.6394"},{"key":"e_1_3_4_24_2","volume-title":"Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH","author":"Raux Antoine","year":"2005","unstructured":"Antoine Raux, Brian Langner, Dan Bohus, Alan W. Black, and Maxine Eskenazi. 2005. Let\u2019s go public! Taking a spoken dialog system to the real world. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH\u201905)."},{"issue":"13","key":"e_1_3_4_25_2","doi-asserted-by":"crossref","first-page":"eadf3197","DOI":"10.1126\/sciadv.adf3197","article-title":"The CANDOR corpus: Insights from a large multimodal dataset of naturalistic conversation","volume":"9","author":"Reece Andrew","year":"2023","unstructured":"Andrew Reece, Gus Cooney, Peter Bull, Christine Chung, Bryn Dawson, Casey Fitzpatrick, Tamara Glazer, Dean Knox, Alex Liebscher, and Sebastian Marin. 2023. The CANDOR corpus: Insights from a large multimodal dataset of naturalistic conversation. 
Science Advances 9, 13 (2023), eadf3197.","journal-title":"Science Advances"},{"key":"e_1_3_4_26_2","first-page":"2517","volume-title":"Proceedings of the 13th Language Resources and Evaluation Conference","author":"Reverdy Justine","year":"2022","unstructured":"Justine Reverdy, Sam O\u2019Connor Russell, Louise Duquenne, Diego Garaialde, Benjamin R. Cowan, and Naomi Harte. 2022. RoomReader: A multimodal corpus of online multiparty conversational interactions. In Proceedings of the 13th Language Resources and Evaluation Conference. 2517\u20132527."},{"unstructured":"Hiroaki Sugiyama Masahiro Mizukami Tsunehiro Arimoto Hiromi Narimatsu Yuya Chiba Hideharu Nakajima and Toyomi Meguro. 2021. Empirical analysis of training strategies of Transformer-based Japanese chit-chat systems. arxiv:2109.05217[cs.CL] (2021).","key":"e_1_3_4_27_2"},{"key":"e_1_3_4_28_2","volume-title":"Proceedings of the 5th International Workshop on Image Analysis for Multimedia Interactive Services","author":"Waibel Alex","year":"2005","unstructured":"Alex Waibel, Hartwig Steusloff, and Rainer Stiefelhagen. 2005. CHIL: Computers in the Human Interaction Loop. In Proceedings of the 5th International Workshop on Image Analysis for Multimedia Interactive Services."},{"key":"e_1_3_4_29_2","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1007\/978-3-031-44699-3_37","volume-title":"Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing","author":"Wang Yueqian","year":"2023","unstructured":"Yueqian Wang, Yuxuan Wang, and Dongyan Zhao. 2023. Overview of the NLPCC 2023 shared task 10: Learn to watch TV: Multimodal dialogue understanding and response generation. In Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing. 
412\u2013419."},{"key":"e_1_3_4_30_2","first-page":"438","volume-title":"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics","volume":"1","author":"Wen T. H.","year":"2017","unstructured":"T. H. Wen, D. Vandyke, N. Mrk\u0161\u00edc, M. Ga\u0161\u00edc, L. M. Rojas-Barahona, P. H. Su, S. Ultes, and S. Young. 2017. A network-based end-to-end trainable task-oriented dialogue system. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Vol. 1. 438\u2013449."},{"doi-asserted-by":"publisher","key":"e_1_3_4_31_2","DOI":"10.1109\/MIS.2013.34"},{"key":"e_1_3_4_32_2","article-title":"MOSI: Multimodal corpus of sentiment intensity and subjectivity analysis in online opinion videos","author":"Zadeh Amir","year":"2016","unstructured":"Amir Zadeh, Rowan Zellers, Eli Pincus, and Louis-Philippe Morency. 2016. MOSI: Multimodal corpus of sentiment intensity and subjectivity analysis in online opinion videos. arXiv preprint arXiv:1606.06259 (2016).","journal-title":"arXiv preprint arXiv:1606.06259"},{"doi-asserted-by":"publisher","key":"e_1_3_4_33_2","DOI":"10.18653\/v1\/P18-1208"},{"key":"e_1_3_4_34_2","first-page":"9604","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Zhang Yichi","year":"2020","unstructured":"Yichi Zhang, Zhijian Ou, and Zhou Yu. 2020. Task-oriented dialog systems that consider multiple appropriate responses under the same context. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 
9604\u20139611."},{"key":"e_1_3_4_35_2","doi-asserted-by":"crossref","first-page":"5699","DOI":"10.18653\/v1\/2022.acl-long.391","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Zhao Jinming","year":"2022","unstructured":"Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, and Haizhou Li. 2022. M3ED: Multi-modal multi-scene multi-label emotional dialogue database. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 5699\u20135710."},{"key":"e_1_3_4_36_2","doi-asserted-by":"crossref","first-page":"1458","DOI":"10.18653\/v1\/P18-1135","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Zhong Victor","year":"2018","unstructured":"Victor Zhong, Caiming Xiong, and Richard Socher. 2018. Global-locally self-attentive encoder for dialogue state tracking. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1458\u20131467."},{"key":"e_1_3_4_37_2","first-page":"713","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing","author":"Zue Victor","year":"1991","unstructured":"Victor Zue, James Glass, David Goodine, Hong Leung, Michael Phillips, Joseph Polifroni, and Stephanie Seneff. 1991. Integration of speech recognition and natural language processing in the MIT VOYAGER system. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. 
713\u2013716."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3675166","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3675166","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:36Z","timestamp":1750291536000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3675166"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,16]]},"references-count":36,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,9,30]]}},"alternative-id":["10.1145\/3675166"],"URL":"https:\/\/doi.org\/10.1145\/3675166","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2024,8,16]]},"assertion":[{"value":"2023-08-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-22","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}