{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T00:56:12Z","timestamp":1760057772284,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,2,20]],"date-time":"2025-02-20T00:00:00Z","timestamp":1740009600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"JSPS KAKENHI","doi-asserted-by":"publisher","award":["22H04860","22K21304","22H00536","23H03506","JPMJCR20G6"],"award-info":[{"award-number":["22H04860","22K21304","22H00536","23H03506","JPMJCR20G6"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"name":"JST AIP Trilateral AI Research, Japan","award":["22H04860","22K21304","22H00536","23H03506","JPMJCR20G6"],"award-info":[{"award-number":["22H04860","22K21304","22H00536","23H03506","JPMJCR20G6"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MTI"],"abstract":"<jats:p>The automatic recognition of user rapport at the dialogue level for multimodal dialogue systems (MDSs) is a critical component of effective dialogue system management. Both dialogue systems and their evaluations need to be grounded in user expressions. Numerous studies have demonstrated that user personalities and demographic data such as age and gender significantly affect user expression. Neglecting users\u2019 personalities and demographic data will therefore result in less accurate recognition of user expressions and rapport. To the best of our knowledge, no existing studies have considered the effects of users\u2019 personalities and demographic data on the automatic recognition of user rapport in MDSs. 
To analyze the influence of users\u2019 personalities and demographic data on dialogue-level user rapport recognition, we first used the Hazumi dataset, an online dataset containing users\u2019 personal information (personality, age, and gender). Based on this dataset, we analyzed the relationship between user rapport in dialogue systems and users\u2019 traits, finding that gender and age significantly influence the recognition of user rapport. These factors could potentially introduce biases into the model. To mitigate the impact of users\u2019 traits, we introduced an adversarial learning-based model. Experimental results showed a significant improvement in user rapport recognition compared to models that do not account for users\u2019 traits. To validate our multimodal modeling approach, we compared it to human perception and instruction-based Large Language Models (LLMs). The results showed that our model outperforms both human perception and instruction-based LLMs.<\/jats:p>","DOI":"10.3390\/mti9030018","type":"journal-article","created":{"date-parts":[[2025,2,20]],"date-time":"2025-02-20T04:53:39Z","timestamp":1740027219000},"page":"18","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Influence of Personality Traits and Demographics on Rapport Recognition Using Adversarial Learning"],"prefix":"10.3390","volume":"9","author":[{"given":"Wenqing","family":"Wei","sequence":"first","affiliation":[{"name":"Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Nomi 923-1292, Japan"}]},{"given":"Sixia","family":"Li","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Nomi 923-1292, Japan"}]},{"given":"Candy Olivia","family":"Mawalim","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, Japan 
Advanced Institute of Science and Technology, Nomi 923-1292, Japan"}]},{"given":"Xiguang","family":"Li","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Nomi 923-1292, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6052-600X","authenticated-orcid":false,"given":"Kazunori","family":"Komatani","sequence":"additional","affiliation":[{"name":"Department of Knowledge Science, Institute of Scientific and Industrial Research, Osaka University, Ibaraki 567-0047, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9260-0403","authenticated-orcid":false,"given":"Shogo","family":"Okada","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Nomi 923-1292, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1145\/3383123","article-title":"Challenges in building intelligent open-domain dialog systems","volume":"38","author":"Huang","year":"2020","journal-title":"ACM Trans. Inf. Syst. (TOIS)"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"11251","DOI":"10.1007\/s10489-022-04067-1","article-title":"Sequential or jumping: Context-adaptive response generation for open-domain dialogue systems","volume":"53","author":"Ling","year":"2022","journal-title":"Appl. Intell."},{"key":"ref_3","unstructured":"Young, T., Xing, F., Pandelea, V., Ni, J., and Cambria, E. (March, January 22). Fusing task-oriented and open-domain dialogues in conversational agents. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Iizuka, S., Mochizuki, S., Ohashi, A., Yamashita, S., Guo, A., and Higashinaka, R. (2023, January 27\u201329). 
Clarifying the Dialogue-Level Performance of GPT-3.5 and GPT-4 in Task-Oriented and Non-Task-Oriented Dialogue Systems. Proceedings of the AAAI Symposium Series, Burlingame, CA, USA.","DOI":"10.1609\/aaaiss.v2i1.27668"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1207\/s15327965pli0104_1","article-title":"The nature of rapport and its nonverbal correlates","volume":"1","author":"Degnen","year":"1990","journal-title":"Psychol. Inq."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"M\u00fcller, P., Huang, M.X., and Bulling, A. (2018, January 27\u201331). Detecting low rapport during natural interactions in small groups from non-verbal behaviour. Proceedings of the 23rd International Conference on Intelligent User Interfaces, Sydney, Australia.","DOI":"10.1145\/3172944.3172969"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1002\/jip.1386","article-title":"The role of rapport in investigative interviewing: A review","volume":"10","author":"Abbe","year":"2013","journal-title":"J. Investig. Psychol. Offender Profiling"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"73024","DOI":"10.1109\/ACCESS.2023.3287984","article-title":"A ranking model for evaluation of conversation partners based on rapport levels","volume":"11","author":"Hayashi","year":"2023","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1109\/TAFFC.2016.2545650","article-title":"Rapport with Virtual Agents: What Do Human Social Cues and Personality Explain?","volume":"8","author":"Cerekovic","year":"2017","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1080\/17439760500510833","article-title":"Positive emotion dispositions differentially associated with Big Five personality and attachment style","volume":"1","author":"Shiota","year":"2006","journal-title":"J. Posit. 
Psychol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.1016\/S0191-8869(01)00087-3","article-title":"Associations among the Big Five, emotional responses, and coping with acute stress","volume":"32","author":"Penley","year":"2002","journal-title":"Personal. Individ. Differ."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wei, Z., Liu, Q., Peng, B., Tou, H., Chen, T., Huang, X.J., Wong, K.F., and Dai, X. (2018, January 15\u201320). Task-oriented dialogue system for automatic diagnosis. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia. Short Papers.","DOI":"10.18653\/v1\/P18-2033"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1007\/s11431-020-1692-3","article-title":"Recent advances and challenges in task-oriented dialog systems","volume":"63","author":"Zhang","year":"2020","journal-title":"Sci. China Technol. Sci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"110069","DOI":"10.1016\/j.knosys.2022.110069","article-title":"Multi-task learning with graph attention networks for multi-domain task-oriented dialogue systems","volume":"259","author":"Zhao","year":"2023","journal-title":"Knowl.-Based Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kobyashi, S., and Hagiwara, M. (2016). Non-task-oriented dialogue system considering user\u2019s preference and human relations. Trans. Jpn. Soc. Artif. Intell., 31.","DOI":"10.1527\/tjsai.DSF-502"},{"key":"ref_16","first-page":"14","article-title":"Constructing a non-task-oriented dialogue agent using statistical response method and gamification","volume":"Volume 2","author":"Inaba","year":"2014","journal-title":"Proceedings of the International Conference on Agents and Artificial Intelligence"},{"key":"ref_17","unstructured":"Jekosch, U. (2006). 
Voice and Speech Quality Perception: Assessment and Evaluation, Springer Science & Business Media."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Engelbrecht, K.P., G\u00f6dde, F., Hartard, F., Ketabdar, H., and M\u00f6ller, S. (2009, January 11\u201312). Modeling user satisfaction with hidden Markov models. Proceedings of the SIGDIAL 2009 Conference, London, UK.","DOI":"10.3115\/1708376.1708402"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ultes, S. (2020). Improving interaction quality estimation with BiLSTMs and the impact on dialogue policy learning. arXiv.","DOI":"10.18653\/v1\/W19-5902"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Hirano, Y., Okada, S., Nishimoto, H., and Komatani, K. (2019, January 14\u201318). Multitask prediction of exchange-level annotations for multimodal dialogue systems. Proceedings of the 2019 International Conference on Multimodal Interaction, Suzhou, China.","DOI":"10.1145\/3340555.3353730"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kim, T.E., and Lipani, A. (2022, January 11\u201315). A multi-task based neural model to simulate users in goal oriented dialogue systems. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain.","DOI":"10.1145\/3477495.3531814"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.specom.2015.06.003","article-title":"Interaction quality: Assessing the quality of ongoing spoken dialog interaction by experts\u2014And how it relates to user satisfaction","volume":"74","author":"Schmitt","year":"2015","journal-title":"Speech Commun."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wei, W., Li, S., Okada, S., and Komatani, K. (2021, January 18\u201322). Multimodal user satisfaction recognition for non-task oriented dialogue systems. 
Proceedings of the 2021 International Conference on Multimodal Interaction, Montr\u00e9al, QC, Canada.","DOI":"10.1145\/3462244.3479928"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Bodigutla, P.K., Tiwari, A., Vargas, J.V., Polymenakos, L., and Matsoukas, S. (2020). Joint turn and dialogue-level user satisfaction estimation on multi-domain conversations. arXiv.","DOI":"10.18653\/v1\/2020.findings-emnlp.347"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Mendon\u00e7a, J., Trancoso, I., and Lavie, A. (2024). Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs. arXiv.","DOI":"10.18653\/v1\/2024.findings-emnlp.684"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Mendon\u00e7a, J., Trancoso, I., and Lavie, A. (2024). ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues. arXiv.","DOI":"10.18653\/v1\/2024.sigdial-1.44"},{"key":"ref_27","first-page":"46595","article-title":"Judging llm-as-a-judge with mt-bench and chatbot arena","volume":"36","author":"Zheng","year":"2023","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Laskar, M.T.R., Alqahtani, S., Bari, M.S., Rahman, M., Khan, M.A.M., Khan, H., Jahan, I., Bhuiyan, A., Tan, C.W., and Parvez, M.R. (2024, January 12\u201316). A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA.","DOI":"10.18653\/v1\/2024.emnlp-main.764"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Meng, Z., Li, J., and Gong, Y. (2019, January 12\u201317). Adversarial speaker adaptation. 
Proceedings of the ICASSP 2019\u20142019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK.","DOI":"10.1109\/ICASSP.2019.8682510"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gao, Y., Okada, S., Wang, L., Liu, J., and Dang, J. (2022, January 22\u201327). Domain-invariant feature learning for cross corpus speech emotion recognition. Proceedings of the ICASSP 2022\u20142022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9747129"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Komatani, K., and Okada, S. (October, January 28). Multimodal human-agent dialogue corpus with annotations at utterance and dialogue levels. Proceedings of the 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), Nara, Japan.","DOI":"10.1109\/ACII52823.2021.9597447"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Katada, S., Okada, S., and Komatani, K. (2022, January 7\u201311). Transformer-based physiological feature learning for multimodal analysis of self-reported sentiment. Proceedings of the 2022 International Conference on Multimodal Interaction, Bengaluru, India.","DOI":"10.1145\/3536221.3556576"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wei, W., Li, S., and Okada, S. (2022, January 7\u201311). Investigating the relationship between dialogue and exchange-level impression. Proceedings of the 2022 International Conference on Multimodal Interaction, Bengaluru, India.","DOI":"10.1145\/3536221.3556602"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Lee, A., Oura, K., and Tokuda, K. (2013, January 26\u201331). MMDAgent\u2014A fully open-source toolkit for voice interaction systems. 
Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada.","DOI":"10.1109\/ICASSP.2013.6639300"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Komatani, K., Takeda, R., and Okada, S. (2023, January 11\u201315). Analyzing differences in subjective annotations by participants and third-party annotators in multimodal dialogue corpus. Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Prague, Czechia.","DOI":"10.18653\/v1\/2023.sigdial-1.9"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1037\/0022-3514.71.1.110","article-title":"Dyad rapport and the accuracy of its judgment across situations: A lens model analysis","volume":"71","author":"Bernieri","year":"1996","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"40","DOI":"10.2132\/personality.21.40","article-title":"Development, reliability, and validity of the Japanese version of Ten Item Personality Inventory (TIPI-J)","volume":"21","author":"Oshio","year":"2012","journal-title":"Jpn. J. Personal.\/Pasonariti Kenkyu"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1109\/TAFFC.2015.2457417","article-title":"The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing","volume":"7","author":"Eyben","year":"2015","journal-title":"IEEE Trans. Affect. Comput."},{"key":"ref_39","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Baltrusaitis, T., Zadeh, A., Lim, Y.C., and Morency, L.P. (2018, January 15\u201319). Openface 2.0: Facial behavior analysis toolkit. 
Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi\u2019an, China.","DOI":"10.1109\/FG.2018.00019"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1093\/oso\/9780195169157.003.0014","article-title":"Observer-based measurement of facial expression with the Facial Action Coding System","volume":"1","author":"Cohn","year":"2007","journal-title":"Handb. Emot. Elicitation Assess."},{"key":"ref_42","unstructured":"Ganin, Y., and Lempitsky, V. (2015). Unsupervised Domain Adaptation by Backpropagation. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Latif, N., Barbosa, A.V., Vatiokiotis-Bateson, E., Castelhano, M.S., and Munhall, K. (2014). Movement coordination during conversation. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0105036"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1111\/j.1468-2958.1981.tb00564.x","article-title":"Talk and silence sequences in informal conversations III: Interspeaker influence","volume":"7","author":"Cappella","year":"1981","journal-title":"Hum. Commun. Res."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1037\/0022-3514.49.6.1513","article-title":"Emotional reactions to a political leader\u2019s expressive displays","volume":"49","author":"McHugo","year":"1985","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"2382","DOI":"10.1121\/1.2178720","article-title":"On phonetic convergence during conversational interaction","volume":"119","author":"Pardo","year":"2006","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ivanov, A.V., Riccardi, G., Sporka, A.J., and Franc, J. (2011, January 27\u201331). Recognition of personality traits from human spoken conversations. 
Proceedings of the Twelfth annual Conference of the International Speech Communication Association, Florence, Italy.","DOI":"10.21437\/Interspeech.2011-467"}],"container-title":["Multimodal Technologies and Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2414-4088\/9\/3\/18\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:38:37Z","timestamp":1760027917000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2414-4088\/9\/3\/18"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,20]]},"references-count":47,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["mti9030018"],"URL":"https:\/\/doi.org\/10.3390\/mti9030018","relation":{},"ISSN":["2414-4088"],"issn-type":[{"type":"electronic","value":"2414-4088"}],"subject":[],"published":{"date-parts":[[2025,2,20]]}}}