{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T02:28:33Z","timestamp":1772591313546,"version":"3.50.1"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,4,17]],"date-time":"2025-04-17T00:00:00Z","timestamp":1744848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62172226"],"award-info":[{"award-number":["62172226"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62272230"],"award-info":[{"award-number":["62272230"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Ministry of Education, Singapore, under its AcRF Tier 2 Funding","award":["T2EP20123-0052"],"award-info":[{"award-number":["T2EP20123-0052"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2025,4,30]]},"abstract":"<jats:p>Task-oriented conversational agents strive to aid users across various tasks by concentrating on generating suitable responses to guarantee successful task accomplishment. Nonetheless, several factors have a substantial influence on user contentment beyond task fulfillment, requiring further investigation. Within this work, we aim to analyze diverse behavioral patterns of conversational agents with the goal of enhancing user satisfaction. Our findings lead to the exploration of three different enriched response generation schemes: EnRG-ATT, EnRG-TIP, and EnRG-SIM. 
Specifically, EnRG-ATT is designed to integrate the model's capabilities with a dual attention mechanism across two distinct modalities of external resources. It employs a pair of gates to regulate the utilization of such sources efficiently. More elegantly, we introduce EnRG-TIP, which simplifies response enrichment as a sequence prediction problem and exploits the pre-trained language model to capture user tips related to the conversation. Moreover, building on the efficiency of grounding on similar responses, EnRG-SIM further enhances response generation by inserting similar responses into the training sequences, to direct the pre-trained model's attention towards this additional knowledge. Our comprehensive experiments demonstrate that our three proposed methods not only achieve good task completion but also generate responses that yield higher user satisfaction.<\/jats:p>","DOI":"10.1145\/3714474","type":"journal-article","created":{"date-parts":[[2025,1,22]],"date-time":"2025-01-22T14:28:08Z","timestamp":1737556088000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Enriching Responses with Crowd-Sourced Knowledge for Task-Oriented Conversational Agents"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-0643-3582","authenticated-orcid":false,"given":"Zhaohui","family":"Wei","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9973-3305","authenticated-orcid":false,"given":"Lizi","family":"Liao","sequence":"additional","affiliation":[{"name":"School of Computing and Information Systems, Singapore Management University, Singapore, 
Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2344-6174","authenticated-orcid":false,"given":"Xinguang","family":"Xiang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4641-1994","authenticated-orcid":false,"given":"Xiaoyu","family":"Du","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,4,17]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"5016","article-title":"MultiWOZ-A large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling","author":"Budzianowski Pawe\u0142","year":"2018","unstructured":"Pawe\u0142 Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, I\u00f1igo Casanueva, Stefan Ultes, Osman Ramadan, and Milica Gasic. 2018. MultiWOZ-A large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling. In EMNLP, 5016\u20135026.","journal-title":"EMNLP"},{"key":"e_1_3_1_3_2","first-page":"3696","article-title":"Semantically conditioned dialog response generation via hierarchical disentangled self-attention","author":"Chen Wenhu","year":"2019","unstructured":"Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, and William Yang Wang. 2019. Semantically conditioned dialog response generation via hierarchical disentangled self-attention. 
In ACL, 3696\u20133709.","journal-title":"ACL"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3606368"},{"key":"e_1_3_1_5_2","first-page":"183","article-title":"Few-shot NLG with pre-trained language model","author":"Chen Zhiyu","year":"2020","unstructured":"Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, and William Yang Wang. 2020. Few-shot NLG with pre-trained language model. In ACL, 183\u2013190.","journal-title":"ACL"},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","unstructured":"Zhiyu Chen Bing Liu Seungwhan Moon Chinnadhurai Sankar Paul Crook and William Yang Wang. 2022. KETOD: Knowledge-enriched task-oriented dialogue. arXiv:2205.05589. Retrieved from https:\/\/arxiv.org\/abs\/2205.05589","DOI":"10.18653\/v1\/2022.findings-naacl.197"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3291060"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3507782"},{"key":"e_1_3_1_9_2","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL, 4171\u20134186.","journal-title":"NAACL"},{"key":"e_1_3_1_10_2","unstructured":"Mihail Eric Rahul Goel Shachi Paul Adarsh Kumar Abhishek Sethi Peter Ku Anuj Kumar Goyal Sanchit Agarwal Shuyang Gao and Dilek Hakkani-Tur. 2019. MultiWOZ 2.1: A consolidated multi-domain dialogue dataset with state corrections and state tracking baselines. arXiv:1907.01669. Retrieved from https:\/\/arxiv.org\/abs\/1907.01669"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3432689"},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","unstructured":"Marjan Ghazvininejad Chris Brockett Ming-Wei Chang Bill Dolan Jianfeng Gao Wen-tau Yih and Michel Galley. 2017. 
A knowledge-grounded neural conversation model. arXiv:1702.01932. Retrieved from https:\/\/arxiv.org\/abs\/1702.01932","DOI":"10.1609\/aaai.v32i1.11977"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3475959.3485392"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_1_15_2","first-page":"1","article-title":"A simple language model for task-oriented dialogue","author":"Hosseini-Asl Ehsan","year":"2020","unstructured":"Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu, Semih Yavuz, and Richard Socher. 2020. A simple language model for task-oriented dialogue. In NeurIPS, 1\u201313.","journal-title":"NeurIPS"},{"key":"e_1_3_1_16_2","first-page":"2879","volume-title":"WWW","author":"Jiang Shaojie","year":"2019","unstructured":"Shaojie Jiang, Pengjie Ren, Christof Monz, and Maarten de Rijke. 2019. Improving neural response diversity with frequency-aware cross-entropy loss. In WWW, 2879\u20132885."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.527"},{"key":"e_1_3_1_18_2","first-page":"1188","volume-title":"ICML","author":"Le Quoc","year":"2014","unstructured":"Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In ICML, 1188\u20131196."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1128"},{"key":"e_1_3_1_20_2","unstructured":"Jiwei Li Michel Galley Chris Brockett Jianfeng Gao and Bill Dolan. 2016. A diversity-promoting objective function for neural conversation models. arXiv:1510.03055. Retrieved from https:\/\/arxiv.org\/abs\/1510.03055"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462970"},{"key":"e_1_3_1_22_2","first-page":"74","article-title":"Rouge: A package for automatic evaluation of summaries","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. 
In Text Summarization Branches Out, 74\u201381.","journal-title":"Text Summarization Branches Out"},{"key":"e_1_3_1_23_2","unstructured":"Phillip Lippe Pengjie Ren Hinda Haned Bart Voorn and Maarten de Rijke. 2020. Diversifying task-oriented dialogue response generation with prototype guided paraphrasing. arXiv:2008.03391. Retrieved from https:\/\/arxiv.org\/abs\/2008.03391"},{"key":"e_1_3_1_24_2","first-page":"4881","article-title":"Table-to-text generation by structure-aware Seq2Seq learning","author":"Liu Tianyu","year":"2018","unstructured":"Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, and Zhifang Sui. 2018. Table-to-text generation by structure-aware Seq2Seq learning. In AAAI, 4881\u20134888.","journal-title":"AAAI"},{"key":"e_1_3_1_25_2","first-page":"1468","article-title":"Mem2Seq: Effectively incorporating knowledge bases into end-to-end task-oriented dialog systems","author":"Madotto Andrea","year":"2018","unstructured":"Andrea Madotto, Chien-Sheng Wu, and Pascale Fung. 2018. Mem2Seq: Effectively incorporating knowledge bases into end-to-end task-oriented dialog systems. In ACL, 1468\u20131478.","journal-title":"ACL"},{"key":"e_1_3_1_26_2","first-page":"165","article-title":"Structured fusion networks for dialog","author":"Mehri Shikib","year":"2019","unstructured":"Shikib Mehri, Tejas Srinivasan, and Maxine Eskenazi. 2019. Structured fusion networks for dialog. In SIGdial, 165\u2013177.","journal-title":"SIGdial"},{"key":"e_1_3_1_27_2","first-page":"3111","volume-title":"NeurIPS","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. 
In NeurIPS, 3111\u20133119."},{"key":"e_1_3_1_28_2","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et\u00a0al. 2022. Training language models to follow instructions with human feedback. In NeurIPS, Vol. 35, 27730\u201327744.","journal-title":"NeurIPS"},{"key":"e_1_3_1_29_2","first-page":"311","article-title":"BLEU: A method for automatic evaluation of machine translation","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In ACL, 311\u2013318.","journal-title":"ACL"},{"key":"e_1_3_1_30_2","unstructured":"Jiahuan Pei Pengjie Ren and Maarten de Rijke. 2019. A modular task-oriented dialogue system using a neural mixture-of-experts. arXiv:1907.05346. Retrieved from https:\/\/arxiv.org\/abs\/1907.05346"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"issue":"8","key":"e_1_3_1_32_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","unstructured":"Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv:1908.10084. 
Retrieved from https:\/\/arxiv.org\/abs\/1908.10084","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"key":"e_1_3_1_35_2","first-page":"1715","article-title":"Neural machine translation of rare words with subword units","author":"Sennrich Rico","year":"2016","unstructured":"Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In ACL, 1715\u20131725.","journal-title":"ACL"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3624989"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3475872"},{"key":"e_1_3_1_38_2","unstructured":"Ilya Sutskever Oriol Vinyals and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. arXiv:1409.3215. Retrieved from https:\/\/arxiv.org\/abs\/1409.3215"},{"key":"e_1_3_1_39_2","first-page":"6105","article-title":"EfficientNet: Rethinking model scaling for convolutional neural networks","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In ICML, 6105\u20136114.","journal-title":"ICML"},{"key":"e_1_3_1_40_2","unstructured":"Hugo Touvron Louis Martin Kevin Stone Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale et\u00a0al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv:2307.09288. Retrieved from https:\/\/arxiv.org\/abs\/2307.09288"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-023-08409-z"},{"key":"e_1_3_1_42_2","unstructured":"Jianhong Wang Yuan Zhang Tae-Kyun Kim and Yunjie Gu. 2020. Modelling hierarchical structure between dialogue policy and natural language generator with option framework for task-oriented dialogue system. arXiv:2006.06814. 
Retrieved from https:\/\/arxiv.org\/abs\/2006.06814"},{"key":"e_1_3_1_43_2","unstructured":"Kai Wang Junfeng Tian and Wang. 2020. Multi-domain dialogue acts and response co-generation. arXiv:2004.12363. Retrieved from https:\/\/arxiv.org\/abs\/2004.12363"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.638"},{"key":"e_1_3_1_45_2","first-page":"1711","article-title":"Semantically conditioned LSTM-based natural language generation for spoken dialogue systems","author":"Wen T. H.","year":"2015","unstructured":"T. H. Wen, M. Ga\u0161i\u0107, N. Mrk\u0161i\u0107, P. H. Su, D. Vandyke, and S. Young. 2015. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. In EMNLP, 1711\u20131721.","journal-title":"EMNLP"},{"key":"e_1_3_1_46_2","first-page":"38","article-title":"Transformers: State-of-the-art natural language processing","author":"Wolf Thomas","year":"2020","unstructured":"Thomas Wolf, Julien Chaumond, Lysandre Debut, Victor Sanh, Clement Delangue, Anthony Moi, Pierric Cistac, Morgan Funtowicz, Joe Davison, Sam Shleifer, et\u00a0al. 2020. Transformers: State-of-the-art natural language processing. In EMNLP, 38\u201345.","journal-title":"EMNLP"},{"key":"e_1_3_1_47_2","unstructured":"Chien-Sheng Wu Richard Socher and Caiming Xiong. 2018. Global-to-local memory pointer networks for task-oriented dialogue. arXiv:1901.04713. Retrieved from https:\/\/arxiv.org\/abs\/1901.04713"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532063"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3384675"},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"Yizhe Zhang Siqi Sun Michel Galley Yen-Chun Chen Chris Brockett Xiang Gao Jianfeng Gao Jingjing Liu and Bill Dolan. 2019. DialoGPT: Large-scale generative pre-training for conversational response generation. arXiv:1911.00536. 
Retrieved from https:\/\/arxiv.org\/abs\/1911.00536","DOI":"10.18653\/v1\/2020.acl-demos.30"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3714474","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3714474","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:56Z","timestamp":1750295876000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3714474"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,17]]},"references-count":49,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,4,30]]}},"alternative-id":["10.1145\/3714474"],"URL":"https:\/\/doi.org\/10.1145\/3714474","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,17]]},"assertion":[{"value":"2023-12-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-01-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}