{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T21:55:03Z","timestamp":1775253303906,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":52,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,10,15]],"date-time":"2018-10-15T00:00:00Z","timestamp":1539561600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001381","name":"National Research Foundation Singapore","doi-asserted-by":"publisher","award":["IRC@Singapore Funding Initiative"],"award-info":[{"award-number":["IRC@Singapore Funding Initiative"]}],"id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,10,15]]},"DOI":"10.1145\/3240508.3240605","type":"proceedings-article","created":{"date-parts":[[2018,10,18]],"date-time":"2018-10-18T13:52:08Z","timestamp":1539870728000},"page":"801-809","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":99,"title":["Knowledge-aware Multimodal Dialogue Systems"],"prefix":"10.1145","author":[{"given":"Lizi","family":"Liao","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"given":"Yunshan","family":"Ma","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"given":"Xiangnan","family":"He","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]},{"given":"Richang","family":"Hong","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, Hefei, China"}]},{"given":"Tat-Seng","family":"Chua","sequence":"additional","affiliation":[{"name":"National 
University of Singapore, Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2018,10,15]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Sungjin Ahn Heeyoul Choi Tanel P\u00e4rnamaa and Yoshua Bengio. 2016. A neural knowledge language model. arXiv preprint arXiv:1608.00318 (2016)."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.279"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"S\u00f6ren Auer Christian Bizer Georgi Kobilarov Jens Lehmann Richard Cyganiak and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. The semantic web. 722--735.","DOI":"10.1007\/978-3-540-76298-0_52"},{"key":"e_1_3_2_1_4_1","unstructured":"Dzmitry Bahdanau Philemon Brakel Kelvin Xu Anirudh Goyal Ryan Lowe Joelle Pineau Aaron Courville and Yoshua Bengio. 2017. An actor-critic algorithm for sequence prediction. ICLR."},{"key":"e_1_3_2_1_5_1","unstructured":"Antoine Bordes Y-Lan Boureau and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. In ICLR."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Antoine Bordes Jason Weston Ronan Collobert Yoshua Bengio et al. 2011. Learning Structured Embeddings of Knowledge Bases. AAAI. 
301--306.","DOI":"10.1609\/aaai.v25i1.7917"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3166054.3166058"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Abhishek Das Satwik Kottur Khushi Gupta Avi Singh Deshraj Yadav Jos\u00e9 MF Moura Devi Parikh and Dhruv Batra. 2017. Visual Dialog. In CVPR. 1080--1089.","DOI":"10.1109\/CVPR.2017.121"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Harm De Vries Florian Strub Sarath Chandar Olivier Pietquin Hugo Larochelle and Aaron Courville. 2017. GuessWhat?! Visual object discovery through multi-modal dialogue. In CVPR. 5503--5512.","DOI":"10.1109\/CVPR.2017.475"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Bhuwan Dhingra Lihong Li Xiujun Li Jianfeng Gao Yun-Nung Chen Faisal Ahmed and Li Deng. 2017. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access. In ACL. 484--495.","DOI":"10.18653\/v1\/P17-1045"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Mihail Eric Lakshmi Krishnan Francois Charette and Christopher D Manning. 2017. Key-Value Retrieval Networks for Task-Oriented Dialogue. In SIGDIAL. 
37--49.","DOI":"10.18653\/v1\/W17-5506"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Marjan Ghazvininejad Chris Brockett Ming-Wei Chang Bill Dolan Jianfeng Gao Wen-tau Yih and Michel Galley. 2018. A knowledge-grounded neural conversation model. In AAAI. 5110--5117.","DOI":"10.1609\/aaai.v32i1.11977"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Yash Goyal Tejas Khot Douglas Summers-Stay Dhruv Batra and Devi Parikh. 2017. Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering. In CVPR. 6904--6913.","DOI":"10.1109\/CVPR.2017.670"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Kelvin Guu John Miller and Percy Liang. 2015. Traversing Knowledge Graphs in Vector Space. In EMNLP. 318--327.","DOI":"10.18653\/v1\/D15-1038"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Sangdo Han Jeesoo Bang Seonghan Ryu and Gary Geunbae Lee. 2015. Exploiting knowledge base to generate responses for natural language dialog listening agents. In SIGDIAL. 129--133.","DOI":"10.18653\/v1\/W15-4616"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052569"},{"key":"e_1_3_2_1_17_1","unstructured":"Sungjin Lee and Maxine Eskenazi. 2013. 
Recipe for building robust spoken dialog state trackers: Dialog state tracking challenge system description. In SIGDIAL. 414--422."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Jiwei Li Will Monroe Alan Ritter Dan Jurafsky Michel Galley and Jianfeng Gao. 2016. Deep Reinforcement Learning for Dialogue Generation. In EMNLP. 1192--1202.","DOI":"10.18653\/v1\/D16-1127"},{"key":"e_1_3_2_1_19_1","unstructured":"Xiujun Li Yun-Nung Chen Lihong Li Jianfeng Gao and Asli Celikyilmaz. 2017. End-to-End Task-Completion Neural Dialogue Systems. In IJCNLP. 733--743."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240508.3240646"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240508.3241399"},{"key":"e_1_3_2_1_22_1","unstructured":"Siqi Liu Zhenhai Zhu Ning Ye Sergio Guadarrama and Kevin Murphy. n.d. Optimization of image description metrics using policy gradient methods. In ICCV. 873--881."},{"key":"e_1_3_2_1_23_1","unstructured":"Jiasen Lu Caiming Xiong Devi Parikh and Richard Socher. 2017. Knowing when to look: Adaptive attention via a visual sentinel for image captioning. 
375--383."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Alexander Miller Adam Fisch Jesse Dodge Amir-Hossein Karimi Antoine Bordes and Jason Weston. 2016. Key-Value Memory Networks for Directly Reading Documents. In EMNLP. 1400--1409.","DOI":"10.18653\/v1\/D16-1147"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_3_2_1_26_1","unstructured":"Tom M Mitchell William W Cohen Estevam R Hruschka Jr Partha Pratim Talukdar Justin Betteridge Andrew Carlson Bhavana Dalvi Mishra Matthew Gardner Bryan Kisiel Jayant Krishnamurthy et al. 2015. Never Ending Learning. In AAAI. 2302--2310."},{"key":"e_1_3_2_1_27_1","unstructured":"Nasrin Mostafazadeh Chris Brockett Bill Dolan Michel Galley Jianfeng Gao Georgios Spithourakis and Lucy Vanderwende. 2017. Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation. In IJCNLP. 462--472."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Nikola Mrk\u0161i\u0107 Diarmuid \u00d3 S\u00e9aghdha Tsung-Hsien Wen Blaise Thomson and Steve Young. 2017. 
Neural Belief Tracker: Data-Driven Dialogue State Tracking. In ACL. 1777--1788.","DOI":"10.18653\/v1\/P17-1163"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2015.2483592"},{"key":"e_1_3_2_1_30_1","unstructured":"Marc'Aurelio Ranzato Sumit Chopra Michael Auli and Wojciech Zaremba. 2016. Sequence level training with recurrent neural networks. In ICLR."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Amrita Saha Mitesh M Khapra and Karthik Sankaranarayanan. 2018. Towards Building Large Scale Multimodal Domain-Aware Conversation Systems. In AAAI. 696--704.","DOI":"10.1609\/aaai.v32i1.11331"},{"key":"e_1_3_2_1_32_1","unstructured":"Iulian V Serban Alessandro Sordoni Yoshua Bengio Aaron Courville and Joelle Pineau. 2015. Hierarchical Neural Network Generative Models for Movie Dialogues. arXiv preprint arXiv:1507.04808 (2015)."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Iulian Vlad Serban Alessandro Sordoni Yoshua Bengio Aaron C Courville and Joelle Pineau. 2016. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models. In AAAI. 
3776--3784.","DOI":"10.1609\/aaai.v30i1.9883"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Iulian Vlad Serban Alessandro Sordoni Ryan Lowe Laurent Charlin Joelle Pineau Aaron Courville and Yoshua Bengio. 2017. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues. In AAAI.","DOI":"10.1609\/aaai.v31i1.10983"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Alessandro Sordoni Michel Galley Michael Auli Chris Brockett Yangfeng Ji Margaret Mitchell Jian-Yun Nie Jianfeng Gao and Bill Dolan. 2015. A Neural Network Approach to Context-Sensitive Generation of Conversational Responses. In NAACL. 196--205.","DOI":"10.3115\/v1\/N15-1020"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Florian Strub Harm De Vries Jeremie Mary Bilal Piot Aaron Courville and Olivier Pietquin. 2017. End-to-end optimization of goal-driven and visually grounded dialogue systems. In IJCAI. 2765--2771.","DOI":"10.24963\/ijcai.2017\/385"},{"key":"e_1_3_2_1_37_1","unstructured":"Richard S Sutton David A McAllester Satinder P Singh and Yishay Mansour. 2000. 
Policy gradient methods for reinforcement learning with function approximation. In NIPS. 1057--1063."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2006.06.001"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Oriol Vinyals Alexander Toshev Samy Bengio and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. CVPR. 3156--3164.","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"crossref","unstructured":"TH Wen D Vandyke N Mrk\u0161i\u0107 M Ga\u0161i\u0107 LM Rojas-Barahona PH Su S Ultes and S Young. 2017. A network-based end-to-end trainable task-oriented dialogue system. In EACL. 438--449.","DOI":"10.18653\/v1\/E17-1042"},{"key":"e_1_3_2_1_41_1","unstructured":"Jason Weston Sumit Chopra and Antoine Bordes. 2015. Memory Networks. In ICLR."},{"key":"e_1_3_2_1_42_1","unstructured":"Jason D Williams and Geoffrey Zweig. 2016. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning. arXiv preprint arXiv:1606.01269 (2016)."},{"key":"e_1_3_2_1_43_1","unstructured":"Ronald J Williams. 199"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","unstructured":"
Qi Wu Chunhua Shen Lingqiao Liu Anthony Dick and Anton van den Hengel. 2016. What value do explicit high level concepts have in vision to language problems? In CVPR. 203--212.","DOI":"10.1109\/CVPR.2016.29"},{"key":"e_1_3_2_1_45_1","unstructured":"Kelvin Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhudinov Rich Zemel and Yoshua Bengio. 2015. Show attend and tell: Neural image caption generation with visual attention. In ICML. 2048--2057."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Zhao Yan Nan Duan Peng Chen Ming Zhou Jianshe Zhou and Zhoujun Li. 2017. Building Task-Oriented Dialogue Systems for Online Shopping. In AAAI. 4618--4626.","DOI":"10.1609\/aaai.v31i1.11182"},{"key":"e_1_3_2_1_47_1","unstructured":"Bishan Yang Wen-tau Yih Xiaodong He Jianfeng Gao and Li Deng. 2015. Embedding entities and relations for learning and inference in knowledge bases. In ICLR."},{"key":"e_1_3_2_1_48_1","unstructured":"Wojciech Zaremba and Ilya Sutskever. 2015. Reinforcement learning neural turing machines-revised. 
arXiv preprint arXiv:1505.00521 (2015)."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2502081.2502093"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"crossref","unstructured":"Peng Zhang Yash Goyal Douglas Summers-Stay Dhruv Batra and Devi Parikh. 2016. Yin and yang: Balancing and answering binary visual questions. In CVPR. 5014--5022.","DOI":"10.1109\/CVPR.2016.542"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"crossref","unstructured":"Tiancheng Zhao and Maxine Eskenazi. 2016. Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning. In SIGDIAL. 1.","DOI":"10.18653\/v1\/W16-3601"},{"key":"e_1_3_2_1_52_1","unstructured":"Wenya Zhu Kaixiang Mo Yu Zhang Zhangbin Zhu Xuezheng Peng and Qiang Yang. 2017. Flexible End-to-End Dialogue System for Knowledge Grounded Conversation. 
arXiv preprint arXiv:1709.04264 (2017)."}],"event":{"name":"MM '18: ACM Multimedia Conference","location":"Seoul Republic of Korea","acronym":"MM '18","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 26th ACM international conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240605","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3240508.3240605","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T20:40:33Z","timestamp":1775248833000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240605"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,10,15]]},"references-count":52,"alternative-id":["10.1145\/3240508.3240605","10.1145\/3240508"],"URL":"https:\/\/doi.org\/10.1145\/3240508.3240605","relation":{},"subject":[],"published":{"date-parts":[[2018,10,15]]},"assertion":[{"value":"2018-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}