{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T19:10:48Z","timestamp":1781723448967,"version":"3.54.5"},"publisher-location":"New York, NY, USA","reference-count":45,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T00:00:00Z","timestamp":1634515200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,18]]},"DOI":"10.1145\/3462244.3479909","type":"proceedings-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T15:01:58Z","timestamp":1634310118000},"page":"71-79","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["ViCA: Combining visual, social, and task-oriented conversational AI in a healthcare setting"],"prefix":"10.1145","author":[{"given":"George","family":"Pantazopoulos","sequence":"first","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jeremy","family":"Bruyere","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Malvina","family":"Nikandrou","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Thibaud","family":"Boissier","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Supun","family":"Hemanthage","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Binha Kumar","family":"Sachish","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Vidyul","family":"Shah","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christian","family":"Dondrup","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Oliver","family":"Lemon","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom and Alana AI, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2021,10,18]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343413.3377974"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/RO-MAN46459.2019.8956300"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"crossref","unstructured":"Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould and Lei Zhang. 2018. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. In CVPR.  Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould and Lei Zhang. 2018. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. In CVPR.","DOI":"10.1109\/CVPR.2018.00636"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300484"},{"key":"e_1_3_2_2_5_1","unstructured":"Laura Aymerich-Franch and Iliana Ferrer. 2020. The implementation of social robots during the COVID-19 pandemic. arXiv preprint arXiv:2007.03941(2020).  Laura Aymerich-Franch and Iliana Ferrer. 2020. The implementation of social robots during the COVID-19 pandemic. arXiv preprint arXiv:2007.03941(2020)."},{"issue":"1","key":"e_1_3_2_2_6_1","first-page":"71","article-title":"Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots. International journal of social robotics. Dordrecht","volume":"1","author":"Bartneck Christoph","year":"2009","unstructured":"Christoph Bartneck , Dana Kuli\u0107 , Dana Kuli\u0107 , Elizabeth Croft , Elizabeth Croft , Susana Zoghbi , and Susana Zoghbi . 2009 . Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots. International journal of social robotics. Dordrecht : Springer Netherlands , 1 ( 1 ), pp. 71 \u2013 81 (2009). Christoph Bartneck, Dana Kuli\u0107, Dana Kuli\u0107, Elizabeth Croft, Elizabeth Croft, Susana Zoghbi, and Susana Zoghbi. 2009. Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots. International journal of social robotics. Dordrecht: Springer Netherlands, 1(1), pp. 71\u201381 (2009).","journal-title":"Springer Netherlands"},{"key":"e_1_3_2_2_7_1","volume-title":"Laura M\u00a0Pfeifer Vardoulakis, and Daniel Schulman","author":"Bickmore W","year":"2013","unstructured":"Timothy\u00a0 W Bickmore , Laura M\u00a0Pfeifer Vardoulakis, and Daniel Schulman . 2013 . Tinker: a relational agent museum guide. Autonomous agents and multi-agent systems 27, 2 (2013), 254\u2013276. Timothy\u00a0W Bickmore, Laura M\u00a0Pfeifer Vardoulakis, and Daniel Schulman. 2013. Tinker: a relational agent museum guide. Autonomous agents and multi-agent systems 27, 2 (2013), 254\u2013276."},{"key":"e_1_3_2_2_8_1","volume-title":"Rasa: Open Source Language Understanding and Dialogue Management. CoRR abs\/1712.05181(2017). arxiv:1712.05181","author":"Bocklisch Tom","year":"2017","unstructured":"Tom Bocklisch , Joey Faulkner , Nick Pawlowski , and Alan Nichol . 2017 . Rasa: Open Source Language Understanding and Dialogue Management. CoRR abs\/1712.05181(2017). arxiv:1712.05181 Tom Bocklisch, Joey Faulkner, Nick Pawlowski, and Alan Nichol. 2017. Rasa: Open Source Language Understanding and Dialogue Management. CoRR abs\/1712.05181(2017). arxiv:1712.05181"},{"key":"e_1_3_2_2_9_1","unstructured":"Antoine Bordes Y-Lan Boureau and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683(2016).  Antoine Bordes Y-Lan Boureau and Jason Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683(2016)."},{"key":"e_1_3_2_2_10_1","volume-title":"Diet: Lightweight language understanding for dialogue systems. arXiv preprint arXiv:2004.09936(2020).","author":"Bunk Tanja","year":"2020","unstructured":"Tanja Bunk , Daksh Varshneya , Vladimir Vlasov , and Alan Nichol . 2020 . Diet: Lightweight language understanding for dialogue systems. arXiv preprint arXiv:2004.09936(2020). Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, and Alan Nichol. 2020. Diet: Lightweight language understanding for dialogue systems. arXiv preprint arXiv:2004.09936(2020)."},{"key":"e_1_3_2_2_11_1","volume-title":"1st Proceedings of Alexa Prize (Alexa Prize","author":"Curry Amanda Cercas","year":"2018","unstructured":"Amanda Cercas Curry , Ioannis Papaioannou , Alessandro Suglia , Shubham Agarwal , Igor Shalyminov , Xu Xinnuo , Ondrej Dusek , Arash Eshghi , Ioannis Konstas , Verena Rieser , and Oliver Lemon . 2018 . Alana v2: Entertaining and Informative Open-domain Social Dialogue using Ontologies and Entity Linking . In 1st Proceedings of Alexa Prize (Alexa Prize 2018). Amanda Cercas Curry, Ioannis Papaioannou, Alessandro Suglia, Shubham Agarwal, Igor Shalyminov, Xu Xinnuo, Ondrej Dusek, Arash Eshghi, Ioannis Konstas, Verena Rieser, and Oliver Lemon. 2018. Alana v2: Entertaining and Informative Open-domain Social Dialogue using Ontologies and Entity Linking. In 1st Proceedings of Alexa Prize (Alexa Prize 2018)."},{"key":"e_1_3_2_2_12_1","unstructured":"Kai Chen Jiaqi Wang Jiangmiao Pang Yuhang Cao Yu Xiong Xiaoxiao Li Shuyang Sun Wansen Feng Ziwei Liu Jiarui Xu Zheng Zhang Dazhi Cheng Chenchen Zhu Tianheng Cheng Qijie Zhao Buyu Li Xin Lu Rui Zhu Yue Wu Jifeng Dai Jingdong Wang Jianping Shi Wanli Ouyang Chen\u00a0Change Loy and Dahua Lin. 2019. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv preprint arXiv:1906.07155(2019).  Kai Chen Jiaqi Wang Jiangmiao Pang Yuhang Cao Yu Xiong Xiaoxiao Li Shuyang Sun Wansen Feng Ziwei Liu Jiarui Xu Zheng Zhang Dazhi Cheng Chenchen Zhu Tianheng Cheng Qijie Zhao Buyu Li Xin Lu Rui Zhu Yue Wu Jifeng Dai Jingdong Wang Jianping Shi Wanli Ouyang Chen\u00a0Change Loy and Dahua Lin. 2019. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv preprint arXiv:1906.07155(2019)."},{"key":"e_1_3_2_2_13_1","first-page":"1","article-title":"SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition","volume":"20","author":"Chen Yuntao","year":"2019","unstructured":"Yuntao Chen , Chenxia Han , Yanghao Li , Zehao Huang , Yi Jiang , Naiyan Wang , and Zhaoxiang Zhang . 2019 . SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition . Journal of Machine Learning Research 20 , 156 (2019), 1 \u2013 8 . Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Yi Jiang, Naiyan Wang, and Zhaoxiang Zhang. 2019. SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition. Journal of Machine Learning Research 20, 156 (2019), 1\u20138.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.121"},{"key":"e_1_3_2_2_15_1","unstructured":"Jeff Donahue Yangqing Jia Oriol Vinyals Judy Hoffman Ning Zhang Eric Tzeng and Trevor Darrell. 2013. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition.CoRR abs\/1310.1531(2013).  Jeff Donahue Yangqing Jia Oriol Vinyals Judy Hoffman Ning Zhang Eric Tzeng and Trevor Darrell. 2013. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition.CoRR abs\/1310.1531(2013)."},{"key":"e_1_3_2_2_16_1","volume-title":"Emora: An inquisitive social chatbot who cares for you. arXiv preprint arXiv:2009.04617(2020).","author":"Finch E","year":"2020","unstructured":"Sarah\u00a0 E Finch , James\u00a0 D Finch , Ali Ahmadvand , Xiangjue Dong , Ruixiang Qi , Harshita Sahijwani , Sergey Volokhin , Zihan Wang , Zihao Wang , Jinho\u00a0 D Choi , 2020 . Emora: An inquisitive social chatbot who cares for you. arXiv preprint arXiv:2009.04617(2020). Sarah\u00a0E Finch, James\u00a0D Finch, Ali Ahmadvand, Xiangjue Dong, Ruixiang Qi, Harshita Sahijwani, Sergey Volokhin, Zihan Wang, Zihao Wang, Jinho\u00a0D Choi, 2020. Emora: An inquisitive social chatbot who cares for you. arXiv preprint arXiv:2009.04617(2020)."},{"key":"e_1_3_2_2_17_1","volume-title":"Mummer: Socially intelligent human-robot interaction in public spaces. arXiv preprint arXiv:1909.06749(2019).","author":"Foster Mary\u00a0Ellen","year":"2019","unstructured":"Mary\u00a0Ellen Foster , Bart Craenen , Amol Deshmukh , Oliver Lemon , Emanuele Bastianelli , Christian Dondrup , Ioannis Papaioannou , Andrea Vanzo , Jean-Marc Odobez , Olivier Can\u00e9vet , 2019 . Mummer: Socially intelligent human-robot interaction in public spaces. arXiv preprint arXiv:1909.06749(2019). Mary\u00a0Ellen Foster, Bart Craenen, Amol Deshmukh, Oliver Lemon, Emanuele Bastianelli, Christian Dondrup, Ioannis Papaioannou, Andrea Vanzo, Jean-Marc Odobez, Olivier Can\u00e9vet, 2019. Mummer: Socially intelligent human-robot interaction in public spaces. arXiv preprint arXiv:1909.06749(2019)."},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.169"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3383652.3423889"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00550"},{"key":"e_1_3_2_2_21_1","volume-title":"The movielens datasets: History and context. ACM transactions on interactive intelligent systems 5, 4","author":"Harper F\u00a0Maxwell","year":"2015","unstructured":"F\u00a0Maxwell Harper and Joseph\u00a0 A Konstan . 2015. The movielens datasets: History and context. ACM transactions on interactive intelligent systems 5, 4 ( 2015 ), 1\u201319. F\u00a0Maxwell Harper and Joseph\u00a0A Konstan. 2015. The movielens datasets: History and context. ACM transactions on interactive intelligent systems 5, 4 (2015), 1\u201319."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.322"},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9197160"},{"key":"e_1_3_2_2_25_1","volume-title":"Image retrieval using scene graphs","author":"Johnson Justin","unstructured":"Justin Johnson , Ranjay Krishna , Michael Stark , Li-Jia Li , David\u00a0 A. Shamma , Michael\u00a0 S. Bernstein , and Fei-Fei Li. 2015. Image retrieval using scene graphs .. In CVPR. IEEE Computer Society , 3668\u20133678. Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David\u00a0A. Shamma, Michael\u00a0S. Bernstein, and Fei-Fei Li. 2015. Image retrieval using scene graphs.. In CVPR. IEEE Computer Society, 3668\u20133678."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"crossref","unstructured":"Zeashan\u00a0Hameed Khan Afifa Siddique and Chang\u00a0Won Lee. 2020. Robotics utilization for healthcare digitization in global COVID-19 management. International journal of environmental research and public health 17 11(2020) 3819.  Zeashan\u00a0Hameed Khan Afifa Siddique and Chang\u00a0Won Lee. 2020. Robotics utilization for healthcare digitization in global COVID-19 management. International journal of environmental research and public health 17 11(2020) 3819.","DOI":"10.3390\/ijerph17113819"},{"key":"e_1_3_2_2_27_1","volume-title":"Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations.","author":"Krishna Ranjay","year":"2016","unstructured":"Ranjay Krishna , Yuke Zhu , Oliver Groth , Justin Johnson , Kenji Hata , Joshua Kravitz , Stephanie Chen , Yannis Kalantidis , Li-Jia Li , David\u00a0 A Shamma , Michael Bernstein , and Li Fei-Fei . 2016 . Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David\u00a0A Shamma, Michael Bernstein, and Li Fei-Fei. 2016. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations."},{"key":"e_1_3_2_2_28_1","unstructured":"Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C.\u00a0Lawrence Zitnick and Piotr Doll\u00e1r. 2014. Microsoft COCO: Common Objects in Context. arXiv preprint arXiv:1405.0312.  Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C.\u00a0Lawrence Zitnick and Piotr Doll\u00e1r. 2014. Microsoft COCO: Common Objects in Context. arXiv preprint arXiv:1405.0312."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58523-5_20"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2017.8172363"},{"key":"e_1_3_2_2_31_1","volume-title":"Towards Visual Dialogue for Human-Robot Interaction. In Companion of the 2021 ACM\/IEEE International Conference on Human-Robot Interaction","author":"Part L.","year":"2021","unstructured":"Jose\u00a0 L. Part , Daniel Hern\u00e1ndez\u00a0Garc\u00eda , Yanchao Yu , Nancie Gunson , Christian Dondrup , and Oliver Lemon . 2021 . Towards Visual Dialogue for Human-Robot Interaction. In Companion of the 2021 ACM\/IEEE International Conference on Human-Robot Interaction ( Boulder, CO, USA) (HRI \u201921 Companion). Association for Computing Machinery, New York, NY, USA, 670\u2013672. Jose\u00a0L. Part, Daniel Hern\u00e1ndez\u00a0Garc\u00eda, Yanchao Yu, Nancie Gunson, Christian Dondrup, and Oliver Lemon. 2021. Towards Visual Dialogue for Human-Robot Interaction. In Companion of the 2021 ACM\/IEEE International Conference on Human-Robot Interaction (Boulder, CO, USA) (HRI \u201921 Companion). Association for Computing Machinery, New York, NY, USA, 670\u2013672."},{"key":"e_1_3_2_2_32_1","volume-title":"Conversational AI: The Science Behind the Alexa Prize. CoRR abs\/1801.03604(2018). arxiv:1801.03604","author":"Ram Ashwin","year":"2018","unstructured":"Ashwin Ram , Rohit Prasad , Chandra Khatri , Anu Venkatesh , Raefer Gabriel , Qing Liu , Jeff Nunn , Behnam Hedayatnia , Ming Cheng , Ashish Nagar , Eric King , Kate Bland , Amanda Wartick , Yi Pan , Han Song , Sk Jayadevan , Gene Hwang , and Art Pettigrue . 2018 . Conversational AI: The Science Behind the Alexa Prize. CoRR abs\/1801.03604(2018). arxiv:1801.03604 Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, and Art Pettigrue. 2018. Conversational AI: The Science Behind the Alexa Prize. CoRR abs\/1801.03604(2018). arxiv:1801.03604"},{"key":"e_1_3_2_2_33_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. cite arxiv:1506.01497Comment: Extended tech report.  Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. cite arxiv:1506.01497Comment: Extended tech report."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_2_35_1","unstructured":"Kurt Shuster Eric\u00a0Michael Smith Da Ju and Jason Weston. 2020. Multi-Modal Open-Domain Dialogue. arXiv preprint arXiv:2010.01082(2020).  Kurt Shuster Eric\u00a0Michael Smith Da Ju and Jason Weston. 2020. Multi-Modal Open-Domain Dialogue. arXiv preprint arXiv:2010.01082(2020)."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1020"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.24251\/HICSS.2018.133"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00377"},{"key":"e_1_3_2_2_39_1","unstructured":"Vladimir Vlasov Johannes\u00a0EM Mosig and Alan Nichol. 2019. Dialogue transformers. arXiv preprint arXiv:1910.00486(2019).  Vladimir Vlasov Johannes\u00a0EM Mosig and Alan Nichol. 2019. Dialogue transformers. arXiv preprint arXiv:1910.00486(2019)."},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2019.00067"},{"key":"e_1_3_2_2_41_1","unstructured":"Yuxin Wu Alexander Kirillov Francisco Massa Wan-Yen Lo and Ross Girshick. 2019. Detectron2.  Yuxin Wu Alexander Kirillov Francisco Massa Wan-Yen Lo and Ross Girshick. 2019. Detectron2."},{"key":"e_1_3_2_2_42_1","unstructured":"Danfei Xu Yuke Zhu Christopher\u00a0B. Choy and Li Fei-Fei. 2017. Scene Graph Generation by Iterative Message Passing.CoRR abs\/1701.02426(2017).  Danfei Xu Yuke Zhu Christopher\u00a0B. Choy and Li Fei-Fei. 2017. Scene Graph Generation by Iterative Message Passing.CoRR abs\/1701.02426(2017)."},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.23919\/SICE.2017.8105748"},{"key":"e_1_3_2_2_44_1","unstructured":"Zhou Yu Alan\u00a0W Black and Alexander\u00a0I Rudnicky. 2017. Learning conversational systems that interleave task and non-task content. arXiv preprint arXiv:1703.00099(2017).  Zhou Yu Alan\u00a0W Black and Alexander\u00a0I Rudnicky. 2017. Learning conversational systems that interleave task and non-task content. arXiv preprint arXiv:1703.00099(2017)."},{"key":"e_1_3_2_2_45_1","volume-title":"ECCV (14)(Lecture Notes in Computer Science, Vol.\u00a012359), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.)","author":"Zhong Yiwu","unstructured":"Yiwu Zhong , Liwei Wang , Jianshu Chen , Dong Yu , and Yin Li. 2020. Comprehensive Image Captioning via Scene Graph Decomposition .. In ECCV (14)(Lecture Notes in Computer Science, Vol.\u00a012359), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.) . Springer , 211\u2013229. Yiwu Zhong, Liwei Wang, Jianshu Chen, Dong Yu, and Yin Li. 2020. Comprehensive Image Captioning via Scene Graph Decomposition.. In ECCV (14)(Lecture Notes in Computer Science, Vol.\u00a012359), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). Springer, 211\u2013229."}],"event":{"name":"ICMI '21: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","location":"Montr\u00e9al QC Canada","acronym":"ICMI '21","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 2021 International Conference on Multimodal Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3462244.3479909","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3462244.3479909","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:54Z","timestamp":1750193334000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3462244.3479909"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,18]]},"references-count":45,"alternative-id":["10.1145\/3462244.3479909","10.1145\/3462244"],"URL":"https:\/\/doi.org\/10.1145\/3462244.3479909","relation":{},"subject":[],"published":{"date-parts":[[2021,10,18]]},"assertion":[{"value":"2021-10-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}