{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T05:37:42Z","timestamp":1767850662886,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":23,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,6,5]],"date-time":"2019-06-05T00:00:00Z","timestamp":1559692800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,6,5]]},"DOI":"10.1145\/3323873.3325036","type":"proceedings-article","created":{"date-parts":[[2019,6,10]],"date-time":"2019-06-10T12:10:58Z","timestamp":1560168658000},"page":"187-191","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Multimodal Dialog for Browsing Large Visual Catalogs using Exploration-Exploitation Paradigm in a Joint Embedding Space"],"prefix":"10.1145","author":[{"given":"Indrani","family":"Bhattacharya","sequence":"first","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY, USA"}]},{"given":"Arkabandhu","family":"Chowdhury","sequence":"additional","affiliation":[{"name":"Rice University, Houston, TX, USA"}]},{"given":"Vikas C.","family":"Raykar","sequence":"additional","affiliation":[{"name":"IBM Research, Bangalore, India"}]}],"member":"320","published-online":{"date-parts":[[2019,6,5]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Raykar","author":"Bhattacharya Indrani","year":"2019","unstructured":"Indrani Bhattacharya , Arkabandhu Chowdhury , and Vikas C . Raykar . 2019 . Multimodal dialog for browsing large visual catalogs using exploration-exploitation paradigm in a joint embedding space. arXiv preprint arXiv:1901.09854. Indrani Bhattacharya, Arkabandhu Chowdhury, and Vikas C. Raykar. 2019. Multimodal dialog for browsing large visual catalogs using exploration-exploitation paradigm in a joint embedding space. arXiv preprint arXiv:1901.09854."},{"key":"e_1_3_2_1_2_1","volume-title":"Correlational neural networks. Neural computation","author":"Chandar Sarath","year":"2016","unstructured":"Sarath Chandar , Mitesh M Khapra , Hugo Larochelle , and Balaraman Ravindran . 2016. Correlational neural networks. Neural computation ( 2016 ). Sarath Chandar, Mitesh M Khapra, Hugo Larochelle, and Balaraman Ravindran. 2016. Correlational neural networks. Neural computation (2016)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.121"},{"key":"e_1_3_2_1_4_1","volume-title":"Stefan Lee, and Dhruv Batra.","author":"Das Abhishek","year":"2017","unstructured":"Abhishek Das , Satwik Kottur , Jos\u00e9 MF Moura , Stefan Lee, and Dhruv Batra. 2017 . Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning . arXiv preprint arXiv:1703.06585 (2017). Abhishek Das, Satwik Kottur, Jos\u00e9 MF Moura, Stefan Lee, and Dhruv Batra. 2017. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning. arXiv preprint arXiv:1703.06585 (2017)."},{"key":"e_1_3_2_1_5_1","first-page":"3","article-title":"GuessWhat?! Visual object discovery through multi-modal dialogue","volume":"1","author":"Vries Harm De","year":"2017","unstructured":"Harm De Vries , Florian Strub , Sarath Chandar , Olivier Pietquin , Hugo Larochelle , and Aaron C Courville . 2017 . GuessWhat?! Visual object discovery through multi-modal dialogue .. In CVPR , Vol. 1. 3 . Harm De Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, and Aaron C Courville. 2017. GuessWhat?! Visual object discovery through multi-modal dialogue.. In CVPR, Vol. 1. 3.","journal-title":"CVPR"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Felix A Gers J\u00fcrgen Schmidhuber and Fred Cummins. 1999. Learning to forget: Continual prediction with LSTM. (1999).  Felix A Gers J\u00fcrgen Schmidhuber and Fred Cummins. 1999. Learning to forget: Continual prediction with LSTM. (1999).","DOI":"10.1049\/cp:19991218"},{"key":"e_1_3_2_1_7_1","volume-title":"Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine","author":"Gormley Clinton","year":"2015","unstructured":"Clinton Gormley and Zachary Tong . 2015 . Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine . O'Reilly Media, Inc. Clinton Gormley and Zachary Tong. 2015. Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine. O'Reilly Media, Inc."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.382"},{"key":"e_1_3_2_1_9_1","volume-title":"Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144","author":"Jang Eric","year":"2016","unstructured":"Eric Jang , Shixiang Gu , and Ben Poole . 2016. Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144 ( 2016 ). Eric Jang, Shixiang Gu, and Ben Poole. 2016. Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144 (2016)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2788621"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3184558.3191588"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159716"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240508.3240605"},{"key":"e_1_3_2_1_14_1","volume-title":"The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909","author":"Lowe Ryan","year":"2015","unstructured":"Ryan Lowe , Nissan Pow , Iulian Serban , and Joelle Pineau . 2015. The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909 ( 2015 ). Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. 2015. The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909 (2015)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2018.8351401"},{"key":"e_1_3_2_1_16_1","volume-title":"Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov , Kai Chen , Greg Corrado , and Jeffrey Dean . 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 ( 2013 ). Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)."},{"key":"e_1_3_2_1_17_1","volume-title":"Image-grounded conversations: Multimodal context for natural question and response generation. arXiv preprint arXiv:1701.08251","author":"Mostafazadeh Nasrin","year":"2017","unstructured":"Nasrin Mostafazadeh , Chris Brockett , Bill Dolan , Michel Galley , Jianfeng Gao , Georgios P Spithourakis , and Lucy Vanderwende . 2017. Image-grounded conversations: Multimodal context for natural question and response generation. arXiv preprint arXiv:1701.08251 ( 2017 ). Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P Spithourakis, and Lucy Vanderwende. 2017. Image-grounded conversations: Multimodal context for natural question and response generation. arXiv preprint arXiv:1701.08251 (2017)."},{"key":"e_1_3_2_1_18_1","volume-title":"Towards Building Large Scale Multimodal Domain-Aware Conversation Systems. In Thirty- Second AAAI Conference on Artificial Intelligence.","author":"Saha Amrita","year":"2018","unstructured":"Amrita Saha , Mitesh M Khapra , and Karthik Sankaranarayanan . 2018 . Towards Building Large Scale Multimodal Domain-Aware Conversation Systems. In Thirty- Second AAAI Conference on Artificial Intelligence. Amrita Saha, Mitesh M Khapra, and Karthik Sankaranarayanan. 2018. Towards Building Large Scale Multimodal Domain-Aware Conversation Systems. In Thirty- Second AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_2_1_19_1","volume-title":"Learning Disentangled Multimodal Representations for the Fashion Domain. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 557- 566","author":"Saha Amrita","year":"2018","unstructured":"Amrita Saha , Megha Nawhal , Mitesh M Khapra , and Vikas C Raykar . 2018 . Learning Disentangled Multimodal Representations for the Fashion Domain. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 557- 566 . Amrita Saha, Megha Nawhal, Mitesh M Khapra, and Vikas C Raykar. 2018. Learning Disentangled Multimodal Representations for the Fashion Domain. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 557- 566."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/3016387.3016435"},{"key":"e_1_3_2_1_21_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and AndrewZisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and AndrewZisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959171"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098162"}],"event":{"name":"ICMR '19: International Conference on Multimedia Retrieval","location":"Ottawa ON Canada","acronym":"ICMR '19","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 2019 on International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323873.3325036","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3323873.3325036","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:22Z","timestamp":1750208542000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323873.3325036"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,5]]},"references-count":23,"alternative-id":["10.1145\/3323873.3325036","10.1145\/3323873"],"URL":"https:\/\/doi.org\/10.1145\/3323873.3325036","relation":{},"subject":[],"published":{"date-parts":[[2019,6,5]]},"assertion":[{"value":"2019-06-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}