{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T15:26:50Z","timestamp":1761060410083,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":17,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,6,19]],"date-time":"2017-06-19T00:00:00Z","timestamp":1497830400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,6,19]]},"DOI":"10.1145\/3095713.3095718","type":"proceedings-article","created":{"date-parts":[[2017,8,28]],"date-time":"2017-08-28T12:45:27Z","timestamp":1503924327000},"page":"1-6","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Question Part Relevance and Editing for Cooperative and Context-Aware VQA (C2VQA)"],"prefix":"10.1145","author":[{"given":"Andeep S.","family":"Toor","sequence":"first","affiliation":[{"name":"George Mason University, Department of Computer Science, Fairfax, Virginia"}]},{"given":"Harry","family":"Wechsler","sequence":"additional","affiliation":[{"name":"George Mason University, Department of Computer Science, Fairfax, Virginia"}]},{"given":"Michele","family":"Nappi","sequence":"additional","affiliation":[{"name":"Universit\u00e0 di Salerno, Dipartimento di Informatica, Fisciano, Italy"}]}],"member":"320","published-online":{"date-parts":[[2017,6,19]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.279"},{"key":"e_1_3_2_1_2_1","volume-title":"arXiv preprint arXiv:1611.08481","author":"de Vries Harm","year":"2016","unstructured":"Harm de Vries , Florian Strub , Sarath Chandar , Olivier Pietquin , Hugo Larochelle , and Aaron Courville . 2016. GuessWhat?! Visual object discovery through multi-modal dialogue. arXiv preprint arXiv:1611.08481 ( 2016 ). Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, and Aaron Courville. 2016. GuessWhat?! Visual object discovery through multi-modal dialogue. arXiv preprint arXiv:1611.08481 (2016)."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1422953112"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6638947"},{"key":"e_1_3_2_1_5_1","volume-title":"Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. https:\/\/arxiv.org\/abs\/1602.07332","author":"Krishna Ranjay","year":"2016","unstructured":"Ranjay Krishna , Yuke Zhu , Oliver Groth , Justin Johnson , Kenji Hata , Joshua Kravitz , Stephanie Chen , Yannis Kalantidis , Li-Jia Li , David A Shamma , Michael Bernstein , and Li Fei-Fei . 2016 . Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. https:\/\/arxiv.org\/abs\/1602.07332 Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A Shamma, Michael Bernstein, and Li Fei-Fei. 2016. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. https:\/\/arxiv.org\/abs\/1602.07332"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_1_7_1","unstructured":"Tomas Mikolov Ilya Sutskever Kai Chen Greg S Corrado and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. 3111--3119.  Tomas Mikolov Ilya Sutskever Kai Chen Greg S Corrado and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. 3111--3119."},{"key":"e_1_3_2_1_8_1","first-page":"1532","article-title":"Glove: Global Vectors for Word Representation","volume":"14","author":"Pennington Jeffrey","year":"2014","unstructured":"Jeffrey Pennington , Richard Socher , and Christopher D Manning . 2014 . Glove: Global Vectors for Word Representation .. In EMNLP , Vol. 14. 1532 -- 1543 . Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014. Glove: Global Vectors for Word Representation.. In EMNLP, Vol. 14. 1532--1543.","journal-title":"EMNLP"},{"key":"e_1_3_2_1_9_1","volume-title":"Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions. arXiv preprint arXiv:1606.06622","author":"Ray Arijit","year":"2016","unstructured":"Arijit Ray , Gordon Christie , Mohit Bansal , Dhruv Batra , and Devi Parikh . 2016. Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions. arXiv preprint arXiv:1606.06622 ( 2016 ). Arijit Ray, Gordon Christie, Mohit Bansal, Dhruv Batra, and Devi Parikh. 2016. Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions. arXiv preprint arXiv:1606.06622 (2016)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_11_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_12_1","volume-title":"Biometrics and forensics integration using deep multi-modal semantic alignment and joint embedding. Pattern Recognition Letters","author":"Toor Andeep S","year":"2017","unstructured":"Andeep S Toor and Harry Wechsler . 2017. Biometrics and forensics integration using deep multi-modal semantic alignment and joint embedding. Pattern Recognition Letters ( 2017 ). Andeep S Toor and Harry Wechsler. 2017. Biometrics and forensics integration using deep multi-modal semantic alignment and joint embedding. Pattern Recognition Letters (2017)."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_1_14_1","volume-title":"Dynamic memory networks for visual and textual question answering. arXiv 1603","author":"Xiong Caiming","year":"2016","unstructured":"Caiming Xiong , Stephen Merity , and Richard Socher . 2016. Dynamic memory networks for visual and textual question answering. arXiv 1603 ( 2016 ). Caiming Xiong, Stephen Merity, and Richard Socher. 2016. Dynamic memory networks for visual and textual question answering. arXiv 1603 (2016)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46478-7_28"},{"key":"e_1_3_2_1_16_1","first-page":"77","article-title":"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention","volume":"14","author":"Xu Kelvin","year":"2015","unstructured":"Kelvin Xu , Jimmy Ba , Ryan Kiros , Kyunghyun Cho , Aaron C Courville , Ruslan Salakhutdinov , Richard S Zemel , and Yoshua Bengio . 2015 . Show, Attend and Tell: Neural Image Caption Generation with Visual Attention .. In ICML , Vol. 14. 77 -- 81 . Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C Courville, Ruslan Salakhutdinov, Richard S Zemel, and Yoshua Bengio. 2015. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.. In ICML, Vol. 14. 77--81.","journal-title":"ICML"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.10"}],"event":{"name":"CBMI '17: International Workshop on Content-Based Multimedia Indexing","acronym":"CBMI '17","location":"Florence Italy"},"container-title":["Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3095713.3095718","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3095713.3095718","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:36:52Z","timestamp":1750217812000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3095713.3095718"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,6,19]]},"references-count":17,"alternative-id":["10.1145\/3095713.3095718","10.1145\/3095713"],"URL":"https:\/\/doi.org\/10.1145\/3095713.3095718","relation":{},"subject":[],"published":{"date-parts":[[2017,6,19]]},"assertion":[{"value":"2017-06-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}