{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:15:51Z","timestamp":1750306551189,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":14,"publisher":"ACM","license":[{"start":{"date-parts":[[2015,10,13]],"date-time":"2015-10-13T00:00:00Z","timestamp":1444694400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation of China","award":["61271433"],"award-info":[{"award-number":["61271433"]}]},{"name":"Lenovo Outstanding Young Scientists Program (LOYS)"},{"name":"National Basic Research Program of China (973 Program)","award":["2011CB706900"],"award-info":[{"award-number":["2011CB706900"]}]},{"name":"National Hi-Tech Development Program (863 Program) of China","award":["2014AA015202"],"award-info":[{"award-number":["2014AA015202"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2015,10,13]]},"DOI":"10.1145\/2733373.2806338","type":"proceedings-article","created":{"date-parts":[[2016,2,26]],"date-time":"2016-02-26T19:09:21Z","timestamp":1456513761000},"page":"1315-1318","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Rich Image Description Based on Regions"],"prefix":"10.1145","author":[{"given":"Xiaodan","family":"Zhang","sequence":"first","affiliation":[{"name":"University of Chinese Academy of Sciences, City University of Hong Kong, Beijing, China"}]},{"given":"Xinhang","family":"Song","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China"}]},{"given":"Xiong","family":"Lv","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 
China"}]},{"given":"Shuqiang","family":"Jiang","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China"}]},{"given":"Qixiang","family":"Ye","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Jianbin","family":"Jiao","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2015,10,13]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"ICML","author":"Socher Richard","year":"2011","unstructured":"Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, and Christopher D. Manning. Parsing natural scenes and natural language with recursive neural networks. In ICML, 2011."},{"key":"e_1_3_2_1_2_1","volume-title":"EACL 2012, 13th Conference of the European Chapter of the Association for Computational Linguistics","author":"Daelemans Walter","year":"2012","unstructured":"Walter Daelemans, Mirella Lapata, and Llu\u00eds M\u00e0rquez, editors. EACL 2012, 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, April 23--27, 2012. 
The Association for Computer Linguistics, 2012."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_5_1","volume-title":"Framing image description as a ranking task: Data, models and evaluation metrics. J. Artif. Intell. Res. (JAIR), 47:853--899","author":"Hodosh Micah","year":"2013","unstructured":"Micah Hodosh, Peter Young, and Julia Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. J. Artif. Intell. Res. (JAIR), 47:853--899, 2013."},{"key":"e_1_3_2_1_6_1","volume-title":"Deep visual-semantic alignments for generating image descriptions. CoRR, abs\/1412.2306","author":"Karpathy Andrej","year":"2014","unstructured":"Andrej Karpathy and Fei-Fei Li. Deep visual-semantic alignments for generating image descriptions. CoRR, abs\/1412.2306, 2014."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.162"},{"key":"e_1_3_2_1_8_1","volume-title":"Microsoft COCO: common objects in context. CoRR, abs\/1405.0312","author":"Lin Tsung-Yi","year":"2014","unstructured":"Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. 
CoRR, abs\/1405.0312, 2014."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2010-343"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_3_2_1_11_1","volume-title":"Imagenet large scale visual recognition challenge. CoRR, abs\/1409.0575","author":"Russakovsky Olga","year":"2014","unstructured":"Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, and Fei-Fei Li. Imagenet large scale visual recognition challenge. CoRR, abs\/1409.0575, 2014."},{"key":"e_1_3_2_1_12_1","volume-title":"Very deep convolutional networks for large-scale image recognition. CoRR, abs\/1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs\/1409.1556, 2014."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0620-5"},{"key":"e_1_3_2_1_14_1","volume-title":"Show and tell: A neural image caption generator. CoRR, abs\/1411.4555","author":"Vinyals Oriol","year":"2014","unstructured":"Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: A neural image caption generator. 
CoRR, abs\/1411.4555, 2014."}],"event":{"name":"MM '15: ACM Multimedia Conference","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Brisbane Australia","acronym":"MM '15"},"container-title":["Proceedings of the 23rd ACM international conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2733373.2806338","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2733373.2806338","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:12:40Z","timestamp":1750227160000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2733373.2806338"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,10,13]]},"references-count":14,"alternative-id":["10.1145\/2733373.2806338","10.1145\/2733373"],"URL":"https:\/\/doi.org\/10.1145\/2733373.2806338","relation":{},"subject":[],"published":{"date-parts":[[2015,10,13]]},"assertion":[{"value":"2015-10-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}