{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:13:02Z","timestamp":1760710382387,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T00:00:00Z","timestamp":1603065600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Guangxi Natural Science Foundation","award":["2019GXNSFDA245018"],"award-info":[{"award-number":["2019GXNSFDA245018"]}]},{"name":"National Natural Science Foundation of China","award":["61966004, 61663004"],"award-info":[{"award-number":["61966004, 61663004"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,19]]},"DOI":"10.1145\/3340531.3411948","type":"proceedings-article","created":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T05:31:04Z","timestamp":1603085464000},"page":"535-544","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Image Captioning with Internal and External Knowledge"],"prefix":"10.1145","author":[{"given":"Feicheng","family":"Huang","sequence":"first","affiliation":[{"name":"Guangxi Normal University, Guilin, China"}]},{"given":"Zhixin","family":"Li","sequence":"additional","affiliation":[{"name":"Guangxi Normal University, Guilin, China"}]},{"given":"Shengjia","family":"Chen","sequence":"additional","affiliation":[{"name":"Guangxi Normal University, Guilin, China"}]},{"given":"Canlong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Guangxi Normal University, Guilin, China"}]},{"given":"Huifang","family":"Ma","sequence":"additional","affiliation":[{"name":"Northwest Normal University, Lanzhou, 
China"}]}],"member":"320","published-online":{"date-parts":[[2020,10,19]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00636"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.8"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.05.080"},{"key":"e_1_3_2_2_4_1","volume-title":"Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and\/or summarization. 65--72","author":"Banerjee Satanjeev","year":"2005","unstructured":"Satanjeev Banerjee and Alon Lavie . 2005 . METEOR: An automatic metric for MT evaluation with improved correlation with human judgments . In Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and\/or summarization. 65--72 . Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and\/or summarization. 65--72."},{"key":"e_1_3_2_2_5_1","unstructured":"Samy Bengio Oriol Vinyals Navdeep Jaitly and Noam Shazeer. 2015. Scheduled sampling for sequence prediction with recurrent neural networks. In Advances in Neural Information Processing Systems. 1171--1179.  Samy Bengio Oriol Vinyals Navdeep Jaitly and Noam Shazeer. 2015. Scheduled sampling for sequence prediction with recurrent neural networks. In Advances in Neural Information Processing Systems. 
1171--1179."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.667"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298147"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.64"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.681"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3177745"},{"key":"e_1_3_2_2_11_1","volume-title":"Mohd Fairuz Shiratuddin, and Hamid Laga","author":"Zakir Hossain MD","year":"2019","unstructured":"MD Zakir Hossain , Ferdous Sohel , Mohd Fairuz Shiratuddin, and Hamid Laga . 2019 . A comprehensive survey of deep learning for image captioning. ACM Computing Surveys (CSUR) 51, 6 (2019), 1--36. MD Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, and Hamid Laga. 2019. A comprehensive survey of deep learning for image captioning. ACM Computing Surveys (CSUR) 51, 6 (2019), 1--36."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.277"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.12283"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_31"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.494"},{"key":"e_1_3_2_2_16_1","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV). 0--0.","author":"Kim Boeun","year":"2018","unstructured":"Boeun Kim , Young Han Lee , Hyedong Jung , and Choongsang Cho . 2018 . Distinctive-attribute extraction for image captioning . In Proceedings of the European Conference on Computer Vision (ECCV). 0--0. Boeun Kim, Young Han Lee, Hyedong Jung, and Choongsang Cho. 2018. Distinctive-attribute extraction for image captioning. In Proceedings of the European Conference on Computer Vision (ECCV). 
0--0."},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298168"},{"key":"e_1_3_2_2_18_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Li Xiangyang","year":"2019","unstructured":"Xiangyang Li , Shuqiang Jiang , and Jungong Han . 2019 . Learning Object Context for Dense Captioning . In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 33 . 8650--8657. Xiangyang Li, Shuqiang Jiang, and Jungong Han. 2019. Learning Object Context for Dense Captioning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 8650--8657."},{"key":"e_1_3_2_2_19_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 12497--12506","author":"Li Yehao","year":"2019","unstructured":"Yehao Li , Ting Yao , Yingwei Pan , Hongyang Chao , and Tao Mei . 2019 . Pointing novel objects in image captioning . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 12497--12506 . Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, and Tao Mei. 2019. Pointing novel objects in image captioning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 12497--12506."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073445.1073465"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298174"},{"key":"e_1_3_2_2_22_1","volume-title":"Optimization of image description metrics using policy gradient methods. arXiv preprint arXiv:1612.003705","author":"Liu Siqi","year":"2016","unstructured":"Siqi Liu , Zhenhai Zhu , Ning Ye , Sergio Guadarrama , and Kevin Murphy . 2016. Optimization of image description metrics using policy gradient methods. arXiv preprint arXiv:1612.003705 ( 2016 ). Siqi Liu, Zhenhai Zhu, Ning Ye, Sergio Guadarrama, and Kevin Murphy. 2016. Optimization of image description metrics using policy gradient methods. 
arXiv preprint arXiv:1612.003705 (2016)."},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.100"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.345"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10475"},{"key":"e_1_3_2_2_26_1","volume-title":"Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 311--318","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni , Salim Roukos , Todd Ward , and Wei-Jing Zhu . 2002 . BLEU: a method for automatic evaluation of machine translation . In Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 311--318 . Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 311--318."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.140"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00856"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.334"},{"key":"e_1_3_2_2_30_1","unstructured":"Marc'Aurelio Ranzato Sumit Chopra Michael Auli and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732 (2015).  Marc'Aurelio Ranzato Sumit Chopra Michael Auli and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732 (2015)."},{"key":"e_1_3_2_2_31_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems. 91--99.  
Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems. 91--99."},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.128"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.131"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11164"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299087"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.780"},{"key":"e_1_3_2_2_38_1","volume-title":"Image captioning and visual question answering based on attributes and external knowledge","author":"Wu Qi","year":"2017","unstructured":"Qi Wu , Chunhua Shen , Peng Wang , Anthony Dick , and Anton van den Hengel . 2017. Image captioning and visual question answering based on attributes and external knowledge . IEEE transactions on pattern analysis and machine intelligence 40, 6 ( 2017 ), 1367--1381. Qi Wu, Chunhua Shen, Peng Wang, Anthony Dick, and Anton van den Hengel. 2017. Image captioning and visual question answering based on attributes and external knowledge. IEEE transactions on pattern analysis and machine intelligence 40, 6 (2017), 1367--1381."},{"key":"e_1_3_2_2_39_1","volume-title":"International conference on machine learning. 2048--2057","author":"Xu Kelvin","year":"2015","unstructured":"Kelvin Xu , Jimmy Ba , Ryan Kiros , Kyunghyun Cho , Aaron Courville , Ruslan Salakhudinov , Rich Zemel , and Yoshua Bengio . 2015 . Show, attend and tell: Neural image caption generation with visual attention . In International conference on machine learning. 2048--2057 . Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. 2015. 
Show, attend and tell: Neural image caption generation with visual attention. In International conference on machine learning. 2048--2057."},{"key":"e_1_3_2_2_40_1","unstructured":"Zhilin Yang Ye Yuan and Yuexin Wu. 2016. Encode review and decode: Reviewer module for caption generation. arXiv preprint arXiv:1605.07912 (2016).  Zhilin Yang Ye Yuan and Yuexin Wu. 2016. Encode review and decode: Reviewer module for caption generation. arXiv preprint arXiv:1605.07912 (2016)."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.559"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.524"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.503"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2019.00036"}],"event":{"name":"CIKM '20: The 29th ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Virtual Event Ireland","acronym":"CIKM '20"},"container-title":["Proceedings of the 29th ACM International Conference on Information & Knowledge 
Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3411948","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3340531.3411948","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:22Z","timestamp":1750197682000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3411948"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,19]]},"references-count":44,"alternative-id":["10.1145\/3340531.3411948","10.1145\/3340531"],"URL":"https:\/\/doi.org\/10.1145\/3340531.3411948","relation":{},"subject":[],"published":{"date-parts":[[2020,10,19]]},"assertion":[{"value":"2020-10-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}