{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T22:56:20Z","timestamp":1773442580211,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":27,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,3]],"date-time":"2022-10-03T00:00:00Z","timestamp":1664755200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,3]]},"DOI":"10.1145\/3538641.3561491","type":"proceedings-article","created":{"date-parts":[[2022,10,20]],"date-time":"2022-10-20T22:09:43Z","timestamp":1666303783000},"page":"71-76","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["End-to-end image captioning based on reduced feature maps of deep learners pre-trained for object detection"],"prefix":"10.1145","author":[{"given":"Yan","family":"Lyu","sequence":"first","affiliation":[{"name":"The University of Aizu, Aizu-Wakamatsu, Fukushima, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiangfu","family":"Zhao","sequence":"additional","affiliation":[{"name":"The University of Aizu, Aizu-Wakamatsu, Fukushima, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yong","family":"Liu","sequence":"additional","affiliation":[{"name":"The University of Aizu, Aizu-Wakamatsu, Fukushima, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,10,20]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"International conference on machine learning. PMLR, 1597--1607","author":"Chen Ting","year":"2020","unstructured":"Ting Chen , Simon Kornblith , Mohammad Norouzi , and Geoffrey Hinton . 2020 . A simple framework for contrastive learning of visual representations . In International conference on machine learning. PMLR, 1597--1607 . Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In International conference on machine learning. PMLR, 1597--1607."},{"key":"e_1_3_2_1_2_1","volume-title":"Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297","author":"Chen Xinlei","year":"2020","unstructured":"Xinlei Chen , Haoqi Fan , Ross Girshick , and Kaiming He. 2020. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 ( 2020 ). Xinlei Chen, Haoqi Fan, Ross Girshick, and Kaiming He. 2020. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 (2020)."},{"key":"e_1_3_2_1_3_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018). Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295748"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_1_6_1","volume-title":"Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E Hinton . 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 ( 2012 ). Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2019.2914351"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_1_9_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu , Myle Ott , Naman Goyal , Jingfei Du , Mandar Joshi , Danqi Chen , Omer Levy , Mike Lewis , Luke Zettlemoyer , and Veselin Stoyanov . 2019 . Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019). Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i3.16328"},{"key":"e_1_3_2_1_11_1","volume-title":"Deep captioning with multimodal recurrent neural networks (m-rnn). arXiv preprint arXiv:1412.6632","author":"Mao Junhua","year":"2014","unstructured":"Junhua Mao , Wei Xu , Yi Yang , Jiang Wang , Zhiheng Huang , and Alan Yuille . 2014. Deep captioning with multimodal recurrent neural networks (m-rnn). arXiv preprint arXiv:1412.6632 ( 2014 ). Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, and Alan Yuille. 2014. Deep captioning with multimodal recurrent neural networks (m-rnn). arXiv preprint arXiv:1412.6632 (2014)."},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni , Salim Roukos , Todd Ward , and Wei-Jing Zhu . 2002 . Bleu: a method for automatic evaluation of machine translation . In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318 . Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318."},{"key":"e_1_3_2_1_13_1","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans Ilya Sutskever etal 2018. Improving language understanding by generative pre-training. (2018).  Alec Radford Karthik Narasimhan Tim Salimans Ilya Sutskever et al. 2018. Improving language understanding by generative pre-training. (2018)."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1866696.1866717"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_2_1_16_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.3390\/app9102024"},{"key":"e_1_3_2_1_18_1","volume-title":"Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223","author":"Sun Yu","year":"2019","unstructured":"Yu Sun , Shuohuan Wang , Yukun Li , Shikun Feng , Xuyi Chen , Han Zhang , Xin Tian , Danxiang Zhu , Hao Tian , and Hua Wu . 2019 . Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223 (2019). Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223 (2019)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6428"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.29"},{"key":"e_1_3_2_1_23_1","volume-title":"Image captioning and visual question answering based on attributes and external knowledge","author":"Wu Qi","year":"2017","unstructured":"Qi Wu , Chunhua Shen , Peng Wang , Anthony Dick , and Anton Van Den Hengel . 2017. Image captioning and visual question answering based on attributes and external knowledge . IEEE transactions on pattern analysis and machine intelligence 40, 6 ( 2017 ), 1367--1381. Qi Wu, Chunhua Shen, Peng Wang, Anthony Dick, and Anton Van Den Hengel. 2017. Image captioning and visual question answering based on attributes and external knowledge. IEEE transactions on pattern analysis and machine intelligence 40, 6 (2017), 1367--1381."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2017.8019399"},{"key":"e_1_3_2_1_25_1","volume-title":"Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems 32","author":"Yang Zhilin","year":"2019","unstructured":"Zhilin Yang , Zihang Dai , Yiming Yang , Jaime Carbonell , Russ R Salakhutdinov , and Quoc V Le . 2019 . Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems 32 (2019). Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le. 2019. Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems 32 (2019)."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00166"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10590-1_53"}],"event":{"name":"RACS '22: International Conference on Research in Adaptive and Convergent Systems","location":"Virtual Event Japan","acronym":"RACS '22","sponsor":["SIGAPP ACM Special Interest Group on Applied Computing"]},"container-title":["Proceedings of the Conference on Research in Adaptive and Convergent Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3538641.3561491","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3538641.3561491","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:03:02Z","timestamp":1750186982000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3538641.3561491"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,3]]},"references-count":27,"alternative-id":["10.1145\/3538641.3561491","10.1145\/3538641"],"URL":"https:\/\/doi.org\/10.1145\/3538641.3561491","relation":{},"subject":[],"published":{"date-parts":[[2022,10,3]]},"assertion":[{"value":"2022-10-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}