{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T23:58:05Z","timestamp":1775174285202,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,17]]},"DOI":"10.1145\/3474085.3475179","type":"proceedings-article","created":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T20:00:05Z","timestamp":1634587205000},"page":"2708-2716","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["Semi-Autoregressive Image Captioning"],"prefix":"10.1145","author":[{"given":"Xu","family":"Yan","sequence":"first","affiliation":[{"name":"Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China, Beijing, China"}]},{"given":"Zhengcong","family":"Fei","sequence":"additional","affiliation":[{"name":"Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China, Beijing, China"}]},{"given":"Zekang","family":"Li","sequence":"additional","affiliation":[{"name":"Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China, Beijing, China"}]},{"given":"Shuhui","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China, Beijing, China"}]},{"given":"Qingming","family":"Huang","sequence":"additional","affiliation":[{"name":"Key Lab of Intell. Info. Process., Inst. of Comput. 
Tech., CAS, Beijing, China, Beijing, China"}]},{"given":"Qi","family":"Tian","sequence":"additional","affiliation":[{"name":"Cloud BU, Huawei Technologies, Shenzhen, China, Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46454-1_24"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00636"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.05.080"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"e_1_3_2_2_5_1","volume-title":"Microsoft coco captions: Data collection and evaluation server. arXiv preprint arXiv:1504.00325","author":"Chen Xinlei","year":"2015","unstructured":"Xinlei Chen, Hao Fang, Tsung-Yi Lin, Ramakrishna Vedantam, Saurabh Gupta, Piotr Doll\u00e1r, and C Lawrence Zitnick. 2015. Microsoft coco captions: Data collection and evaluation server. arXiv preprint arXiv:1504.00325 (2015)."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298856"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01059"},{"key":"e_1_3_2_2_8_1","volume-title":"Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860","author":"Dai Zihang","year":"2019","unstructured":"Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V Le, and Ruslan Salakhutdinov. 2019. 
Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860 (2019)."},{"key":"e_1_3_2_2_9_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413901"},{"key":"e_1_3_2_2_11_1","volume-title":"Fast Image Caption Generation with Position Alignment. arXiv preprint arXiv:1912.06365","year":"2019","unstructured":"Zheng-cong Fei. 2019. Fast Image Caption Generation with Position Alignment. arXiv preprint arXiv:1912.06365 (2019)."},{"key":"e_1_3_2_2_12_1","volume-title":"2019 b. Masked Non-Autoregressive Image Captioning. arXiv preprint arXiv:1906.00717","author":"Gao Junlong","year":"2019","unstructured":"Junlong Gao, Xi Meng, Shiqi Wang, Xia Li, Shanshe Wang, Siwei Ma, and Wen Gao. 2019 b. Masked Non-Autoregressive Image Captioning. arXiv preprint arXiv:1906.00717 (2019)."},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018320"},{"key":"e_1_3_2_2_14_1","volume-title":"Proc. ICLR .","author":"Gu Jiatao","year":"2018","unstructured":"Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. 
Li, and Richard Socher. 2018. Non-Autoregressive Neural Machine Translation. In Proc. ICLR."},{"key":"e_1_3_2_2_15_1","volume-title":"Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning. arXiv preprint arXiv:2005.04690","author":"Guo Longteng","year":"2020","unstructured":"Longteng Guo, Jing Liu, Xinxin Zhu, Xingjian He, Jie Jiang, and Hanqing Lu. 2020. Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning. arXiv preprint arXiv:2005.04690 (2020)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01270-0_17"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00473"},{"key":"e_1_3_2_2_19_1","volume-title":"Proc. EMNLP. 1138--1149","author":"Jason Lee","year":"2018","unstructured":"Lee Jason, Mansimov Elman, Graham Neubig, and Cho Kyunghyun. 2018. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement. In Proc. EMNLP. 1138--1149."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_2_21_1","volume-title":"Sequence-level knowledge distillation. arXiv preprint arXiv:1606.07947","author":"Kim Yoon","year":"2016","unstructured":"Yoon Kim and Alexander M Rush. 2016. Sequence-level knowledge distillation. 
arXiv preprint arXiv:1606.07947 (2016)."},{"key":"e_1_3_2_2_22_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1626355.1626389"},{"key":"e_1_3_2_2_24_1","volume-title":"Proc. ACL Workshops. 74--81","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of summaries. In Proc. ACL Workshops. 74--81."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_3_2_2_26_1","volume-title":"Competence-based curriculum learning for neural machine translation. arXiv preprint arXiv:1903.09848","author":"Platanios Emmanouil Antonios","year":"2019","unstructured":"Emmanouil Antonios Platanios, Otilia Stretcu, Graham Neubig, Barnabas Poczos, and Tom M Mitchell. 2019. Competence-based curriculum learning for neural machine translation. arXiv preprint arXiv:1903.09848 (2019)."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969239.2969250"},{"key":"e_1_3_2_2_28_1","volume-title":"A Study of Non-autoregressive Model for Sequence Generation. 
arXiv preprint arXiv:2004.10454","author":"Ren Yi","year":"2020","unstructured":"Yi Ren, Jinglin Liu, Xu Tan, Sheng Zhao, Zhou Zhao, and Tie-Yan Liu. 2020. A Study of Non-autoregressive Model for Sequence Generation. arXiv preprint arXiv:2004.10454 (2020)."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.131"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969033.2969173"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299087"},{"key":"e_1_3_2_2_33_1","volume-title":"Diverse beam search: Decoding diverse solutions from neural sequence models. arXiv preprint arXiv:1610.02424","author":"Vijayakumar Ashwin K","year":"2016","unstructured":"Ashwin K Vijayakumar, Michael Cogswell, Ramprasath R Selvaraju, Qing Sun, Stefan Lee, David Crandall, and Dhruv Batra. 2016. Diverse beam search: Decoding diverse solutions from neural sequence models. arXiv preprint arXiv:1610.02424 (2016)."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1044"},{"key":"e_1_3_2_2_36_1","volume-title":"Imitation learning for non-autoregressive neural machine translation. arXiv preprint arXiv:1906.02041","author":"Wei Bingzhen","year":"2019","unstructured":"Bingzhen Wei, Mingxuan Wang, Hao Zhou, Junyang Lin, Jun Xie, and Xu Sun. 2019. 
Imitation learning for non-autoregressive neural machine translation. arXiv preprint arXiv:1906.02041 (2019)."},{"key":"e_1_3_2_2_37_1","volume-title":"Sequence-to-sequence learning as beam-search optimization. arXiv preprint arXiv:1606.02960","author":"Wiseman Sam","year":"2016","unstructured":"Sam Wiseman and Alexander M Rush. 2016. Sequence-to-sequence learning as beam-search optimization. arXiv preprint arXiv:1606.02960 (2016)."},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045118.3045336"},{"key":"e_1_3_2_2_39_1","volume-title":"Non-Autoregressive Coarse-to-Fine Video Captioning. arXiv preprint arXiv:1911.12018","author":"Yang Bang","year":"2019","unstructured":"Bang Yang, Fenglin Liu, Can Zhang, and Yuexian Zou. 2019. Non-Autoregressive Coarse-to-Fine Video Captioning. 
arXiv preprint arXiv:1911.12018 (2019)."},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_42"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00859"}],"event":{"name":"MM '21: ACM Multimedia Conference","location":"Virtual Event China","acronym":"MM '21","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 29th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475179","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3474085.3475179","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:47Z","timestamp":1750193327000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475179"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":41,"alternative-id":["10.1145\/3474085.3475179","10.1145\/3474085"],"URL":"https:\/\/doi.org\/10.1145\/3474085.3475179","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}