{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,23]],"date-time":"2025-08-23T05:25:12Z","timestamp":1755926712198,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":43,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T00:00:00Z","timestamp":1665360000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,10]]},"DOI":"10.1145\/3503161.3548150","type":"proceedings-article","created":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T15:43:12Z","timestamp":1665416592000},"page":"4665-4673","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Image Understanding by Captioning with Differentiable Architecture Search"],"prefix":"10.1145","author":[{"given":"Ramtin","family":"Hosseini","sequence":"first","affiliation":[{"name":"University of California, San Diego, San Diego, CA, USA"}]},{"given":"Pengtao","family":"Xie","sequence":"additional","affiliation":[{"name":"University of California, San Diego, San Diego, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2022,10,10]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46454-1_24"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00636"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.3390\/app12031638"},{"key":"e_1_3_2_2_4_1","volume-title":"Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. Association for Computational Linguistics","author":"Banerjee Satanjeev","year":"2005","unstructured":"Satanjeev Banerjee and Alon Lavie . 2005 . METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments . In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. Association for Computational Linguistics , Ann Arbor, Michigan, 65--72. https:\/\/aclanthology.org\/W05-0909 Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. Association for Computational Linguistics, Ann Arbor, Michigan, 65--72. https:\/\/aclanthology.org\/W05-0909"},{"key":"e_1_3_2_2_5_1","first-page":"876","article-title":"Data: Differentiable architecture approximation","volume":"32","author":"Chang Jianlong","year":"2019","unstructured":"Jianlong Chang , Yiwen Guo , GAOFENG MENG, SHIMING XIANG , Chunhong Pan , 2019 . Data: Differentiable architecture approximation . Advances in Neural Information Processing Systems 32 (2019), 876 -- 886 . Jianlong Chang, Yiwen Guo, GAOFENG MENG, SHIMING XIANG, Chunhong Pan, et al. 2019. Data: Differentiable architecture approximation. Advances in Neural Information Processing Systems 32 (2019), 876--886.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2015.2477044"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01059"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_2_9_1","volume-title":"Learning from Mistakes--A Framework for Neural Architecture Search. arXiv preprint arXiv:2111.06353","author":"Garg Bhanu","year":"2021","unstructured":"Bhanu Garg , Li Zhang , Pradyumna Sridhara , Ramtin Hosseini , Eric Xing , and Pengtao Xie . 2021. Learning from Mistakes--A Framework for Neural Architecture Search. arXiv preprint arXiv:2111.06353 ( 2021 ). Bhanu Garg, Li Zhang, Pradyumna Sridhara, Ramtin Hosseini, Eric Xing, and Pengtao Xie. 2021. Learning from Mistakes--A Framework for Neural Architecture Search. arXiv preprint arXiv:2111.06353 (2021)."},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01213"},{"key":"e_1_3_2_2_11_1","volume-title":"Learning by Self-Explanation, with Application to Neural Architecture Search. arXiv preprint arXiv:2012.12899","author":"Hosseini Ramtin","year":"2020","unstructured":"Ramtin Hosseini and Pengtao Xie . 2020. Learning by Self-Explanation, with Application to Neural Architecture Search. arXiv preprint arXiv:2012.12899 ( 2020 ). Ramtin Hosseini and Pengtao Xie. 2020. Learning by Self-Explanation, with Application to Neural Architecture Search. arXiv preprint arXiv:2012.12899 (2020)."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00613"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00473"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_31"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_2_16_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba . 2014 . Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014). Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Ranjay Krishna Yuke Zhu Oliver Groth Justin Johnson Kenji Hata Joshua Kravitz Stephanie Chen Yannis Kalantidis Li-Jia Li David A Shamma etal 2017. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International journal of computer vision 123 1 (2017) 32--73.  Ranjay Krishna Yuke Zhu Oliver Groth Justin Johnson Kenji Hata Joshua Kravitz Stephanie Chen Yannis Kalantidis Li-Jia Li David A Shamma et al. 2017. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International journal of computer vision 123 1 (2017) 32--73.","DOI":"10.1007\/s11263-016-0981-7"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58577-8_8"},{"key":"e_1_3_2_2_19_1","volume-title":"Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74--81.","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin . 2004 . Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74--81. Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74--81."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_2_21_1","unstructured":"Hanxiao Liu Karen Simonyan Oriol Vinyals Chrisantha Fernando and Koray Kavukcuoglu. 2018. Hierarchical Representations for Efficient Architecture Search. In ICLR.  Hanxiao Liu Karen Simonyan Oriol Vinyals Chrisantha Fernando and Koray Kavukcuoglu. 2018. Hierarchical Representations for Efficient Architecture Search. In ICLR."},{"key":"e_1_3_2_2_22_1","volume-title":"Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055","author":"Liu Hanxiao","year":"2018","unstructured":"Hanxiao Liu , Karen Simonyan , and Yiming Yang . 2018 . Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018). Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018)."},{"key":"e_1_3_2_2_23_1","volume-title":"Explain images with multimodal recurrent neural networks. arXiv preprint arXiv:1410.1090","author":"Mao Junhua","year":"2014","unstructured":"Junhua Mao , Wei Xu , Yi Yang , JiangWang, and Alan L Yuille . 2014. Explain images with multimodal recurrent neural networks. arXiv preprint arXiv:1410.1090 ( 2014 ). Junhua Mao,Wei Xu, Yi Yang, JiangWang, and Alan L Yuille. 2014. Explain images with multimodal recurrent neural networks. arXiv preprint arXiv:1410.1090 (2014)."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01098"},{"key":"e_1_3_2_2_25_1","volume-title":"Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni , Salim Roukos , Todd Ward , and Wei-Jing Zhu . 2002 . Bleu: a method for automatic evaluation of machine translation . In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318 . Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics. 311--318."},{"key":"e_1_3_2_2_26_1","unstructured":"Hieu Pham Melody Y. Guan Barret Zoph Quoc V. Le and Jeff Dean. 2018. Efficient Neural Architecture Search via Parameter Sharing. In ICML.  Hieu Pham Melody Y. Guan Barret Zoph Quoc V. Le and Jeff Dean. 2018. Efficient Neural Architecture Search via Parameter Sharing. In ICML."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00856"},{"key":"e_1_3_2_2_28_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence 33","author":"Real Esteban","year":"2019","unstructured":"Esteban Real , Alok Aggarwal , Yanping Huang , and Quoc V. Le . 2019. Regularized Evolution for Image Classifier Architecture Search . Proceedings of the AAAI Conference on Artificial Intelligence 33 , 01 ( July 2019 ), 4780--4789. https:\/\/doi.org\/10.1609\/aaai.v33i01.33014780 Number : 01. Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V. Le. 2019. Regularized Evolution for Image Classifier Architecture Search. Proceedings of the AAAI Conference on Artificial Intelligence 33, 01 (July 2019), 4780--4789. https:\/\/doi.org\/10.1609\/aaai.v33i01.33014780 Number: 01."},{"key":"e_1_3_2_2_29_1","volume-title":"Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 28","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren , Kaiming He , Ross Girshick , and Jian Sun . 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 28 ( 2015 ), 91--99. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 28 (2015), 91--99."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.131"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299087"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"e_1_3_2_2_33_1","volume-title":"PC-DARTS: Partial channel connections for memoryefficient architecture search. arXiv preprint arXiv:1907.05737","author":"Xu Yuhui","year":"2019","unstructured":"Yuhui Xu , Lingxi Xie , Xiaopeng Zhang , Xin Chen , Guo-Jun Qi , Qi Tian , and Hongkai Xiong . 2019. PC-DARTS: Partial channel connections for memoryefficient architecture search. arXiv preprint arXiv:1907.05737 ( 2019 ). Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, and Hongkai Xiong. 2019. PC-DARTS: Partial channel connections for memoryefficient architecture search. arXiv preprint arXiv:1907.05737 (2019)."},{"key":"e_1_3_2_2_34_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=BJlS634tPr","author":"Xu Yuhui","year":"2020","unstructured":"Yuhui Xu , Lingxi Xie , Xiaopeng Zhang , Xin Chen , Guo-Jun Qi , Qi Tian , and Hongkai Xiong . 2020 . {PC}-{DARTS}: Partial Channel Connections for Memory-Efficient Architecture Search . In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=BJlS634tPr Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, and Hongkai Xiong. 2020. {PC}-{DARTS}: Partial Channel Connections for Memory-Efficient Architecture Search. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=BJlS634tPr"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01094"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_42"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.524"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.503"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2021.104126"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00553"},{"key":"e_1_3_2_2_41_1","volume-title":"AutoCaption: Image Captioning with Neural Architecture Search. arXiv preprint arXiv:2012.09742","author":"Zhu Xinxin","year":"2020","unstructured":"Xinxin Zhu , Weining Wang , Longteng Guo , and Jing Liu . 2020. AutoCaption: Image Captioning with Neural Architecture Search. arXiv preprint arXiv:2012.09742 ( 2020 ). Xinxin Zhu, Weining Wang, Longteng Guo, and Jing Liu. 2020. AutoCaption: Image Captioning with Neural Architecture Search. arXiv preprint arXiv:2012.09742 (2020)."},{"key":"e_1_3_2_2_42_1","volume-title":"Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578","author":"Zoph Barret","year":"2016","unstructured":"Barret Zoph and Quoc V Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 ( 2016 ). Barret Zoph and Quoc V Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016)."},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"crossref","unstructured":"Barret Zoph Vijay Vasudevan Jonathon Shlens and Quoc V Le. 2018. Learning transferable architectures for scalable image recognition. In CVPR.  Barret Zoph Vijay Vasudevan Jonathon Shlens and Quoc V Le. 2018. Learning transferable architectures for scalable image recognition. In CVPR.","DOI":"10.1109\/CVPR.2018.00907"}],"event":{"name":"MM '22: The 30th ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Lisboa Portugal","acronym":"MM '22"},"container-title":["Proceedings of the 30th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548150","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3503161.3548150","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:19Z","timestamp":1750186819000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548150"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,10]]},"references-count":43,"alternative-id":["10.1145\/3503161.3548150","10.1145\/3503161"],"URL":"https:\/\/doi.org\/10.1145\/3503161.3548150","relation":{},"subject":[],"published":{"date-parts":[[2022,10,10]]},"assertion":[{"value":"2022-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}