{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:20:35Z","timestamp":1750220435645,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Natural Science Foundation of China (NSFC)","award":["61836012","61876045"],"award-info":[{"award-number":["61836012","61876045"]}]},{"name":"Natural Science Foundation of Guangdong Province","award":["2017A030312006"],"award-info":[{"award-number":["2017A030312006"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,12]]},"DOI":"10.1145\/3394171.3413571","type":"proceedings-article","created":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T12:27:35Z","timestamp":1602505655000},"page":"973-981","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Active Object Search"],"prefix":"10.1145","author":[{"given":"Jie","family":"Wu","sequence":"first","affiliation":[{"name":"Sun Yat-Sen University, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tianshui","family":"Chen","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University &amp; DarkMatter AI, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lishan","family":"Huang","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hefeng","family":"Wu","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guanbin","family":"Li","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ling","family":"Tian","sequence":"additional","affiliation":[{"name":"University of Electronic Science and Technology of China, chengdu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Liang","family":"Lin","sequence":"additional","affiliation":[{"name":"Sun Yat-Sen University &amp; DarkMatter AI, guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.1990.118128"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989164"},{"key":"e_1_3_2_2_3_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","volume":"2","author":"Anderson Peter","unstructured":"Peter Anderson , Qi Wu , Damien Teney , Jake Bruce , Mark Johnson , Niko S\u00fcnderhauf , Ian Reid , Stephen Gould , and Anton van den Hengel. 2018. Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , Vol. 2 . Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko S\u00fcnderhauf, Ian Reid, Stephen Gould, and Anton van den Hengel. 2018. Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vol. 2."},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.286"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2018.2870062"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2859025"},{"key":"e_1_3_2_2_7_1","volume-title":"Thirty-Second AAAI Conference on Artificial Intelligence.","author":"Chen Tianshui","year":"2018","unstructured":"Tianshui Chen , Zhouxia Wang , Guanbin Li , and Liang Lin . 2018 c. Recurrent attentional reinforcement learning for multi-label image recognition . In Thirty-Second AAAI Conference on Artificial Intelligence. Tianshui Chen, Zhouxia Wang, Guanbin Li, and Liang Lin. 2018c. Recurrent attentional reinforcement learning for multi-label image recognition. In Thirty-Second AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00061"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.691"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3351005"},{"key":"e_1_3_2_2_11_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"5","author":"Das Abhishek","year":"2018","unstructured":"Abhishek Das , Samyak Datta , Georgia Gkioxari , Stefan Lee , Devi Parikh , and Dhruv Batra . 2018 . Embodied question answering . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , Vol. 5 . 6. Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, and Dhruv Batra. 2018. Embodied question answering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 5. 6."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1993.5.4.613"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/2354409.2354978"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.169"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00430"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.769"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_20_1","volume-title":"Phil Blunsom, and Stephen Clark.","author":"Hill Felix","year":"2017","unstructured":"Felix Hill , Karl Moritz Hermann , Phil Blunsom, and Stephen Clark. 2017 . Understanding grounded language learning agents. arXiv preprint arXiv:1710.09867 (2017). Felix Hill, Karl Moritz Hermann, Phil Blunsom, and Stephen Clark. 2017. Understanding grounded language learning agents. arXiv preprint arXiv:1710.09867 (2017)."},{"key":"e_1_3_2_2_21_1","volume-title":"AI2-THOR: An interactive 3d environment for visual AI. arXiv preprint arXiv:1712.05474","author":"Kolve Eric","year":"2017","unstructured":"Eric Kolve , Roozbeh Mottaghi , Daniel Gordon , Yuke Zhu , Abhinav Gupta , and Ali Farhadi . 2017. AI2-THOR: An interactive 3d environment for visual AI. arXiv preprint arXiv:1712.05474 ( 2017 ). Eric Kolve, Roozbeh Mottaghi, Daniel Gordon, Yuke Zhu, Abhinav Gupta, and Ali Farhadi. 2017. AI2-THOR: An interactive 3d environment for visual AI. arXiv preprint arXiv:1712.05474 (2017)."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2017.8019345"},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_2_2_25_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"1","author":"Mei Hongyuan","year":"2016","unstructured":"Hongyuan Mei , Mohit Bansal , and Matthew R Walter . 2016 . Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences .. In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 1 . 2. Hongyuan Mei, Mohit Bansal, and Matthew R Walter. 2016. Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences.. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 1. 2."},{"key":"e_1_3_2_2_26_1","unstructured":"Piotr Mirowski Razvan Pascanu Fabio Viola Hubert Soyer Andrew J Ballard Andrea Banino Misha Denil Ross Goroshin Laurent Sifre Koray Kavukcuoglu etal 2016. Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673 (2016).  Piotr Mirowski Razvan Pascanu Fabio Viola Hubert Soyer Andrew J Ballard Andrea Banino Misha Denil Ross Goroshin Laurent Sifre Koray Kavukcuoglu et al. 2016. Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673 (2016)."},{"key":"e_1_3_2_2_27_1","volume-title":"International conference on machine learning. 1928--1937","author":"Mnih Volodymyr","year":"2016","unstructured":"Volodymyr Mnih , Adria Puigdomenech Badia , Mehdi Mirza , Alex Graves , Timothy Lillicrap , Tim Harley , David Silver , and Koray Kavukcuoglu . 2016 . Asynchronous methods for deep reinforcement learning . In International conference on machine learning. 1928--1937 . Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In International conference on machine learning. 1928--1937."},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298668"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00102"},{"key":"e_1_3_2_2_30_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems. 91--99.  Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems. 91--99."},{"key":"e_1_3_2_2_31_1","volume-title":"High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438","author":"Schulman John","year":"2015","unstructured":"John Schulman , Philipp Moritz , Sergey Levine , Michael Jordan , and Pieter Abbeel . 2015. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 ( 2015 ). John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, and Pieter Abbeel. 2015. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 (2015)."},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298655"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.94"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298664"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350924"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.58"},{"key":"e_1_3_2_2_37_1","volume-title":"Visual semantic navigation using scene priors. arXiv preprint arXiv:1810.06543","author":"Yang Wei","year":"2018","unstructured":"Wei Yang , Xiaolong Wang , Ali Farhadi , Abhinav Gupta , and Roozbeh Mottaghi . 2018. Visual semantic navigation using scene priors. arXiv preprint arXiv:1810.06543 ( 2018 ). Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, and Roozbeh Mottaghi. 2018. Visual semantic navigation using scene priors. arXiv preprint arXiv:1810.06543 (2018)."},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350935"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3351011"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.60"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989381"}],"event":{"name":"MM '20: The 28th ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Seattle WA USA","acronym":"MM '20"},"container-title":["Proceedings of the 28th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3413571","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394171.3413571","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:14Z","timestamp":1750193234000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3413571"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":41,"alternative-id":["10.1145\/3394171.3413571","10.1145\/3394171"],"URL":"https:\/\/doi.org\/10.1145\/3394171.3413571","relation":{},"subject":[],"published":{"date-parts":[[2020,10,12]]},"assertion":[{"value":"2020-10-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}