{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T11:34:52Z","timestamp":1780486492609,"version":"3.54.1"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,7,26]],"date-time":"2023-07-26T00:00:00Z","timestamp":1690329600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2022YFB3303400"],"award-info":[{"award-number":["2022YFB3303400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62025207"],"award-info":[{"award-number":["62025207"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"GD Natural Science Foundation","award":["2021B1515020085"],"award-info":[{"award-number":["2021B1515020085"]}]},{"name":"Shenzhen Science and Technology Program","award":["RCYX20210609103121030"],"award-info":[{"award-number":["RCYX20210609103121030"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:p>\n            Autoscanning of an unknown environment is the key to many AR\/VR and robotic applications. However, autonomous reconstruction with both high efficiency and quality remains a challenging problem. 
In this work, we propose a reconstruction-oriented autoscanning approach, called ScanBot, which utilizes hierarchical deep reinforcement learning techniques for global\n            <jats:italic>region-of-interest<\/jats:italic>\n            (ROI) planning to improve the scanning efficiency and local\n            <jats:italic>next-best-view<\/jats:italic>\n            (NBV) planning to enhance the reconstruction quality. Given the partially reconstructed scene, the global policy designates an ROI with insufficient exploration or reconstruction. The local policy is then applied to refine the reconstruction quality of objects in this region by planning and scanning a series of NBVs. A novel mixed 2D-3D representation is designed for these policies, where a 2D quality map with tailored quality channels encoding the scanning progress is consumed by the global policy, and a coarse-to-fine 3D volumetric representation that embodies both local environment and object completeness is fed to the local policy. These two policies iterate until the whole scene has been completely explored and scanned. To speed up the learning of complex environmental dynamics and enhance the agent's memory for spatial-temporal inference, we further introduce two novel auxiliary learning tasks to guide the training of our global policy. Thorough evaluations and comparisons are carried out to show the feasibility of our proposed approach and its advantages over previous methods. 
Code and data are available at https:\/\/github.com\/HezhiCao\/Scanbot.\n          <\/jats:p>","DOI":"10.1145\/3592113","type":"journal-article","created":{"date-parts":[[2023,7,26]],"date-time":"2023-07-26T15:47:45Z","timestamp":1690386465000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["ScanBot: Autonomous Reconstruction via Deep Reinforcement Learning"],"prefix":"10.1145","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4760-0743","authenticated-orcid":false,"given":"Hezhi","family":"Cao","sequence":"first","affiliation":[{"name":"University of Science and Technology of China, HeFei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3396-9243","authenticated-orcid":false,"given":"Xi","family":"Xia","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, HeFei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-6495-1845","authenticated-orcid":false,"given":"Guan","family":"Wu","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, HeFei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6798-0336","authenticated-orcid":false,"given":"Ruizhen","family":"Hu","sequence":"additional","affiliation":[{"name":"Shenzhen University, ShenZhen, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4352-1431","authenticated-orcid":false,"given":"Ligang","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, HeFei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,7,26]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"Ilge Akkaya Marcin Andrychowicz Maciek Chociej Mateusz 
Litwin Bob McGrew Arthur Petron Alex Paino Matthias Plappert Glenn Powell Raphael Ribas et al. 2019. Solving rubik's cube with a robot hand. arXiv preprint arXiv:1910.07113 (2019)."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10916"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/IRDS.2002.1041446"},{"key":"e_1_2_2_4_1","volume-title":"Matterport3d: Learning from rgb-d data in indoor environments. arXiv preprint arXiv:1709.06158","author":"Chang Angel","year":"2017","unstructured":"Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. 2017. Matterport3d: Learning from rgb-d data in indoor environments. arXiv preprint arXiv:1709.06158 (2017)."},{"key":"e_1_2_2_5_1","volume-title":"Neural Topological SLAM for Visual Navigation. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 12872--12881","author":"Chaplot Devendra","year":"2020","unstructured":"Devendra Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, and Saurabh Gupta. 2020c. Neural Topological SLAM for Visual Navigation. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 12872--12881."},{"key":"e_1_2_2_6_1","volume-title":"SEAL: Self-supervised Embodied Active Learning. In Advances in Neural Information Processing Systems.","author":"Chaplot Devendra Singh","year":"2021","unstructured":"Devendra Singh Chaplot, Murtaza Dalal, Saurabh Gupta, Jitendra Malik, and Ruslan Salakhutdinov. 2021. SEAL: Self-supervised Embodied Active Learning. In Advances in Neural Information Processing Systems."},{"key":"e_1_2_2_7_1","volume-title":"Learning To Explore Using Active Neural SLAM. In International Conference on Learning Representations (ICLR).","author":"Chaplot Devendra Singh","year":"2020","unstructured":"Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, and Ruslan Salakhutdinov. 2020b. 
Learning To Explore Using Active Neural SLAM. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_2_2_8_1","volume-title":"Abhinav Gupta, and Russ R Salakhutdinov.","author":"Chaplot Devendra Singh","year":"2020","unstructured":"Devendra Singh Chaplot, Dhiraj Prakashchand Gandhi, Abhinav Gupta, and Russ R Salakhutdinov. 2020a. Object goal navigation using goal-oriented semantic exploration. Advances in Neural Information Processing Systems 33 (2020)."},{"key":"e_1_2_2_9_1","volume-title":"Robotics: Science and Systems","volume":"11","author":"Charrow Benjamin","year":"2015","unstructured":"Benjamin Charrow, Gregory Kahn, Sachin Patil, Sikang Liu, Ken Goldberg, Pieter Abbeel, Nathan Michael, and Vijay Kumar. 2015. Information-Theoretic Planning with Trajectory Optimization for Dense 3D Mapping.. In Robotics: Science and Systems, Vol. 11. Rome, 3--12."},{"key":"e_1_2_2_10_1","volume-title":"Learning exploration policies for navigation. arXiv preprint arXiv:1903.01959","author":"Chen Tao","year":"2019","unstructured":"Tao Chen, Saurabh Gupta, and Abhinav Gupta. 2019. Learning exploration policies for navigation. arXiv preprint arXiv:1903.01959 (2019)."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3054739"},{"key":"e_1_2_2_12_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3306346.3322942","article-title":"Multi-Robot Collaborative Dense Scene Reconstruction","volume":"38","author":"Dong Siyan","year":"2019","unstructured":"Siyan Dong, Kai Xu, Qiang Zhou, Andrea Tagliasacchi, Shiqing Xin, Matthias Nie\u00dfner, and Baoquan Chen. 2019. Multi-Robot Collaborative Dense Scene Reconstruction. ACM Transactions on Graphics 38, 4 (2019), 1--16.","journal-title":"ACM Transactions on Graphics"},{"key":"e_1_2_2_13_1","volume-title":"Path planning and trajectory planning algorithms: A general overview. 
Motion and operation planning of robotic systems","author":"Gasparetto Alessandro","year":"2015","unstructured":"Alessandro Gasparetto, Paolo Boscariol, Albano Lanzutti, and Renato Vidoni. 2015. Path planning and trajectory planning algorithms: A general overview. Motion and operation planning of robotic systems (2015), 3--27."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989385"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.769"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417764"},{"key":"e_1_2_2_18_1","volume-title":"Tom Schaul, Joel Z Leibo, David Silver, and Koray Kavukcuoglu.","author":"Jaderberg Max","year":"2016","unstructured":"Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z Leibo, David Silver, and Koray Kavukcuoglu. 2016. Reinforcement learning with unsupervised auxiliary tasks. arXiv preprint arXiv:1611.05397 (2016)."},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS40897.2019.8967913"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1609\/aiide.v15i1.5222"},{"key":"e_1_2_2_21_1","unstructured":"Sven Koenig and Maxim Likhachev. 2002. D* lite. Aaai\/iaai 15 (2002)."},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2011.5980429"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6385624"},{"key":"e_1_2_2_24_1","volume-title":"Conference on Robot Learning. PMLR, 603--616","author":"Li Chengshu","year":"2020","unstructured":"Chengshu Li, Fei Xia, Roberto Martin-Martin, and Silvio Savarese. 2020. Hrl4in: Hierarchical reinforcement learning for interactive navigation with mobile manipulators. In Conference on Robot Learning. 
PMLR, 603--616."},{"key":"e_1_2_2_25_1","first-page":"1","article-title":"Object-aware guidance for autonomous scene reconstruction","volume":"37","author":"Liu Ligang","year":"2018","unstructured":"Ligang Liu, Xi Xia, Han Sun, Qi Shen, Juzhan Xu, Bin Chen, Hui Huang, and Kai Xu. 2018. Object-aware guidance for autonomous scene reconstruction. ACM Transactions on Graphics (TOG) 37, 4 (2018), 1--12.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_26_1","volume-title":"Conference on Robot Learning. PMLR, 734--743","author":"Matas Jan","year":"2018","unstructured":"Jan Matas, Stephen James, and Andrew J Davison. 2018. Sim-to-real reinforcement learning for deformable object manipulation. In Conference on Robot Learning. PMLR, 734--743."},{"key":"e_1_2_2_27_1","unstructured":"Piotr Mirowski Razvan Pascanu Fabio Viola Hubert Soyer Andrew J Ballard Andrea Banino Misha Denil Ross Goroshin Laurent Sifre Koray Kavukcuoglu et al. 2016. Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673 (2016)."},{"key":"e_1_2_2_28_1","volume-title":"Multi-agent manipulation via locomotion using hierarchical sim2real. arXiv preprint arXiv:1908.05224","author":"Nachum Ofir","year":"2019","unstructured":"Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, and Vikash Kumar. 2019. Multi-agent manipulation via locomotion using hierarchical sim2real. 
arXiv preprint arXiv:1908.05224 (2019)."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISMAR.2011.6092378"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508363.2508374"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2017.70"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275014"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460528"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00943"},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"David Silver Thomas Hubert Julian Schrittwieser Ioannis Antonoglou Matthew Lai Arthur Guez Marc Lanctot Laurent Sifre Dharshan Kumaran Thore Graepel et al. 2018. A general reinforcement learning algorithm that masters chess shogi and Go through self-play. Science 362 6419 (2018) 1140--1144.","DOI":"10.1126\/science.aar6404"},{"key":"e_1_2_2_36_1","volume-title":"Robotics: Science and systems","author":"Stachniss Cyrill","unstructured":"Cyrill Stachniss, Giorgio Grisetti, and Wolfram Burgard. 2005. Information Gain-based Exploration Using Rao-Blackwellized Particle Filters.. In Robotics: Science and systems, Vol. 2. 65--72."},{"key":"e_1_2_2_37_1","volume-title":"Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence 112, 1--2","author":"Sutton Richard S","year":"1999","unstructured":"Richard S Sutton, Doina Precup, and Satinder Singh. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence 112, 1--2 (1999), 181--211."},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202319"},{"key":"e_1_2_2_39_1","volume-title":"International Conference on Machine Learning. 
PMLR, 3540--3549","author":"Vezhnevets Alexander Sasha","year":"2017","unstructured":"Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, and Koray Kavukcuoglu. 2017. Feudal networks for hierarchical reinforcement learning. In International Conference on Machine Learning. PMLR, 3540--3549."},{"key":"e_1_2_2_40_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3414685.3417788","article-title":"Scene mover: Automatic move planning for scene arrangement by deep reinforcement learning","volume":"39","author":"Wang Hanqing","year":"2020","unstructured":"Hanqing Wang, Wei Liang, and Lap-Fai Yu. 2020. Scene mover: Automatic move planning for scene arrangement by deep reinforcement learning. ACM Transactions on Graphics (TOG) 39, 6 (2020), 1--15.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364916669237"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661242"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00945"},{"key":"e_1_2_2_44_1","first-page":"1","article-title":"Autoscanning for coupled scene reconstruction and proactive object analysis","volume":"34","author":"Xu Kai","year":"2015","unstructured":"Kai Xu, Hui Huang, Yifei Shi, Hao Li, Pinxin Long, Jianong Caichen, Wei Sun, and Baoquan Chen. 2015. Autoscanning for coupled scene reconstruction and proactive object analysis. ACM Transactions on Graphics (TOG) 34, 6 (2015), 1--14.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_45_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2980179.2980224","article-title":"3D attention-driven depth acquisition for object identification","volume":"35","author":"Xu Kai","year":"2016","unstructured":"Kai Xu, Yifei Shi, Lintao Zheng, Junyu Zhang, Min Liu, Hui Huang, Hao Su, Daniel Cohen-Or, and Baoquan Chen. 2016. 
3D attention-driven depth acquisition for object identification. ACM Transactions on Graphics (TOG) 35, 6 (2016), 1--14.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_46_1","first-page":"1","article-title":"Autonomous reconstruction of unknown indoor scenes guided by time-varying tensor fields","volume":"36","author":"Xu Kai","year":"2017","unstructured":"Kai Xu, Lintao Zheng, Zihao Yan, Guohang Yan, Eugene Zhang, Matthias Niessner, Oliver Deussen, Daniel Cohen-Or, and Hui Huang. 2017. Autonomous reconstruction of unknown indoor scenes guided by time-varying tensor fields. ACM Transactions on Graphics (TOG) 36, 6 (2017), 1--15.","journal-title":"ACM Transactions on Graphics (TOG)"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CIRA.1997.613851"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01581"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01442"},{"key":"e_1_2_2_50_1","volume-title":"Computer Graphics Forum","author":"Zheng Lintao","unstructured":"Lintao Zheng, Chenyang Zhu, Jiazhao Zhang, Hang Zhao, Hui Huang, Matthias Niessner, and Kai Xu. 2019. Active scene understanding via online semantic reconstruction. In Computer Graphics Forum, Vol. 38. 
Wiley Online Library, 103--114."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3592113","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3592113","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:45Z","timestamp":1750178265000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3592113"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,26]]},"references-count":50,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,8]]}},"alternative-id":["10.1145\/3592113"],"URL":"https:\/\/doi.org\/10.1145\/3592113","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,26]]},"assertion":[{"value":"2023-07-26","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}