{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T16:04:21Z","timestamp":1753891461571,"version":"3.41.2"},"reference-count":39,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T00:00:00Z","timestamp":1677196800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Neurorobot."],"abstract":"<jats:p>Active object recognition (AOR) provides a paradigm where an agent can capture additional evidence by purposefully changing its viewpoint to improve the quality of recognition. One of the most concerned problems in AOR is viewpoint planning (VP) which refers to developing a policy to determine the next viewpoints of the agent. A research trend is to solve the VP problem with reinforcement learning, namely to use the viewpoint transitions explored by the agent to train the VP policy. However, most research discards the trained transitions, which may lead to an inefficient use of the explored transitions. To solve this challenge, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. To be specific, a learning framework of the VP policy is first established <jats:italic>via<\/jats:italic> the deterministic policy gradient theory, which provides an opportunity to reuse the explored transitions. Then, we design a scheme of viewpoint transition management that can store the explored transitions and decide which transitions are used for the policy learning. Finally, within the framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging dataset GERMS show the effectiveness of our method in comparison with several competing approaches.<\/jats:p>","DOI":"10.3389\/fnbot.2023.1093132","type":"journal-article","created":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T05:50:35Z","timestamp":1677217835000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Viewpoint planning with transition management for active object recognition"],"prefix":"10.3389","volume":"17","author":[{"given":"Haibo","family":"Sun","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Feng","family":"Zhu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yangyang","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pengfei","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanzi","family":"Kong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianyu","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yingcai","family":"Wan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuangfei","family":"Fu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2023,2,24]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1016\/j.cviu.2013.04.005","article-title":"50 years of object recognition: directions forward","volume":"117","author":"Andreopoulos","year":"2013","journal-title":"Comput. Vis. Image Understand"},{"key":"B2","doi-asserted-by":"crossref","first-page":"6455","DOI":"10.1109\/ICRA.2014.6907812","article-title":"\u201cAppearance-based motion strategies for object detection,\u201d","volume-title":"2014 IEEE International Conference on Robotics and Automation (ICRA)","author":"Becerra","year":"2014"},{"key":"B3","first-page":"2574","article-title":"\u201cBounding boxes, segmentations and object coordinates: how important is recognition for 3d scene flow estimation in autonomous driving scenarios,\u201d","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Behl","year":"2017"},{"key":"B4","first-page":"449","article-title":"\u201cA distributional perspective on reinforcement learning,\u201d","volume-title":"International Conference on Machine Learning","author":"Bellemare","year":"2017"},{"key":"B5","doi-asserted-by":"publisher","first-page":"2151","DOI":"10.1109\/TMI.2019.2894322","article-title":"Automatic 3D bi-ventricular segmentation of cardiac images by a shape-refined multi- task deep learning approach","volume":"38","author":"Duan","year":"2019","journal-title":"IEEE Trans"},{"key":"B6","first-page":"1587","article-title":"\u201cAddressing function approximation error in actor-critic methods,\u201d","volume-title":"International Conference on Machine Learning","author":"Fujimoto","year":"2018"},{"key":"B7","doi-asserted-by":"publisher","first-page":"2514","DOI":"10.1109\/TASE.2021.3088004","article-title":"A deep deterministic policy gradient approach for vehicle speed tracking control with a robotic driver","volume":"19","author":"Hao","year":"2021","journal-title":"IEEE Trans. Autom. Sci. Eng"},{"key":"B8","doi-asserted-by":"publisher","first-page":"489","DOI":"10.1016\/j.neucom.2005.12.126","article-title":"Extreme learning machine: theory and applications","volume":"70","author":"Huang","year":"2006","journal-title":"Neurocomputing"},{"key":"B9","doi-asserted-by":"publisher","first-page":"1601","DOI":"10.1109\/TPAMI.2018.2840991","article-title":"End-to-end policy learning for active visual categorization","volume":"41","author":"Jayaraman","year":"2019","journal-title":"IEEE T. Pattern Anal"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1412.6980","article-title":"ADAM: a method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"B11","doi-asserted-by":"publisher","first-page":"4180","DOI":"10.1109\/TPWRS.2020.2999536","article-title":"Agent-based modeling in electricity market using deep deterministic policy gradient algorithm","volume":"35","author":"Liang","year":"2020","journal-title":"IEEE Trans. Power Syst"},{"key":"B12","article-title":"Continuous control with deep reinforcement learning","author":"Lillicrap","year":"2015","journal-title":"arXiv preprint arXiv:1509.02971"},{"volume-title":"Reinforcement Learning for Robots Using Neural Networks","year":"1992","author":"Lin","key":"B13"},{"key":"B14","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1007\/s12293-017-0229-2","article-title":"Active object recognition using hierarchical local-receptive-field-based extreme learning machine","volume":"10","author":"Liu","year":"","journal-title":"Memet. Comput"},{"key":"B15","doi-asserted-by":"publisher","first-page":"2253","DOI":"10.1109\/TNNLS.2017.2785233","article-title":"Extreme trust region policy optimization for active object recognition","volume":"29","author":"Liu","year":"","journal-title":"IEEE Trans. Neural Network Learn. Syst"},{"key":"B16","doi-asserted-by":"crossref","first-page":"4276","DOI":"10.1109\/IROS.2017.8206290","article-title":"\u201cBelief tree search for active object recognition,\u201d","volume-title":"2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Malmir","year":"2017"},{"key":"B17","doi-asserted-by":"publisher","first-page":"161.1","DOI":"10.5244\/C.29.161","article-title":"\u201cDeep q-learning for active recognition of germs: Baseline performance on a standardized dataset for active learning,\u201d","author":"Malmir","year":"2015","journal-title":"Proceedings of the British Machine Vision Conference (BMVC)"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1312.5602","article-title":"Playing atari with deep reinforcement learning","author":"Mnih","year":"2013","journal-title":"arXiv preprint"},{"key":"B19","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"B20","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1016\/S0921-8890(99)00079-2","article-title":"Active object recognition by view integration and reinforcement learning","volume":"31","author":"Paletta","year":"2000","journal-title":"Robot. Auton. Syst"},{"key":"B21","doi-asserted-by":"publisher","first-page":"651432","DOI":"10.3389\/fnbot.2021.651432","article-title":"Generative models for active vision","volume":"15","author":"Parr","year":"2021","journal-title":"Front. Neurorobot"},{"key":"B22","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1109\/LRA.2015.2506901","article-title":"Viewpoint evaluation for online 3-d active object classification","volume":"1","author":"Patten","year":"2015","journal-title":"IEEE Robot. Autom. Lett"},{"key":"B23","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2210.07810","article-title":"A consistent and differentiable lp canonical calibration error estimator","author":"Popordanoska","year":"2022","journal-title":"arXiv preprint"},{"key":"B24","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1016\/j.robot.2016.06.013","article-title":"Active multi-view object recognition: a unifying view on online feature selection and view planning","volume":"84","author":"Potthast","year":"2016","journal-title":"Robot Auton. Syst"},{"key":"B25","first-page":"2027","article-title":"\u201cParis-lille-3d: a point cloud dataset for urban scene segmentation and classification,\u201d","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops(CVPR)","author":"Roynard","year":"2018"},{"key":"B26","article-title":"\u201cPrioritized experience replay,\u201d","author":"Schaul","year":"2016","journal-title":"Proceedings of the International Conference on Learning Representations (ICLR)"},{"key":"B27","first-page":"1889","article-title":"\u201cTrust region policy optimization,\u201d","volume-title":"International Conference on Machine Learning","author":"Schulman","year":"2015"},{"key":"B28","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1016\/j.neucom.2020.03.063","article-title":"Adaptive neuro-fuzzy pid controller based on twin delayed deep deterministic policy gradient algorithm","volume":"402","author":"Shi","year":"2020","journal-title":"Neurocomputing"},{"key":"B29","first-page":"387","article-title":"\u201cDeterministic policy gradient algorithms,\u201d","volume-title":"International Conference on Machine Learning","author":"Silver","year":"2014"},{"key":"B30","doi-asserted-by":"crossref","first-page":"5307","DOI":"10.1109\/IROS.2018.8593741","article-title":"\u201cClassification of hanging garments using learned features extracted from 3d point clouds,\u201d","volume-title":"2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Stria","year":"2018"},{"key":"B31","doi-asserted-by":"publisher","first-page":"107360","DOI":"10.1016\/j.oceaneng.2020.107360","article-title":"Auv path following controlled by modified deep deterministic policy gradient","volume":"210","author":"Sun","year":"2020","journal-title":"Ocean Eng"},{"volume-title":"Reinforcement Learning: An Introduction","year":"2018","author":"Sutton","key":"B32"},{"key":"B33","doi-asserted-by":"publisher","first-page":"840658","DOI":"10.3389\/fnbot.2022.840658","article-title":"Embodied object representation learning and recognition","volume":"16","author":"Van de Maele","year":"2022","journal-title":"Front. Neurorobot"},{"key":"B34","doi-asserted-by":"publisher","first-page":"3713","DOI":"10.1109\/TSMC.2018.2884725","article-title":"Deterministic policy gradient with integral compensator for robust quadrotor control","volume":"50","author":"Wang","year":"2020","journal-title":"IEEE Trans. Syst. Man Cybernet. Syst"},{"key":"B35","doi-asserted-by":"publisher","first-page":"496","DOI":"10.3390\/machines10070496","article-title":"A state-compensated deep deterministic policy gradient algorithm for uav trajectory tracking","volume":"10","author":"Wu","year":"2022","journal-title":"Machines"},{"key":"B36","doi-asserted-by":"crossref","first-page":"4230","DOI":"10.1109\/ICRA.2015.7139782","article-title":"\u201cActive recognition and pose estimation of household objects in clutter,\u201d","volume-title":"2015 IEEE International Conference on Robotics and Automation (ICRA)","author":"Wu","year":"2015"},{"key":"B37","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1007\/s41095-020-0179-3","article-title":"View planning in robot active vision: a survey of systems, algorithms, and applications","volume":"6","author":"Zeng","year":"2020","journal-title":"Comput. Vis. Media"},{"key":"B38","doi-asserted-by":"publisher","first-page":"1175","DOI":"10.1109\/TWC.2020.3031436","article-title":"Energy-efficient mode selection and resource allocation for d2d-enabled heterogeneous networks: a deep reinforcement learning approach","volume":"20","author":"Zhang","year":"2020","journal-title":"IEEE T. Wirel. Commun"},{"key":"B39","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1109\/TCDS.2016.2614675","article-title":"Deep reinforcement learning with visual attention for vehicle classification","volume":"9","author":"Zhao","year":"2016","journal-title":"IEEE Tran. Cogn. Dev. Syst"}],"container-title":["Frontiers in Neurorobotics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2023.1093132\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,24]],"date-time":"2023-02-24T05:50:45Z","timestamp":1677217845000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2023.1093132\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,24]]},"references-count":39,"alternative-id":["10.3389\/fnbot.2023.1093132"],"URL":"https:\/\/doi.org\/10.3389\/fnbot.2023.1093132","relation":{},"ISSN":["1662-5218"],"issn-type":[{"type":"electronic","value":"1662-5218"}],"subject":[],"published":{"date-parts":[[2023,2,24]]},"article-number":"1093132"}}