{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:17:33Z","timestamp":1766067453525,"version":"build-2065373602"},"reference-count":32,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,1,11]],"date-time":"2022-01-11T00:00:00Z","timestamp":1641859200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Postgraduate Practice Innovation Program of Jiangsu Province","award":["SJCX20_0932"],"award-info":[{"award-number":["SJCX20_0932"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],
"abstract":"<jats:p>To solve the problems of poor exploration ability and convergence speed of traditional deep reinforcement learning in the navigation task of the patrol robot under indoor specified routes, an improved deep reinforcement learning algorithm based on Pan\/Tilt\/Zoom(PTZ) image information was proposed in this paper. The obtained symmetric image information and target position information are taken as the input of the network, the speed of the robot is taken as the output of the next action, and the circular route with boundary is taken as the test. The improved reward and punishment function is designed to improve the convergence speed of the algorithm and optimize the path so that the robot can plan a safer path while avoiding obstacles first. Compared with Deep Q Network(DQN) algorithm, the convergence speed after improvement is shortened by about 40%, and the loss function is more stable.<\/jats:p>",
"DOI":"10.3390\/sym14010132","type":"journal-article","created":{"date-parts":[[2022,1,11]],"date-time":"2022-01-11T20:33:04Z","timestamp":1641933184000},"page":"132","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Improved Path Planning for Indoor Patrol Robot Based on Deep Reinforcement Learning"],"prefix":"10.3390","volume":"14",
"author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9200-863X","authenticated-orcid":false,"given":"Jianfeng","family":"Zheng","sequence":"first","affiliation":[{"name":"School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China"}]},{"given":"Shuren","family":"Mao","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China"}]},{"given":"Zhenyu","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China"}]},{"given":"Pengcheng","family":"Kong","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6746-4251","authenticated-orcid":false,"given":"Hao","family":"Qiang","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,11]]},
"reference":[{"key":"ref_1","unstructured":"Sun, Y., Wang, J., and Duan, X. (2013, January 20\u201322). Research on Path Planning Algorithm of Indoor Mobile Robot. Proceedings of the 2013 International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), Shenyang, China."},
{"key":"ref_2","unstructured":"Wang, C., Zhu, D., Li, T., Meng, M.Q.H., and Silva, C.D. (2018). SRM: An Efficient Framework for Autonomous Robotic Exploration in Indoor Environments. arXiv."},
{"key":"ref_3","doi-asserted-by":"crossref","first-page":"012047","DOI":"10.1088\/1742-6596\/1898\/1\/012047","article-title":"Application of A-Star Algorithm on Pathfinding Game","volume":"1898","author":"Candra","year":"2021","journal-title":"J. Phys. Conf. Ser."},
{"key":"ref_4","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/s13638-019-1396-2","article-title":"Obstacle avoidance of mobile robots using modified artificial potential field algorithm","volume":"2019","author":"Rostami","year":"2019","journal-title":"EURASIP J. Wirel. Commun. Netw."},
{"key":"ref_5","doi-asserted-by":"crossref","first-page":"8223","DOI":"10.1007\/s13369-021-05443-8","article-title":"A Predictive Path Planning Algorithm for Mobile Robot in Dynamic Environments Based on Rapidly Exploring Random Tree","volume":"46","author":"Zhang","year":"2021","journal-title":"Arab. J. Sci. Eng."},
{"key":"ref_6","unstructured":"Lynnerup, N.A., Nolling, L., Hasle, R., and Hallam, J. (November, January 30). A Survey on Reproducibility by Evaluating Deep Reinforcement Learning Algorithms on Real-World Robots. Proceedings of the Conference on Robot Learning: CoRL 2019, Osaka, Japan."},
{"key":"ref_7","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/s41315-020-00135-2","article-title":"A sample efficient model-based deep reinforcement learning algorithm with experience replay for robot manipulation","volume":"4","author":"Zhang","year":"2020","journal-title":"Int. J. Intell. Robot. Appl."},
{"key":"ref_8","first-page":"1","article-title":"Deep Reinforcement Learning Algorithms for Multiple Arc-Welding Robots","volume":"2","author":"Chen","year":"2021","journal-title":"Front. Control Eng."},
{"key":"ref_9","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},
{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Tai, L., Li, S., and Liu, M. (2016, January 9\u201314). A Deep-Network Solution towards Model-Less Obstacle Avoidance. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759428"},
{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yu, X., Wang, P., and Zhang, Z. (2021). Learning-Based End-to-End Path Planning for Lunar Rovers with Safety Constraints. Sensors, 21.","DOI":"10.3390\/s21030796"},
{"key":"ref_12","doi-asserted-by":"crossref","first-page":"012064","DOI":"10.1088\/1742-6596\/1955\/1\/012064","article-title":"Research on multi feature fusion perception technology of mine fire based on inspection robot","volume":"1955","author":"Miao","year":"2021","journal-title":"J. Phys. Conf. Ser."},
{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shi, X., Lu, J., Liu, F., and Zhou, J. (2014, January 26\u201328). Patrol Robot Navigation Control Based on Memory Algorithm. Proceedings of the 2014 4th IEEE International Conference on Information Science and Technology, Shenzhen, China.","DOI":"10.1109\/ICIST.2014.6920362"},
{"key":"ref_14","doi-asserted-by":"crossref","first-page":"012002","DOI":"10.1088\/1755-1315\/582\/1\/012002","article-title":"A Deep Learning and Depth Image based Obstacle Detection and Distance Measurement Method for Substation Patrol Robot","volume":"582","author":"Xu","year":"2020","journal-title":"IOP Conf. Ser. Earth Environ. Sci."},
{"key":"ref_15","doi-asserted-by":"crossref","first-page":"052035","DOI":"10.1088\/1755-1315\/546\/5\/052035","article-title":"Research on Indoor Patrol Robot Location based on BP Neural Network","volume":"546","author":"Dong","year":"2020","journal-title":"IOP Conf. Ser. Earth Environ. Sci."},
{"key":"ref_16","unstructured":"Van Nguyen, T.T., Phung, M.D., Pham, D.T., and Tran, Q.V. (2020). Development of a Fuzzy-based Patrol Robot Using in Building Automation System. arXiv."},
{"key":"ref_17","unstructured":"Ji, J., Xing, F., and Li, Y. (2019, January 6\u20137). Research on Navigation System of Patrol Robot Based on Multi-Sensor Fusion. Proceedings of the 2019 8th International Conference on Advanced Materials and Computer Science(ICAMCS 2019), Chongqing, China."},
{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Xia, L., Meng, Q., Chi, D., Meng, B., and Yang, H. (2019). An Optimized Tightly-Coupled VIO Design on the Basis of the Fused Point and Line Features for Patrol Robot Navigation. Sensors, 19.","DOI":"10.3390\/s19092004"},
{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.15837\/ijccc.2021.2.4115","article-title":"Extract Executable Action Sequences from Natural Language Instructions Based on DQN for Medical Service Robots","volume":"16","author":"Zhao","year":"2021","journal-title":"Int. J. Comput. Commun. Control"},
{"key":"ref_20","first-page":"2269","article-title":"DQN Reinforcement Learning: The Robot\u2019s Optimum Path Navigation in Dynamic Environments for Smart Factory","volume":"44","author":"Seok","year":"2019","journal-title":"J. Korean Inst. Commun. Inf. Sci."},
{"key":"ref_21","doi-asserted-by":"crossref","first-page":"840","DOI":"10.20965\/jaciii.2017.p0840","article-title":"Experimental Study on Behavior Acquisition of Mobile Robot by Deep Q-Network","volume":"21","author":"Sasaki","year":"2017","journal-title":"J. Adv. Comput. Intell. Intell. Inform."},
{"key":"ref_22","first-page":"220","article-title":"Walking Stability Control Method for Biped Robot on Uneven Ground Based on Deep Q-Network","volume":"28","author":"Han","year":"2019","journal-title":"J. Beijing Inst. Technol."},
{"key":"ref_23","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1186\/s40638-018-0091-9","article-title":"Implementation of Q learning and deep Q network for controlling a self balancing robot model","volume":"5","author":"Rahman","year":"2018","journal-title":"Robot. Biomim."},
{"key":"ref_24","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1007\/s10846-021-01333-1","article-title":"Deep Reinforcement Learning for a Humanoid Robot Soccer Player","volume":"102","author":"Perico","year":"2021","journal-title":"J. Intell. Robot. Syst."},
{"key":"ref_25","first-page":"825","article-title":"Enhanced Autonomous Navigation of Robots by Deep Reinforcement Learning Algorithm with Multistep Method","volume":"33","author":"Peng","year":"2021","journal-title":"Sens. Mater."},
{"key":"ref_26","doi-asserted-by":"crossref","first-page":"6678","DOI":"10.1109\/LRA.2020.3013906","article-title":"AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning","volume":"5","author":"Tallamraju","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},
{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Abanay, A., Masmoudi, L., Elharif, A., Gharbi, M., and Bououlid, B. (2017, January 14\u201316). Design and Development of a Mobile Platform for an Agricultural Robot Prototype. Proceedings of the 2nd International Conference on Computing and Wireless Communication Systems, Larache, Morocco.","DOI":"10.1145\/3167486.3167527"},
{"key":"ref_28","first-page":"100","article-title":"A method for path planning strategy and navigation of service robot","volume":"2","author":"Budiharto","year":"2011","journal-title":"Paladyn"},
{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Arvin, F., Samsudin, K., and Nasseri, M.A. (2009, January 25\u201326). Design of a Differential-Drive Wheeled Robot Controller with Pulse-Width Modulation. Proceedings of the 2009 Innovative Technologies in Intelligent Systems and Industrial Applications, Kuala Lumpur, Malaysia.","DOI":"10.1109\/CITISIA.2009.5224223"},
{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Bethencourt, J.V.M., Ling, Q., and Fern\u00e1ndez, A.V. (2011, January 23\u201325). Controller Design and Implementation for a Differential Drive Wheeled Mobile Robot. Proceedings of the 2011 Chinese Control and Decision Conference (CCDC), Mianyang, China.","DOI":"10.1109\/CCDC.2011.5968930"},
{"key":"ref_31","unstructured":"Zeng, D., Xu, G., Zhong, J., and Li, L. (2007, January 18\u201321). Development of a Mobile Platform for Security Robot. Proceedings of the 2007 IEEE International Conference on Automation and Logistics, Jinan, China."},
{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Sharma, M., Sharma, R., Ahuja, K., and Jha, S. (2014, January 6\u20138). Design of an Intelligent Security Robot for Collision Free Navigation Applications. Proceedings of the 2014 International Conference on Reliability Optimization and Information Technology (ICROIT), Faridabad, India.","DOI":"10.1109\/ICROIT.2014.6798324"}],
"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/1\/132\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T14:28:18Z","timestamp":1760365698000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/1\/132"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,11]]},"references-count":32,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["sym14010132"],"URL":"https:\/\/doi.org\/10.3390\/sym14010132","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2022,1,11]]}}}