{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T08:20:53Z","timestamp":1781252453907,"version":"3.54.1"},"reference-count":35,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T00:00:00Z","timestamp":1757635200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Shanxi Provincial Postgraduate Scientific Research Innovation Project","award":["2024SJ247"],"award-info":[{"award-number":["2024SJ247"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Efficient path planning is key to the safe and autonomous navigation of ground robots, yet existing path planning and obstacle avoidance technologies are inadequate in complex, asymmetrically distributed land environments. To address this, an advanced double Q-learning algorithm based on self-supervised prediction and curiosity-driven exploration is proposed. The algorithm reduces the risk of overestimation and bootstrapping by adjusting the calculation method of the target Q value and optimizing the network structure. In addition, prioritized experience replay is introduced to assign priorities to the data in the experience pool, thereby increasing the probability that better data is sampled. Experience pool data that has been used for training fewer times can thus be used more effectively. A curiosity network is added to the original neural network so that each state receives an overall reward when diverse actions are performed. This method enhances the exploration of unmanned ground mobile robots and allows them to independently select the shortest path to the endpoint. 
In complex environments, the results of the proposed algorithm are 18.07%, 7.91%, and 5.56% lower than those of the Sparrow Search Algorithm, Dung Beetle Optimization Algorithm, and Particle Swarm Optimization Algorithm, respectively. It can therefore better cope with the challenges of complex environments and addresses the problem of the algorithm failing to converge in them.<\/jats:p>","DOI":"10.3390\/sym17091530","type":"journal-article","created":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T13:43:02Z","timestamp":1757684582000},"page":"1530","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Application of an Improved Double Q-Learning Algorithm in Ground Mobile Robots"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-6883-6976","authenticated-orcid":false,"given":"Jinchao","family":"Zhao","sequence":"first","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ya","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nan","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xinye","family":"Han","sequence":"additional","affiliation":[{"name":"Faculty of Science and Engineering, BEng Electrical & Electronic Engineering, University of Liverpool, Brownlow Hill, Liverpool L69 7ZX, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Luoyin","family":"Ning","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 
030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaowei","family":"Ren","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lingling","family":"Fang","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jiaxuan","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xu","family":"Ren","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yu","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jinghao","family":"Feng","sequence":"additional","affiliation":[{"name":"College of Mechatronics Engineering, North University of China, Taiyuan 030051, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Wang, H., Liu, J., Dong, H., and Shao, Z. (2025). A Survey of the Multi-Sensor Fusion Object Detection Task in Autonomous Driving. Sensors, 25.","DOI":"10.3390\/s25092794"},{"key":"ref_2","unstructured":"Carberry, S. (2024). Land Mass: Ground Robot Swarm Tech Crawling Along. Natl. Def., 109."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ou, Y., Cai, Y., Sun, Y., and Qin, T. (2024). Correction: Ou et al. 
Autonomous Navigation by Mobile Robot with Sensor Fusion Based on Deep Reinforcement Learning. Sensors, 24.","DOI":"10.3390\/s24123895"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yan, W., Xu, X., Rodi\u0107, A., and Petrovich, P.B. (2025). FRRT*-Connect: A Bidirectional Sampling-Based Path Planner with Potential Field Guidance for Complex Obstacle Environments. Sensors, 25.","DOI":"10.3390\/s25092761"},{"key":"ref_5","unstructured":"Wolf, Y., Levy, E., and Rotbart, M. (2025). Ground Robot Drive System: US15450445. (US20170174278A1)."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"124575","DOI":"10.1016\/j.apenergy.2024.124575","article-title":"A dynamic migration route planning optimization strategy based on real-time energy state observation considering flexibility and energy efficiency of thermal power unit","volume":"377","author":"Hong","year":"2025","journal-title":"Appl. Energy"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1007\/s11768-023-00162-x","article-title":"System identification and control of the ground operation mode of a hybrid aerial\u2013ground robot","volume":"21","author":"Cao","year":"2023","journal-title":"Control Theory Technol."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Parkhomenko, V., and Medvedev, M. (2021, January 3\u20135). Nerual Network System For Ground Robot Path Planning and Obstacle Avoidance. Proceedings of the 2021 7th International Conference on Mechatronics and Robotics Engineering (ICMRE), Budapest, Hungary.","DOI":"10.1109\/ICMRE51691.2021.9384820"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Tavares, A.J.A., and Oliveira, N.M.F. (2024). A Novel Approach for Kalman Filter Tuning for Direct and Indirect Inertial Navigation System\/Global Navigation Satellite System Integration. Sensors, 24.","DOI":"10.3390\/s24227331"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Bansal, T., and Anand, S. (2024). 
Probabilistic Roadmap Generation for Autonomous Robot Path Planning in Dynamic Environments. International Conference on Mechanical and Energy Technologies, Springer.","DOI":"10.1007\/978-981-97-2716-2_41"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"100","DOI":"10.62520\/fujece.1501508","article-title":"Optimizing Unmanned Vehicle Navigation: A Hybrid PSO-GWO Algorithm for Efficient Route Planning","volume":"4","author":"Altun","year":"2025","journal-title":"Firat Univ. J. Exp. Comput. Eng. FUJECE"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"130552","DOI":"10.1016\/j.physa.2025.130552","article-title":"A two-level framework for dynamic route planning and trajectory optimization of connected and automated vehicles in road networks","volume":"668","author":"Xue","year":"2025","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"14401","DOI":"10.1109\/TVT.2020.3034628","article-title":"Optimal Time-Consuming Path Planning for Autonomous Underwater Vehicles Based on a Dynamic Neural Network Model in Ocean Current Environments","volume":"69","author":"Chen","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"AbuJabal, N., Rabie, T., Baziyad, M., Kamel, I., and Almazrouei, K. (2024). Path Planning Techniques for Real-Time Multi-Robot Systems: A Systematic Review. Electronics, 13.","DOI":"10.3390\/electronics13122239"},{"key":"ref_15","first-page":"31","article-title":"Path planning of a building robot based on BIM and an improved RRT algorithm","volume":"41","author":"Yang","year":"2024","journal-title":"Exp. Technol. Manag."},{"key":"ref_16","first-page":"1431","article-title":"Optimization of route planning for the mobile robot using a hybrid Neuro-IWO technique","volume":"17","author":"Sahoo","year":"2025","journal-title":"Int. J. Inf. 
Technol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"112386","DOI":"10.1016\/j.asoc.2024.112386","article-title":"Improving policy training for autonomous driving through randomized ensembled double Q-learning with Transformer encoder feature evaluation","volume":"167","author":"Fan","year":"2024","journal-title":"Appl. Soft Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"121249","DOI":"10.1016\/j.oceaneng.2025.121249","article-title":"Optimization of route planning based on active towed array sonar for underwater search and rescue","volume":"330","author":"Liao","year":"2025","journal-title":"Ocean. Eng."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Moras, J. (2024). Continuous Online Semantic Implicit Representation for Autonomous Ground Robot Navigation in Unstructured Environments. Robotics, 13.","DOI":"10.3390\/robotics13070108"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2074","DOI":"10.1111\/mice.13176","article-title":"Unmanned aerial vehicle\u2013human collaboration route planning for intelligent infrastructure inspection","volume":"39","author":"Pan","year":"2024","journal-title":"Comput.-Aided Civ. Infrastruct. Eng."},{"key":"ref_21","unstructured":"Tai, L. (2019). Sensorimotor Learning for Ground Robot Navigation. [Ph.D. Thesis, The Hong Kong University of Science and Technology]."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1007\/s10015-023-00909-4","article-title":"Deep-reinforcement learning-based route planning with obstacle avoidance for autonomous vessels","volume":"29","author":"Saga","year":"2024","journal-title":"Artif. Life Robot."},{"key":"ref_23","first-page":"2769","article-title":"Improved Double Deep Q Network Algorithm Based on Average Q-Value Estimation and Reward Redistribution for Robot Path Planning","volume":"81","author":"Yin","year":"2024","journal-title":"Comput. Mater. 
Contin."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2450230","DOI":"10.1142\/S021812662450230X","article-title":"An Efficient Double Deep Q Learning Network-Based Soft Faults Detection and Localization in Analog Circuits","volume":"33","author":"Puvaneswari","year":"2024","journal-title":"J. Circuits Syst. Comput."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2736","DOI":"10.1109\/TCST.2019.2949757","article-title":"Cautious Model Predictive Control Using Gaussian Process Regression","volume":"28","author":"Hewing","year":"2020","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1982","DOI":"10.1109\/TCST.2022.3216989","article-title":"Multitrajectory Model Predictive Control for Safe UAV Navigation in an Unknown Environment","volume":"31","author":"Saccani","year":"2023","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"2887","DOI":"10.1109\/TCST.2023.3259819","article-title":"An Adaptive Line-of-Sight (ALOS) Guidance Law for Path Following of Aircraft and Marine Craft","volume":"31","author":"Fossen","year":"2023","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TCST.2024.3377876","article-title":"Path-Following Control of Unmanned Underwater Vehicle Based on an Improved TD3 Deep Reinforcement Learning","volume":"32","author":"Fan","year":"2024","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1109\/TCST.2024.3393210","article-title":"Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning","volume":"32","author":"Coraggio","year":"2024","journal-title":"IEEE Trans. Control Syst. 
Technol."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TCST.2021.3107483","article-title":"Impedance Learning-Based Adaptive Control for Human\u2013Robot Interaction","volume":"30","author":"Sharifi","year":"2022","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1016\/j.robot.2023.104529","article-title":"A study of robotic search strategy for multi-radiation sources in unknown environments","volume":"169","author":"Bai","year":"2023","journal-title":"Robot. Auton. Syst."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Feng, A., Xie, Y., Sun, Y., Wang, X., Jiang, B., and Xiao, J. (2023). Efficient Autonomous Exploration and Mapping in Unknown Environments. Sensors, 23.","DOI":"10.3390\/s23104766"},{"key":"ref_33","unstructured":"Bellemare, M.G., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., and Munos, R. (2016). Unifying Count-Based Exploration and Intrinsic Motivation. arXiv."},{"key":"ref_34","unstructured":"Tang, H., Houthooft, R., Foote, D., Stooke, A., Chen, X., Duan, Y., Schulman, J., De Turck, F., and Abbeel, P. (2017, January 4\u20139). #Exploration: A study of count-based exploration for deep reinforcement learning. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_35","unstructured":"Stadie, B.C., Levine, S., and Abbeel, P. (2015). Incentivizing exploration in reinforcement learning with deep predictive models. 
arXiv."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/9\/1530\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:44:47Z","timestamp":1760035487000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/9\/1530"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,12]]},"references-count":35,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["sym17091530"],"URL":"https:\/\/doi.org\/10.3390\/sym17091530","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,12]]}}}