{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,29]],"date-time":"2026-03-29T22:08:36Z","timestamp":1774822116574,"version":"3.50.1"},"reference-count":41,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,2,27]],"date-time":"2024-02-27T00:00:00Z","timestamp":1708992000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Foundation of China","award":["62373376"],"award-info":[{"award-number":["62373376"]}]},{"name":"Natural Science Foundation of China","award":["61976224"],"award-info":[{"award-number":["61976224"]}]},{"name":"Natural Science Foundation of China","award":["61976088"],"award-info":[{"award-number":["61976088"]}]},{"name":"Natural Science Foundation of China","award":["2022JJ30758"],"award-info":[{"award-number":["2022JJ30758"]}]},{"DOI":"10.13039\/501100004735","name":"Natural Science Foundation of Hunan Province","doi-asserted-by":"publisher","award":["62373376"],"award-info":[{"award-number":["62373376"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004735","name":"Natural Science Foundation of Hunan Province","doi-asserted-by":"publisher","award":["61976224"],"award-info":[{"award-number":["61976224"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004735","name":"Natural Science Foundation of Hunan Province","doi-asserted-by":"publisher","award":["61976088"],"award-info":[{"award-number":["61976088"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004735","name":"Natural Science Foundation of Hunan 
Province","doi-asserted-by":"publisher","award":["2022JJ30758"],"award-info":[{"award-number":["2022JJ30758"]}],"id":[{"id":"10.13039\/501100004735","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Path planning for mobile robots in complex environments remains a challenging problem. This work introduces an improved deep reinforcement learning strategy for robot navigation that combines a dueling architecture, Prioritized Experience Replay, and shaped rewards. In a grid world and two Gazebo simulation environments with static and dynamic obstacles, the Dueling Deep Q-Network with Modified Rewards and Prioritized Experience Replay (PMR-Dueling DQN) algorithm is compared against Q-learning, DQN, and DDQN in terms of path optimality, collision avoidance, and learning speed. To encourage optimal routes, the shaped reward function takes into account target direction, obstacle avoidance, and distance. Prioritized replay concentrates training on important experiences, while the dueling architecture separates value and advantage learning. The results show that the PMR-Dueling DQN greatly improves convergence speed, stability, and overall performance across conditions. In both the grid world and Gazebo environments, the PMR-Dueling DQN achieved higher cumulative rewards. 
The combination of deep reinforcement learning with reward design, network architecture, and experience replay enables the PMR-Dueling DQN to surpass traditional approaches for robot path planning in complex environments.<\/jats:p>","DOI":"10.3390\/s24051523","type":"journal-article","created":{"date-parts":[[2024,2,27]],"date-time":"2024-02-27T03:35:48Z","timestamp":1709004948000},"page":"1523","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["Enhancing Stability and Performance in Mobile Robot Path Planning with PMR-Dueling DQN Algorithm"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-7127-5458","authenticated-orcid":false,"given":"Demelash Abiye","family":"Deguale","sequence":"first","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3690-8569","authenticated-orcid":false,"given":"Lingli","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-9048-7913","authenticated-orcid":false,"given":"Melikamu Liyih","family":"Sinishaw","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University, Changsha 410083, China"}]},{"given":"Keyi","family":"Li","sequence":"additional","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,2,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhang, H.Y., Lin, W.M., and Chen, A.X. (2018). Path planning for the mobile robot: A review. 
Symmetry, 10.","DOI":"10.3390\/sym10100450"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.procs.2018.07.018","article-title":"Methodology for path planning and optimization of mobile robots: A review","volume":"133","author":"Zafar","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Tian, S., Lei, S., Huang, Q., and Huang, A. (2022, January 18\u201321). The application of path planning algorithm based on deep reinforcement learning for mobile robots. Proceedings of the 2022 International Conference on Culture-Oriented Science and Technology (CoST), Lanzhou, China.","DOI":"10.1109\/CoST57098.2022.00084"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1016\/j.dt.2019.04.011","article-title":"A review: On path planning strategies for navigation of mobile robot","volume":"15","author":"Patle","year":"2019","journal-title":"Def. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3951","DOI":"10.1007\/s00170-021-08597-9","article-title":"A modified Q-learning algorithm for robot path planning in a digital twin assembly system","volume":"119","author":"Guo","year":"2022","journal-title":"Int. J. Adv. Manuf. Technol."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Bae, H., Kim, G., Kim, J., Qian, D., and Lee, S. (2019). Multi-robot path planning method using reinforcement learning. Appl. Sci., 9.","DOI":"10.3390\/app9153057"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Holen, M., Saha, R., Goodwin, M., Omlin, C.W., and Sandsmark, K.E. (2020, January 19\u201322). Road detection for reinforcement learning based autonomous car. Proceedings of the 3rd International Conference on Information Science and Systems, Cambridge, UK.","DOI":"10.1145\/3388176.3388199"},{"key":"ref_8","unstructured":"Xu, J., Tian, Y., Ma, P., Rus, D., Sueda, S., and Matusik, W. (2020, January 13\u201318). 
Prediction-guided multi-objective reinforcement learning for continuous robot control. Proceedings of the 37th International Conference on Machine Learning PMLR, Virtual."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"15647","DOI":"10.1073\/pnas.1014269108","article-title":"Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis","volume":"108","author":"Glimcher","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_11","first-page":"814","article-title":"An analysis of Q-learning algorithms with strategies of reward function","volume":"3","author":"Manju","year":"2011","journal-title":"Int. J. Comput. Sci. Eng."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.neunet.2022.05.013","article-title":"A survey for deep reinforcement learning in markovian cyber\u2013physical systems: Common problems and solutions","volume":"153","author":"Rupprecht","year":"2022","journal-title":"Neural Netw."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement learning in robotics: A survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_14","unstructured":"Wang, Z., Schaul, T., Hessel, M., Hasselt, H., Lanctot, M., and Freitas, N. (2016, January 20\u201322). Dueling network architectures for deep reinforcement learning. Proceedings of the International Conference on Machine Learning PMLR, New York, NY, USA."},{"key":"ref_15","unstructured":"Felner, A. (2011, January 15\u201316). Position paper: Dijkstra\u2019s algorithm versus uniform cost search or a case against dijkstra\u2019s algorithm. 
Proceedings of the International Symposium on Combinatorial Search, Barcelona, Spain."},{"key":"ref_16","unstructured":"Nannicini, G., Delling, D., Liberti, L., and Schultes, D. (June, January 30). Bidirectional A\u2217 search for time-dependent fast paths. Proceedings of the Experimental Algorithms: 7th International Workshop, WEA 2008, Provincetown, MA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1018","DOI":"10.1109\/ROBOT.1999.772447","article-title":"The Gaussian sampling strategy for probabilistic roadmap planners","volume":"Volume 2","author":"Boor","year":"1999","journal-title":"Proceedings of the 1999 IEEE International Conference on Robotics and Automation (Cat. No. 99CH36288C)"},{"key":"ref_18","unstructured":"LaValle, S. (1998). Research Report 9811, Iowa State University."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.procs.2018.01.054","article-title":"Grid path planning with deep reinforcement learning: Preliminary results","volume":"123","author":"Panov","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"657","DOI":"10.4304\/jsw.7.3.657-662","article-title":"Reinforcement Learning in Robot Path Optimization","volume":"7","author":"Zhang","year":"2012","journal-title":"J. Softw."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1016\/j.cie.2016.10.022","article-title":"Integrating estimation of distribution algorithms versus Q-learning into Meta-RaPS for solving the 0-1 multidimensional knapsack problem","volume":"112","author":"Arin","year":"2017","journal-title":"Comput. Ind. Eng."},{"key":"ref_22","first-page":"838","article-title":"Heuristically accelerated state backtracking Q-learning based on cost analysis","volume":"35","author":"Fang","year":"2013","journal-title":"Int. J. Pattern Recognit. Artif. 
Intell."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2184","DOI":"10.1016\/j.engappai.2013.06.016","article-title":"Backward Q-learning: The combination of Sarsa algorithm and Q-learning","volume":"26","author":"Wang","year":"2013","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1016\/j.eswa.2016.06.021","article-title":"Neural networks based reinforcement learning for mobile robots obstacle avoidance","volume":"62","author":"Duguleana","year":"2016","journal-title":"Expert Syst. Appl."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"814","DOI":"10.1109\/TSMCA.2012.2226024","article-title":"Realization of an adaptive memetic algorithm using differential evolution and Q-learning: A case study in multirobot path planning","volume":"43","author":"Rakshit","year":"2013","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1016\/j.robot.2018.05.016","article-title":"Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning","volume":"107","author":"Carlucho","year":"2018","journal-title":"Robot. Auton. Syst."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"9963018","DOI":"10.1155\/2021\/9963018","article-title":"A novel behavioral strategy for RoboCode platform based on deep Q-learning","volume":"2021","author":"Kayakoku","year":"2021","journal-title":"Complexity"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Gao, X., Luo, H., Ning, B., Zhao, F., Bao, L., Gong, Y., Xiao, Y., and Jiang, J. (2020). RL-AKF: An adaptive kalman filter navigation algorithm based on reinforcement learning for ground vehicles. Remote. 
Sens., 12.","DOI":"10.3390\/rs12111704"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.robot.2019.01.003","article-title":"Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning","volume":"114","author":"You","year":"2019","journal-title":"Robot. Auton. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Maeda, Y., Watanabe, T., and Moriyama, Y. (2011, January 25\u201327). View-based programming with reinforcement learning for robotic manipulation. Proceedings of the 2011 IEEE International Symposium on Assembly and Manufacturing (ISAM), Tampere, Finland.","DOI":"10.1109\/ISAM.2011.5942329"},{"key":"ref_31","first-page":"74","article-title":"Application of optimized q learning algorithm in reinforcement learning","volume":"34","author":"Wu","year":"2018","journal-title":"Bull. Sci. Technol."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wu, Z., Yin, Y., Liu, J., Zhang, D., Chen, J., and Jiang, W. (2023). A Novel Path Planning Approach for Mobile Robot in Radioactive Environment Based on Improved Deep Q Network Algorithm. Symmetry, 15.","DOI":"10.3390\/sym15112048"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Escobar-Naranjo, J., Caiza, G., Ayala, P., Jordan, E., Garcia, C.A., and Garcia, M.V. (2023). Autonomous Navigation of Robots: Optimization with DQN. Appl. Sci., 13.","DOI":"10.3390\/app13127202"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_35","first-page":"1588","article-title":"A Deep Recurrent Q Network with Exploratory Noise","volume":"42","author":"Quan","year":"2019","journal-title":"Chin. J. 
Comput."},{"key":"ref_36","first-page":"3661","article-title":"An improved algorithm for deep Q-network","volume":"36","author":"Xia","year":"2019","journal-title":"J. Comput. Appl. Res."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Kim, K.S., Kim, D.E., and Lee, J.M. (2018, January 9\u201312). Deep learning based on smooth driving for autonomous navigation. Proceedings of the 2018 IEEE\/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Auckland, New Zealand.","DOI":"10.1109\/AIM.2018.8452266"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Ruan, X., Ren, D., Zhu, X., and Huang, J. (2019, January 3\u20135). Mobile robot navigation based on deep reinforcement learning. Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China.","DOI":"10.1109\/CCDC.2019.8832393"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1007\/s42979-021-00817-z","article-title":"Deep Reinforcement Learning of Map-Based Obstacle Avoidance for Mobile Robot Navigation","volume":"2","author":"Chen","year":"2021","journal-title":"SN Comput. Sci."},{"key":"ref_40","first-page":"1229","article-title":"Review of convolutional neural network","volume":"40","author":"Zhou","year":"2017","journal-title":"J. Comput."},{"key":"ref_41","unstructured":"Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. 
arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/5\/1523\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:05:21Z","timestamp":1760105121000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/5\/1523"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,27]]},"references-count":41,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["s24051523"],"URL":"https:\/\/doi.org\/10.3390\/s24051523","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,27]]}}}