{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T22:11:12Z","timestamp":1775254272984,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T00:00:00Z","timestamp":1670371200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["2020R1F1A1064330"],"award-info":[{"award-number":["2020R1F1A1064330"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Yeungnam University","award":["2020R1F1A1064330"],"award-info":[{"award-number":["2020R1F1A1064330"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In this paper, we propose a deep deterministic policy gradient (DDPG)-based path-planning method for mobile robots by applying the hindsight experience replay (HER) technique to overcome the performance degradation resulting from sparse reward problems occurring in autonomous driving mobile robots. The mobile robot in our analysis was a robot operating system-based TurtleBot3, and the experimental environment was a virtual simulation based on Gazebo. A fully connected neural network was used as the DDPG network based on the actor\u2013critic architecture. Noise was added to the actor network. The robot recognized an unknown environment by measuring distances using a laser sensor and determined the optimized policy to reach its destination. The HER technique improved the learning performance by generating three new episodes with normal experience from a failed episode. The proposed method demonstrated that the HER technique could help mitigate the sparse reward problem; this was further corroborated by the successful autonomous driving results obtained after applying the proposed method to two reward systems, as well as actual experimental results.<\/jats:p>","DOI":"10.3390\/s22249574","type":"journal-article","created":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T05:50:52Z","timestamp":1670392252000},"page":"9574","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments"],"prefix":"10.3390","volume":"22","author":[{"given":"Minjae","family":"Park","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9071-4837","authenticated-orcid":false,"given":"Seok Young","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Soonchunhyang University, Asan 31538, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jin Seok","family":"Hong","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nam Kyu","family":"Kwon","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,12,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1080\/01691864.2019.1691941","article-title":"Development of a separable search-and-rescue robot composed of a mobile robot and a snake robot","volume":"34","author":"Kamegawa","year":"2020","journal-title":"Adv. Robot."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sonnleitner, F., Shu, R., and Hollis, R.L. (2019, January 20\u201324). The mechanics and control of leaning to lift heavy objects with a dynamically stable mobile robot. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793620"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ghute, M.S., Kamble, K.P., and Korde, M. (2018, January 15\u201317). Design of military surveillance robot. Proceedings of the 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), Jalandhar, India.","DOI":"10.1109\/ICSCCC.2018.8703330"},{"key":"ref_4","first-page":"4891","article-title":"A one decade survey of autonomous mobile robot systems","volume":"11","author":"Zghair","year":"2021","journal-title":"Int. J. Electr. Comput. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Sichkar, V.N. (2019, January 25\u201329). Reinforcement learning algorithms in global path planning for mobile robot. Proceedings of the 2019 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM), Sochi, Russia.","DOI":"10.1109\/ICIEAM.2019.8742915"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Gao, J., Ye, W., Guo, J., and Li, Z. (2020). Deep reinforcement learning for indoor mobile robot path planning. Sensors, 20.","DOI":"10.3390\/s20195493"},{"key":"ref_7","first-page":"220","article-title":"Fire Fighting Mobile Robot: State of the Art and Recent Development","volume":"7","author":"Tan","year":"2013","journal-title":"Aust. J. Basic Appl. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1016\/j.robot.2010.03.010","article-title":"Developing a mobile robot for transport applications in the hospital domain","volume":"58","author":"Takahashi","year":"2010","journal-title":"Robot. Auton. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Thanh, V.N., Vinh, D.P., and Nghi, N.T. (2019, January 4\u20137). Restaurant serving robot with double line sensors following approach. Proceedings of the 2019 IEEE International Conference on Mechatronics and Automation (ICMA), Tianjin, China.","DOI":"10.1109\/ICMA.2019.8816404"},{"key":"ref_10","unstructured":"Leonard, J.J., and Durrant-Whyte, H.F. (1991, January 3\u20135). Simultaneous map building and localization for an autonomous mobile robot. Proceedings of the IROS, Osaka, Japan."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1052","DOI":"10.1109\/TPAMI.2007.1049","article-title":"MonoSLAM: Real-time single camera SLAM","volume":"29","author":"Davison","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","unstructured":"Diosi, A., Taylor, G., and Kleeman, L. (2005, January 18\u201322). Interactive SLAM using laser and advanced sonar. Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1109\/TSSC.1968.300136","article-title":"A formal basis for the heuristic determination of minimum cost paths","volume":"4","author":"Hart","year":"1968","journal-title":"IEEE Trans. Syst. Sci. Cybern."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Arulkumaran, K., Deisenroth, M.P., Brundage, M., and Bharath, A.A. (2017). A brief survey of deep reinforcement learning. arXiv.","DOI":"10.1109\/MSP.2017.2743240"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nature16961","article-title":"Mastering the game of Go with deep neural networks and tree search","volume":"529","author":"Silver","year":"2016","journal-title":"Nature"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1038\/s41586-020-03051-4","article-title":"Mastering atari, go, chess and shogi by planning with a learned model","volume":"588","author":"Schrittwieser","year":"2020","journal-title":"Nature"},{"key":"ref_18","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv."},{"key":"ref_19","unstructured":"Sutton, R.S., McAllester, D., Singh, S., and Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems 12 (NIPS 1999), MIT Press."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1007\/BF00992696","article-title":"Simple statistical gradient-following algorithms for connectionist reinforcement learning","volume":"8","author":"Williams","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","article-title":"Neuronlike adaptive elements that can solve difficult learning control problems","volume":"13","author":"Barto","year":"1983","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_22","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_23","unstructured":"Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. (2014, January 21\u201326). Deterministic policy gradient algorithms. Proceedings of the 31st International Conference on Machine Learning, Beijing, China."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Jesus, J.C., Bottega, J.A., Cuadros, M.A., and Gamarra, D.F. (2019, January 2\u20136). Deep deterministic policy gradient for navigation of mobile robots in simulated environments. Proceedings of the 2019 19th International Conference on Advanced Robotics (ICAR), Belo Horizonte, Brazil.","DOI":"10.1109\/ICAR46387.2019.8981638"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhao, P., Zheng, J., Zhou, Q., Lyu, C., and Lyu, L. (2021, January 8\u201312). A dueling-DDPG architecture for mobile robots path planning based on laser range findings. Proceedings of the Pacific Rim International Conference on Artificial Intelligence, Hanoi, Vietnam.","DOI":"10.1007\/978-3-030-89188-6_12"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gong, H., Wang, P., Ni, C., and Cheng, N. (2022). Efficient Path Planning for Mobile Robot Based on Deep Deterministic Policy Gradient. Sensors, 22.","DOI":"10.21203\/rs.3.rs-2201974\/v1"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Tai, L., Paolo, G., and Liu, M. (2017, January 24\u201328). Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8202134"},{"key":"ref_28","first-page":"5169460","article-title":"Research on Dynamic Path Planning of Mobile Robot Based on Improved DDPG Algorithm","volume":"2021","author":"Li","year":"2021","journal-title":"Mob. Inf. Syst."},{"key":"ref_29","unstructured":"Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Pieter Abbeel, O., and Zaremba, W. (2017). Hindsight experience replay. arXiv."},{"key":"ref_30","unstructured":"Huang, B.-Q., Cao, G.-Y., and Guo, M. (2005, January 18\u201321). Reinforcement learning neural network to the problem of autonomous mobile robot obstacle avoidance. Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, Guangzhou, China."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ruan, X., Ren, D., Zhu, X., and Huang, J. (2019, January 3\u20135). Mobile robot navigation based on deep reinforcement learning. Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China.","DOI":"10.1109\/CCDC.2019.8832393"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Quiroga, F., Hermosilla, G., Farias, G., Fabregas, E., and Montenegro, G. (2022). Position control of a mobile robot through deep reinforcement learning. Appl. Sci., 12.","DOI":"10.3390\/app12147194"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Dong, Y., and Zou, X. (2020, January 16\u201318). Mobile Robot Path Planning Based on Improved DDPG Reinforcement Learning Algorithm. Proceedings of the 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China.","DOI":"10.1109\/ICSESS49938.2020.9237641"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1103\/PhysRev.36.823","article-title":"On the theory of the Brownian motion","volume":"36","author":"Uhlenbeck","year":"1930","journal-title":"Phys. Rev."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/24\/9574\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:35:28Z","timestamp":1760146528000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/24\/9574"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,7]]},"references-count":34,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["s22249574"],"URL":"https:\/\/doi.org\/10.3390\/s22249574","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,7]]}}}