{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T13:01:51Z","timestamp":1770469311292,"version":"3.49.0"},"reference-count":47,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,10,14]],"date-time":"2024-10-14T00:00:00Z","timestamp":1728864000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100007085","name":"National University of Defense Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100007085","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Neurorobot."],"abstract":"<jats:p>How to improve the success rate of autonomous underwater vehicle (AUV) path planning and reduce travel time as much as possible is a very challenging and crucial problem in the practical applications of AUV in the complex ocean current environment. Traditional reinforcement learning algorithms lack exploration of the environment, and the strategies learned by the agent may not generalize well to other different environments. To address these challenges, we propose a novel AUV path planning algorithm named the Noisy Dueling Double Deep Q-Network (ND3QN) algorithm by modifying the reward function and introducing a noisy network, which generalizes the traditional D3QN algorithm. Compared with the classical algorithm [e.g., Rapidly-exploring Random Trees Star (RRT*), DQN, and D3QN], with simulation experiments conducted in realistic terrain and ocean currents, the proposed ND3QN algorithm demonstrates the outstanding characteristics of a higher success rate of AUV path planning, shorter travel time, and smoother paths.<\/jats:p>","DOI":"10.3389\/fnbot.2024.1466571","type":"journal-article","created":{"date-parts":[[2024,10,14]],"date-time":"2024-10-14T05:10:31Z","timestamp":1728882631000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Noisy Dueling Double Deep Q-Network algorithm for autonomous underwater vehicle path planning"],"prefix":"10.3389","volume":"18","author":[{"given":"Xu","family":"Liao","sequence":"first","affiliation":[]},{"given":"Le","family":"Li","sequence":"additional","affiliation":[]},{"given":"Chuangxia","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Xian","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Shumin","family":"Tan","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2024,10,14]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"418","DOI":"10.1109\/JOE.2004.827837","article-title":"Evolutionary path planning for autonomous underwater vehicles in a variable ocean","volume":"29","author":"Alvarez","year":"2004","journal-title":"IEEE J. Oceanic Eng"},{"key":"B2","first-page":"106","article-title":"Underwater terrain mapping with a 5-dof auv","volume":"43","author":"Ambastha","year":"2014","journal-title":"Indian J. Geo-Mar. Sci"},{"key":"B3","doi-asserted-by":"publisher","DOI":"10.5220\/0008948900002513","article-title":"\u201cCurriculum deep reinforcement learning with different exploration strategies: a feasibility study on cardiac landmark detection,\u201d","author":"Astudillo","year":"2020","journal-title":"Bioimaging (Bristol. Print)"},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1109\/AUV.2016.7778700","article-title":"\u201cEnvirobot: a bio-inspired environmental monitoring platform,\u201d","author":"Bayat","year":"2016","journal-title":"2016 IEEE\/OES Autonomous Underwater Vehicles (AUV)"},{"key":"B5","doi-asserted-by":"publisher","first-page":"4513","DOI":"10.1109\/TSG.2020.2986333","article-title":"Deep reinforcement learning-based energy storage arbitrage with accurate lithium-ion battery degradation model","volume":"11","author":"Cao","year":"2020","journal-title":"IEEE Trans. Smart Grid"},{"key":"B6","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1109\/TIV.2022.3153352","article-title":"Path planning based on deep reinforcement learning for autonomous underwater vehicles under ocean current disturbance","volume":"8","author":"Chu","year":"2023","journal-title":"IEEE Trans. Intell. Vehic"},{"key":"B7","first-page":"4666","article-title":"\u201cGuarantees for epsilon-greedy reinforcement learning with function approximation,\u201d","author":"Dann","year":"2022","journal-title":"Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research"},{"key":"B8","doi-asserted-by":"publisher","first-page":"e26403","DOI":"10.1016\/j.heliyon.2024.e26403","article-title":"Bi-rrt*: an improved path planning algorithm for secure and trustworthy mobile robots systems","volume":"10","author":"Fan","year":"2024","journal-title":"Heliyon"},{"key":"B9","article-title":"Noisy networks for exploration","author":"Fortunato","year":"2017","journal-title":"ArXiv, abs\/1706.10295"},{"key":"B10","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1145\/3349341.3349459","article-title":"\u201cImproved rrt* for fast path planning in underwater 3d environment,\u201d","volume-title":"Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science, AICS 2019","author":"Fu","year":"2019"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1109\/ICCT52962.2021.9657841","article-title":"\u201cFuzzy noisy network for stable exploration,\u201d","author":"Gao","year":"2021","journal-title":"2021 IEEE 21st International Conference on Communication Technology (ICCT)"},{"key":"B12","doi-asserted-by":"publisher","first-page":"1369","DOI":"10.1109\/TGCN.2021.3073916","article-title":"Cellular-connected uav trajectory design with connectivity constraint: A deep reinforcement learning approach","volume":"5","author":"Gao","year":"2021","journal-title":"IEEE Trans. Green Commun. Networ"},{"key":"B13","unstructured":"35590884\n          Gebco 2020 grid\n          \n          2020"},{"key":"B14","doi-asserted-by":"publisher","first-page":"111503","DOI":"10.1016\/j.asoc.2024.111503","article-title":"Dynamic path planning via dueling double deep q-network (d3qn) with prioritized experience replay","volume":"158","author":"G\u00f6k","year":"2024","journal-title":"Appl. Soft Comput"},{"key":"B15","doi-asserted-by":"publisher","first-page":"121958","DOI":"10.1016\/j.energy.2021.121958","article-title":"Data-driven battery operation for energy arbitrage using rainbow deep reinforcement learning","volume":"238","author":"Harrold","year":"2022","journal-title":"Energy"},{"key":"B16","first-page":"2094","article-title":"\u201cDeep reinforcement learning with double q-learning,\u201d","volume-title":"Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI'16","author":"Hasselt","year":"2016"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1109\/CAI54212.2023.00053","article-title":"\u201cRobust ai-enabled simulation of treatment paths with markov decision process for breast cancer patients,\u201d","author":"Hossain","year":"2023","journal-title":"2023 IEEE Conference on Artificial Intelligence (CAI)"},{"key":"B18","unstructured":"IRI\/LDEO Climate Data Library\n          \n          2022"},{"key":"B19","doi-asserted-by":"publisher","first-page":"846","DOI":"10.7551\/mitpress\/9123.003.0038","article-title":"Sampling-based algorithms for optimal motion planning","volume":"30","author":"Karaman","year":"2011","journal-title":"Int. J. Rob. Res"},{"key":"B20","doi-asserted-by":"publisher","DOI":"10.1109\/ASET56582.2023.10180740","article-title":"\u201cIntelligent adaptive rrt* path planning algorithm for mobile robots,\u201d","author":"Khattab","year":"2023","journal-title":"2023 Advances in Science and Engineering Technology International Conferences (ASET)"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2012.6343862","article-title":"\u201cReinforcement learning from human reward: discounting in episodic tasks,\u201d","author":"Knox","year":"2012","journal-title":"2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication"},{"key":"B22","doi-asserted-by":"publisher","first-page":"2301","DOI":"10.3390\/electronics11152301","article-title":"Review of collision avoidance and path planning algorithms used in autonomous underwater vehicles","volume":"11","author":"Kot","year":"2022","journal-title":"Electronics"},{"key":"B23","article-title":"\u201cImagenet classification with deep convolutional neural networks,\u201d","author":"Krizhevsky","year":"2012","journal-title":"Advances in Neural Information Processing Systems"},{"key":"B24","doi-asserted-by":"publisher","first-page":"147","DOI":"10.3390\/machines9080147","article-title":"Target search algorithm for auv based on real-time perception maps in unknown environment","volume":"9","author":"Li","year":"2021","journal-title":"Machines"},{"key":"B25","doi-asserted-by":"publisher","first-page":"8978","DOI":"10.1109\/TVT.2021.3098978","article-title":"Secure and reliable downlink transmission for energy-efficient user-centric ultra-dense networks: an accelerated drl approach","volume":"70","author":"Li","year":"2021","journal-title":"IEEE Trans. Vehic. Technol"},{"key":"B26","doi-asserted-by":"publisher","first-page":"3077","DOI":"10.3390\/rs15123077","article-title":"Comprehensive ocean information-enabled auv motion planning based on reinforcement learning","volume":"15","author":"Li","year":"2023","journal-title":"Rem. Sens"},{"key":"B27","doi-asserted-by":"publisher","first-page":"697","DOI":"10.1109\/TCST.2018.2884226","article-title":"Distributed formation control using artificial potentials and neural network for constrained multiagent systems","volume":"28","author":"Liu","year":"2020","journal-title":"IEEE Trans. Control Syst. Technol"},{"key":"B28","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"B29","doi-asserted-by":"publisher","first-page":"24894","DOI":"10.1109\/ACCESS.2023.3249966","article-title":"An overview of machine learning techniques in local path planning for autonomous underwater vehicles","volume":"11","author":"Okereke","year":"2023","journal-title":"IEEE Access"},{"key":"B30","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1007\/978-981-99-9239-3_10","article-title":"\u201cResearch on mobile robot path planning based on improved a* and dwa algorithms,\u201d","author":"Qian","year":"2024","journal-title":"Proceedings of the 13th International Conference on Computer Engineering and Networks"},{"key":"B31","doi-asserted-by":"publisher","DOI":"10.1109\/ICIT.2017.7915468","article-title":"\u201cModel based path planning using q-learning,\u201d","author":"Sharma","year":"2017","journal-title":"2017 IEEE International Conference on Industrial Technology (ICIT)"},{"key":"B32","volume-title":"Cyber-Physical Systems: Foundations, Principles and Applications","author":"Song","year":"2016"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1109\/TQCEBT54229.2022.10041614","article-title":"\u201cSelf-autonomous car simulation using deep q-learning algorithm,\u201d","author":"Soni","year":"2022","journal-title":"2022 International Conference on Trends in Quantum Computing and Emerging Business Technologies (TQCEBT)"},{"key":"B34","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1017\/S0373463322000091","article-title":"Energy optimised d* auv path planning with obstacle avoidance and ocean current environment","volume":"75","author":"Sun","year":"2022","journal-title":"J. Navig"},{"key":"B35","doi-asserted-by":"publisher","first-page":"1054","DOI":"10.1109\/TNN.1998.712192","article-title":"Reinforcement learning: an introduction","volume":"9","author":"Sutton","year":"1998","journal-title":"IEEE Trans. Neur. Netw"},{"key":"B36","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton","year":"2018"},{"key":"B37","doi-asserted-by":"publisher","first-page":"117547","DOI":"10.1016\/j.oceaneng.2024.117547","article-title":"Path planning of autonomous underwater vehicle in unknown environment based on improved deep reinforcement learning","volume":"301","author":"Tang","year":"2024","journal-title":"Ocean Eng"},{"key":"B38","doi-asserted-by":"publisher","first-page":"3549","DOI":"10.3390\/s130303549","article-title":"Continuous transmission frequency modulation detection under variable sonar-target speed conditions","volume":"13","author":"Wang","year":"2013","journal-title":"Sensors"},{"key":"B39","first-page":"1995","article-title":"\u201cDueling network architectures for deep reinforcement learning,\u201d","author":"Wang","year":"2016","journal-title":"Proceedings of the 33rd International Conference on International Conference on Machine Learning"},{"key":"B40","doi-asserted-by":"publisher","DOI":"10.1109\/ICEIEC.2019.8784487","article-title":"\u201cAn improved dijkstra's algorithm for shortest path planning on 2D grid maps,\u201d","author":"Wenzheng","year":"2019","journal-title":"2019 IEEE 9th International Conference on Electronics Information and Emergency Communication (ICEIEC)"},{"key":"B41","doi-asserted-by":"publisher","first-page":"17440","DOI":"10.1109\/JIOT.2022.3155697","article-title":"Comprehensive ocean information-enabled auv path planning via reinforcement learning","volume":"9","author":"Xi","year":"2022","journal-title":"IEEE Internet Things J"},{"key":"B42","doi-asserted-by":"publisher","first-page":"1001","DOI":"10.1109\/JIOT.2022.3205685","article-title":"A time-saving path planning scheme for autonomous underwater vehicles with complex underwater conditions","volume":"10","author":"Yang","year":"","journal-title":"IEEE Internet Things J"},{"key":"B43","doi-asserted-by":"publisher","first-page":"1983","DOI":"10.1109\/TASE.2022.3190901","article-title":"Intelligent path planning of underwater robot based on reinforcement learning","volume":"20","author":"Yang","year":"","journal-title":"IEEE Trans. Autom. Sci. Eng"},{"key":"B44","doi-asserted-by":"publisher","first-page":"2011","DOI":"10.1109\/TII.2020.2984370","article-title":"Fadn: fully connected attitude detection network based on industrial video","volume":"17","author":"Yang","year":"2021","journal-title":"IEEE Trans. Industr. Inform"},{"key":"B45","doi-asserted-by":"publisher","DOI":"10.23919\/CCC58697.2023.10241227","article-title":"\u201cCurvature-continuous RRT-based path planning with enhanced efficiency,\u201d","author":"Zeng","year":"2023","journal-title":"2023 42nd Chinese Control Conference (CCC)"},{"key":"B46","doi-asserted-by":"publisher","first-page":"8821906","DOI":"10.1155\/2023\/8821906","article-title":"An improved quantum-behaved particle swarm optimization algorithm combined with reinforcement learning for auv path planning","volume":"2023","author":"Zhang","year":"2023","journal-title":"J. Robot"},{"key":"B47","doi-asserted-by":"publisher","first-page":"1649","DOI":"10.1109\/TITS.2021.3102995","article-title":"Auv-assisted subsea exploration method in 6G enabled deep ocean based on a cooperative pac-men mechanism","volume":"23","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Intell. Transport. Syst"}],"container-title":["Frontiers in Neurorobotics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2024.1466571\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,14]],"date-time":"2024-10-14T05:10:36Z","timestamp":1728882636000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2024.1466571\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,14]]},"references-count":47,"alternative-id":["10.3389\/fnbot.2024.1466571"],"URL":"https:\/\/doi.org\/10.3389\/fnbot.2024.1466571","relation":{},"ISSN":["1662-5218"],"issn-type":[{"value":"1662-5218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,14]]},"article-number":"1466571"}}