{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T05:01:16Z","timestamp":1773723676666,"version":"3.50.1"},"reference-count":38,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2023,6,12]],"date-time":"2023-06-12T00:00:00Z","timestamp":1686528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["61861014"],"award-info":[{"award-number":["61861014"]}]},{"name":"National Natural Science Foundation of China","award":["BS2021025"],"award-info":[{"award-number":["BS2021025"]}]},{"name":"Doctor Start-Up Fund","award":["61861014"],"award-info":[{"award-number":["61861014"]}]},{"name":"Doctor Start-Up Fund","award":["BS2021025"],"award-info":[{"award-number":["BS2021025"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Motion planning based on the reinforcement learning algorithms of the autonomous underwater vehicle (AUV) has shown great potential. Motion planning algorithms are primarily utilized for path planning and trajectory-tracking. However, prior studies have been confronted with some limitations. The time-varying ocean current affects algorithmic sampling and AUV motion and then leads to an overestimation error during path planning. In addition, the ocean current makes it easy to fall into local optima during trajectory planning. To address these problems, this paper presents a reinforcement learning-based motion planning algorithm with comprehensive ocean information (RLBMPA-COI). First, we introduce real ocean data to construct a time-varying ocean current motion model. 
Then, comprehensive ocean information and AUV motion position are introduced, and the objective function is optimized in the state-action value network to reduce overestimation errors. Finally, state transfer and reward functions are designed based on real ocean current data to achieve multi-objective path planning and adaptive event triggering in trajectory-tracking, improving robustness and adaptability. The numerical simulation results show that the proposed algorithm has a better path planning ability and a more robust trajectory-tracking effect than those of traditional reinforcement learning algorithms.<\/jats:p>","DOI":"10.3390\/rs15123077","type":"journal-article","created":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T02:00:45Z","timestamp":1686621645000},"page":"3077","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Comprehensive Ocean Information-Enabled AUV Motion Planning Based on Reinforcement Learning"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1477-3712","authenticated-orcid":false,"given":"Yun","family":"Li","sequence":"first","affiliation":[{"name":"School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003, China"},{"name":"Guangxi Big Data Analysis of Taxation Research Center of Engineering, Nanning 530003, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4122-963X","authenticated-orcid":false,"given":"Xinqi","family":"He","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Guangxi Minzu University, Nanning 530006, China"}]},{"given":"Zhenkun","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Electronic Information, Guangxi Minzu University, Nanning 530006, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2648-7358","authenticated-orcid":false,"given":"Peiguang","family":"Jing","sequence":"additional","affiliation":[{"name":"School of Electrical and Information Engineering, Tianjin University, Weijin Road, Tianjin 300072, China"}]},{"given":"Yishan","family":"Su","sequence":"additional","affiliation":[{"name":"School of Electrical and Information Engineering, Tianjin University, Weijin Road, Tianjin 300072, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,6,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhao, W., Zhao, H., Liu, G., and Zhang, G. (2022). ANFIS-EKF-Based Single-Beacon Localization Algorithm for AUV. Remote Sens., 14.","DOI":"10.3390\/rs14205281"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Cai, C., Chen, J., Yan, Q., and Liu, F. (2022). A Multi-Robot Coverage Path Planning Method for Maritime Search and Rescue Using Multiple AUVs. Remote Sens., 15.","DOI":"10.3390\/rs15010093"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"766","DOI":"10.1109\/ACCESS.2016.2529723","article-title":"Internet of things and big data analytics for smart and connected communities","volume":"4","author":"Sun","year":"2016","journal-title":"IEEE Access"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1109\/TMECH.2014.2301459","article-title":"Cooperative path planning for target tracking in urban environments using unmanned air and ground vehicles","volume":"20","author":"Yu","year":"2014","journal-title":"IEEE\/ASME Trans. Mechatronics"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1109\/70.88035","article-title":"Motion planning in a plane using generalized Voronoi diagrams","volume":"5","author":"Takahashi","year":"1989","journal-title":"IEEE Trans. Robot. 
Autom."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2042","DOI":"10.2136\/sssaj2004.2042","article-title":"Map quality for ordinary kriging and inverse distance weighted interpolation","volume":"68","author":"Mueller","year":"2004","journal-title":"Soil Sci. Soc. Am. J."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, G., Wei, F., Jiang, Y., Zhao, M., Wang, K., and Qi, H. (2022). A Multi-AUV Maritime Target Search Method for Moving and Invisible Objects Based on Multi-Agent Deep Reinforcement Learning. Sensors, 22.","DOI":"10.3390\/s22218562"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yokota, Y., and Matsuda, T. (2021). Underwater Communication Using UAVs to Realize High-Speed AUV Deployment. Remote Sens., 13.","DOI":"10.20944\/preprints202108.0330.v1"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sedighi, S., Nguyen, D.V., and Kuhnert, K.D. (2019, January 19\u201322). Guided hybrid A-star path planning algorithm for valet parking applications. Proceedings of the 2019 5th International Conference on Control, Automation and Robotics (ICCAR), Beijing, China.","DOI":"10.1109\/ICCAR.2019.8813752"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhu, J., Zhao, S., and Zhao, R. (2021, January 8\u201310). Path planning for autonomous underwater vehicle based on artificial potential field and modified RRT. Proceedings of the 2021 International Conference on Computer, Control and Robotics (ICCCR), Shanghai, China.","DOI":"10.1109\/ICCCR49711.2021.9349402"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"2568","DOI":"10.1109\/TMECH.2018.2821767","article-title":"A fast and efficient double-tree RRT*-like sampling-based planner applying on mobile robotic systems","volume":"23","author":"Chen","year":"2018","journal-title":"IEEE\/ASME Trans. Mechatron."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Nayeem, G.M., Fan, M., and Akhter, Y. (2021, January 5\u20137). 
A time-varying adaptive inertia weight based modified PSO algorithm for UAV path planning. Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh.","DOI":"10.1109\/ICREST51555.2021.9331101"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"8557","DOI":"10.1109\/TIE.2018.2886798","article-title":"A new robot navigation algorithm based on a double-layer ant algorithm and trajectory optimization","volume":"66","author":"Yang","year":"2018","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1109\/TII.2012.2198665","article-title":"Comparison of parallel genetic algorithm and particle swarm optimization for real-time UAV path planning","volume":"9","author":"Roberge","year":"2012","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1896","DOI":"10.1109\/TCST.2016.2628803","article-title":"Modified C\/GMRES algorithm for fast nonlinear model predictive tracking control of AUVs","volume":"25","author":"Shen","year":"2016","journal-title":"IEEE Trans. Control. Syst. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1109\/TSMC.2015.2465352","article-title":"Trajectory-tracking control of mobile robot systems incorporating neural-dynamic optimized model predictive approach","volume":"46","author":"Li","year":"2015","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1109\/TCST.2005.847331","article-title":"PID control system analysis, design, and technology","volume":"13","author":"Ang","year":"2005","journal-title":"IEEE Trans. Control. Syst. 
Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"e09399.","DOI":"10.1016\/j.heliyon.2022.e09399","article-title":"Metaheuristic algorithms for PID controller parameters tuning: Review, approaches and open problems","volume":"8","author":"Joseph","year":"2022","journal-title":"Heliyon"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1109\/TSMCA.2012.2227719","article-title":"A deterministic improved Q-learning for path planning of a mobile robot","volume":"43","author":"Konar","year":"2013","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1109\/MVT.2020.3019650","article-title":"Machine learning for 6G wireless networks: Carrying forward enhanced bandwidth, massive access, and ultrareliable\/low-latency service","volume":"15","author":"Du","year":"2020","journal-title":"IEEE Veh. Technol. Mag."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3243","DOI":"10.1109\/TVT.2021.3066482","article-title":"Real-time path planning and following of a gliding robotic dolphin within a hierarchical framework","volume":"70","author":"Wang","year":"2021","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"8959","DOI":"10.1109\/TVT.2020.2998137","article-title":"Ant-colony-based complete-coverage path-planning algorithm for underwater gliders in ocean areas with thermoclines","volume":"69","author":"Han","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_23","unstructured":"Huang, B.Q., Cao, G.Y., and Guo, M. (2005, January 18\u201321). Reinforcement learning neural network to the problem of autonomous mobile robot obstacle avoidance. 
Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, Guangzhou, China."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1109\/TIV.2022.3153352","article-title":"Path planning based on deep reinforcement learning for autonomous underwater vehicles under ocean current disturbance","volume":"8","author":"Chu","year":"2022","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"111453","DOI":"10.1016\/j.oceaneng.2022.111453","article-title":"AUV path tracking with real-time obstacle avoidance via reinforcement learning under adaptive constraints","volume":"256","author":"Zhang","year":"2022","journal-title":"Ocean Eng."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hou, Y., Liu, L., Wei, Q., Xu, X., and Chen, C. (2017, January 5\u20138). A novel DDPG method with prioritized experience replay. Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada.","DOI":"10.1109\/SMC.2017.8122622"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"588","DOI":"10.3390\/jmse11030588","article-title":"Reference Model-Based Deterministic Policy for Pitch and Depth Control of Autonomous Underwater Vehicle","volume":"11","author":"Du","year":"2023","journal-title":"J. Mar. Sci. Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"103326","DOI":"10.1016\/j.apor.2022.103326","article-title":"Deep reinforcement learning for adaptive path planning and control of an autonomous underwater vehicle","volume":"129","author":"Hadi","year":"2023","journal-title":"Appl. Ocean Res."},{"key":"ref_30","unstructured":"Fujimoto, S., Hoof, H., and Meger, D. 
(2018, January 10\u201315). Addressing function approximation error in actor-critic methods. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"112038","DOI":"10.1016\/j.oceaneng.2022.112038","article-title":"A learning method for AUV collision avoidance through deep reinforcement learning","volume":"260","author":"Xu","year":"2022","journal-title":"Ocean Eng."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"112424","DOI":"10.1016\/j.oceaneng.2022.112424","article-title":"A general motion control architecture for an autonomous underwater vehicle with actuator faults and unknown disturbances through deep reinforcement learning","volume":"263","author":"Huang","year":"2022","journal-title":"Ocean Eng."},{"key":"ref_33","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2757","DOI":"10.1109\/TSMC.2021.3050960","article-title":"Asynchronous multithreading reinforcement-learning-based path planning and tracking for unmanned underwater vehicle","volume":"52","author":"He","year":"2021","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_35","unstructured":"Wang, Y., He, H., and Tan, X. (2019, January 22\u201325). Truly proximal policy optimization. Proceedings of the Uncertainty in Artificial Intelligence, Tel Aviv, Israel."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Jaffe, J., and Schurgers, C. (2006, January 25). Sensor networks of freely drifting autonomous underwater explorers. 
Proceedings of the 1st International Workshop on Underwater Networks, Los Angeles, CA, USA.","DOI":"10.1145\/1161039.1161058"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"17440","DOI":"10.1109\/JIOT.2022.3155697","article-title":"Comprehensive ocean information-enabled AUV path planning via reinforcement learning","volume":"9","author":"Xi","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"ref_38","unstructured":"National Marine Science and Technology Center (2022, May 01). Available online: http:\/\/mds.nmdis.org.cn\/."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/12\/3077\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:53:35Z","timestamp":1760126015000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/12\/3077"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,12]]},"references-count":38,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["rs15123077"],"URL":"https:\/\/doi.org\/10.3390\/rs15123077","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,12]]}}}