{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T14:54:29Z","timestamp":1774277669817,"version":"3.50.1"},"reference-count":35,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2022,8,8]],"date-time":"2022-08-08T00:00:00Z","timestamp":1659916800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62101432","61834005"],"award-info":[{"award-number":["62101432"]},{"award-number":["61834005"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Shaanxi Natural Science Fundamental Research Program Project","award":["2022JM-508"],"award-info":[{"award-number":["2022JM-508"]}]},{"name":"Shaanxi Key Laboratory of Intelligent Processing for Big Energy Data","award":["IPBED11"],"award-info":[{"award-number":["IPBED11"]}]},{"name":"Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Generating path plans for mobile robots quickly is an open problem in robotics. The Q-learning (QL) algorithm has recently seen increasing use in mobile robot path planning. However, its selection policy is largely blind in the early search process, which slows convergence to optimal solutions, especially in complex environments. In this paper, we therefore propose a continuous local search Q-Learning (CLSQL) algorithm to address these problems while ensuring the quality of the planned path. 
First, the global environment is gradually divided into independent local environments. Then, intermediate points are searched in each local environment using prior knowledge. After that, searching between successive intermediate points yields a path to the destination point. Finally, comparisons with other RL-based algorithms show that the proposed method improves convergence speed and computation time while ensuring an optimal path.<\/jats:p>","DOI":"10.3390\/s22155910","type":"journal-article","created":{"date-parts":[[2022,8,9]],"date-time":"2022-08-09T04:16:55Z","timestamp":1660018615000},"page":"5910","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["CLSQL: Improved Q-Learning Algorithm Based on Continuous Local Search Policy for Mobile Robot Path Planning"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5583-0074","authenticated-orcid":false,"given":"Tian","family":"Ma","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6788-3942","authenticated-orcid":false,"given":"Jiahao","family":"Lyu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3135-689X","authenticated-orcid":false,"given":"Jiayi","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]},{"given":"Runtao","family":"Xi","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9185-9974","authenticated-orcid":false,"given":"Yuancheng","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]},{"given":"Jinpeng","family":"An","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]},{"given":"Chao","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Xi\u2019an University of Science and Technology, Xi\u2019an 710054, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Sherwani, F., Asad, M.M., and Ibrahim, B.S.K.K. (2020, January 26\u201327). Collaborative robots and industrial revolution 4.0 (IR 4.0). Proceedings of the 2020 International Conference on Emerging Trends in Smart Technologies (ICETST), Karachi, Pakistan.","DOI":"10.1109\/ICETST49965.2020.9080724"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2886","DOI":"10.1049\/iet-cta.2018.6125","article-title":"Distributed multi-vehicle task assignment in a time-invariant drift field with obstacles","volume":"13","author":"Bai","year":"2019","journal-title":"IET Control Theory Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1109\/TSSC.1968.300136","article-title":"A formal basis for the heuristic determination of minimum cost paths","volume":"4","author":"Hart","year":"1968","journal-title":"IEEE Trans. Syst. Sci. Cybern."},{"key":"ref_4","unstructured":"LaValle, S.M. (1998). 
Rapidly-Exploring Random Trees: A New Tool for Path Planning, Iowa State University."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1109\/TSMC.1986.289288","article-title":"Optimization of control parameters for genetic algorithms","volume":"16","author":"Grefenstette","year":"1986","journal-title":"IEEE Trans. Syst. Man. Cybern."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"106960","DOI":"10.1016\/j.asoc.2020.106960","article-title":"An improved PSO algorithm for smooth path planning of mobile robots using continuous high-degree Bezier curve","volume":"100","author":"Song","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/j.ins.2018.04.044","article-title":"An integrated multi-population genetic algorithm for multi-vehicle task assignment in a drift field","volume":"453","author":"Bai","year":"2018","journal-title":"Inf. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2166","DOI":"10.1109\/LRA.2017.2722541","article-title":"Clustering-based algorithms for multivehicle task assignment in a time-invariant drift field","volume":"2","author":"Bai","year":"2017","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_9","first-page":"74","article-title":"Collision-free path planning for indoor mobile robots based on rapidly-exploring random trees and piecewise cubic hermite interpolating polynomial","volume":"19","author":"Hentout","year":"2019","journal-title":"Int. J. Imaging Robot."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2923","DOI":"10.1109\/COMST.2018.2844341","article-title":"Deep learning for IoT big data and streaming analytics: A survey","volume":"20","author":"Mohammadi","year":"2018","journal-title":"IEEE Commun. Surv. 
Tutorials"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"107602","DOI":"10.1016\/j.asoc.2021.107602","article-title":"3D robotic navigation using a vision-based deep reinforcement learning model","volume":"110","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1016\/j.neunet.2008.02.003","article-title":"Reinforcement learning of motor skills with policy gradients","volume":"21","author":"Peters","year":"2008","journal-title":"Neural Netw."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wen, S., Jiang, Y., Cui, B., Gao, K., and Wang, F. (2022). A Hierarchical Path Planning Approach with Multi-SARSA Based on Topological Map. Sensors, 22.","DOI":"10.3390\/s22062367"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1016\/j.neucom.2007.11.026","article-title":"Natural actor-critic","volume":"71","author":"Peters","year":"2008","journal-title":"Neurocomputing"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/BF00992698","article-title":"Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_16","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement learning in robotics: A survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"6932","DOI":"10.1109\/LRA.2020.3026638","article-title":"Mobile robot path planning in dynamic environments through globally guided reinforcement learning","volume":"5","author":"Wang","year":"2020","journal-title":"IEEE Robot. Autom. 
Lett."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"24884","DOI":"10.1109\/ACCESS.2021.3057485","article-title":"Unmanned aerial vehicle path planning algorithm based on deep reinforcement learning in large-scale and dynamic environments","volume":"9","author":"Xie","year":"2021","journal-title":"IEEE Access"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1109\/TSMCA.2012.2227719","article-title":"A deterministic improved Q-learning for path planning of a mobile robot","volume":"43","author":"Konar","year":"2013","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, B., Li, G., Zheng, Q., Bai, X., Ding, Y., and Khan, A. (2022). Path Planning for Wheeled Mobile Robot in Partially Known Uneven Terrain. Sensors, 22.","DOI":"10.3390\/s22145217"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhou, S., Liu, X., Xu, Y., and Guo, J. (2018). A deep Q-network (DQN) based path planning method for mobile robots. Proceedings of the 2018 IEEE International Conference on Information and Automation (ICIA), Harbin, China.","DOI":"10.1109\/ICInfA.2018.8812452"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/j.robot.2019.02.013","article-title":"Solving the optimal path planning of a mobile robot using improved Q-learning","volume":"115","author":"Low","year":"2019","journal-title":"Robot. Auton. 
Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"106299","DOI":"10.1016\/j.oceaneng.2019.106299","article-title":"A knowledge-free path planning approach for smart ships based on reinforcement learning","volume":"189","author":"Chen","year":"2019","journal-title":"Ocean Eng."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"47824","DOI":"10.1109\/ACCESS.2020.2978077","article-title":"The experience-memory Q-learning algorithm for robot path planning in unknown environment","volume":"8","author":"Zhao","year":"2020","journal-title":"IEEE Access"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1109\/ISCID.2017.132","article-title":"Using Partial-Policy Q-Learning to Plan Path for Robot Navigation in Unknown Enviroment","volume":"Volume 1","author":"Zhang","year":"2017","journal-title":"Proceedings of the 2017 10th International Symposium on Computational Intelligence and Design (ISCID)"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"106796","DOI":"10.1016\/j.asoc.2020.106796","article-title":"Optimal path planning approach based on Q-learning algorithm for mobile robots","volume":"97","author":"Maoudj","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Das, P.K., Mandhata, S., Behera, H., and Patro, S. (2012). An improved Q-learning algorithm for path-planning of a mobile robot. Int. J. Comput. Appl., 51.","DOI":"10.5120\/8073-1468"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Robards, M., Sunehag, P., Sanner, S., and Marthi, B. (2011, January 5\u20139). Sparse kernel-SARSA (\u03bb) with an eligibility trace. 
Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Athens, Greece.","DOI":"10.1007\/978-3-642-23808-6_1"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"984","DOI":"10.1007\/s11431-020-1729-2","article-title":"An actor-critic based learning method for decision-making and planning of autonomous vehicles","volume":"64","author":"Xu","year":"2021","journal-title":"Sci. China Technol. Sci."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.oceaneng.2019.04.099","article-title":"Deep reinforcement learning-based controller for path following of an unmanned surface vehicle","volume":"183","author":"Woo","year":"2019","journal-title":"Ocean Eng."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"53296","DOI":"10.1109\/ACCESS.2018.2871222","article-title":"Path planning of industrial robot based on improved RRT algorithm in complex environments","volume":"6","author":"Zhang","year":"2018","journal-title":"IEEE Access"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.procs.2018.01.054","article-title":"Grid path planning with deep reinforcement learning: Preliminary results","volume":"123","author":"Panov","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_34","first-page":"40","article-title":"An Improved Q-Learning Algorithm and Its Application in Path Planning","volume":"51","author":"Mao","year":"2021","journal-title":"J. Taiyuan Univ. Technol."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Simsek, M., Czylwik, A., Galindo-Serrano, A., and Giupponi, L. (2011, January 16\u201318). Improved decentralized Q-learning algorithm for interference reduction in LTE-femtocells. 
Proceedings of the 2011 Wireless Advanced, Surathkal, India.","DOI":"10.1109\/WiAd.2011.5983301"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/15\/5910\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:05:31Z","timestamp":1760141131000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/15\/5910"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,8]]},"references-count":35,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2022,8]]}},"alternative-id":["s22155910"],"URL":"https:\/\/doi.org\/10.3390\/s22155910","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,8]]}}}