{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T16:15:38Z","timestamp":1774455338849,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2019,9,7]],"date-time":"2019-09-07T00:00:00Z","timestamp":1567814400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51905061"],"award-info":[{"award-number":["51905061"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005230","name":"Natural Science Foundation of Chongqing","doi-asserted-by":"publisher","award":["cstc2019jcyj-msxmX0097"],"award-info":[{"award-number":["cstc2019jcyj-msxmX0097"]}],"id":[{"id":"10.13039\/501100005230","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100007957","name":"Chongqing Municipal Education Commission","doi-asserted-by":"publisher","award":["KJQN201801124"],"award-info":[{"award-number":["KJQN201801124"]}],"id":[{"id":"10.13039\/501100007957","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013160","name":"Venture and Innovation Support Program for Chongqing Overseas Returnees","doi-asserted-by":"publisher","award":["cx2018135"],"award-info":[{"award-number":["cx2018135"]}],"id":[{"id":"10.13039\/501100013160","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Open Project Program of the State Key Laboratory of Engines (Tianjin University)","award":["k2019-02"],"award-info":[{"award-number":["k2019-02"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Reinforcement learning (RL) based techniques have been employed for the tracking and adaptive cruise control of a small-scale vehicle with 
the aim of transferring the obtained knowledge to a full-scale intelligent vehicle in the near future. Unlike most other control techniques, this study seeks a practical method that enables the vehicle, in the real environment and in real time, to learn the control behavior on its own while adapting to changing circumstances. In this context, it is necessary to design an algorithm that symmetrically considers both time efficiency and accuracy. Meanwhile, in order to realize adaptive cruise control specifically, a set of symmetrical control actions consisting of steering angle and vehicle speed needs to be optimized simultaneously. In this paper, firstly, the experimental setup of the small-scale intelligent vehicle is introduced. Subsequently, three model-free RL algorithms are applied to develop a strategy that keeps the vehicle within its lane at constant and top velocity. Furthermore, a model-based RL strategy that incorporates learning from real experience and planning from simulated experience is compared. Finally, a Q-learning based adaptive cruise control strategy is integrated into the existing tracking control architecture to allow the vehicle to slow down in curves and accelerate on straightaways. The experimental results show that the Q-learning and Sarsa (\u03bb) algorithms can achieve better tracking behavior than the conventional Sarsa, and Q-learning outperforms Sarsa (\u03bb) in terms of computational complexity. The Dyna-Q method performs similarly to the Sarsa (\u03bb) algorithm, but with a significant reduction in computational time. 
Compared with a fine-tuned proportion integration differentiation (PID) controller, the good-balanced Q-learning is seen to perform better and it can also be easily applied to control problems with over one control actions.<\/jats:p>","DOI":"10.3390\/sym11091139","type":"journal-article","created":{"date-parts":[[2019,9,9]],"date-time":"2019-09-09T04:12:40Z","timestamp":1568002360000},"page":"1139","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Reinforcement Learning Approach to Design Practical Adaptive Control for a Small-Scale Intelligent Vehicle"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2995-2358","authenticated-orcid":false,"given":"Bo","family":"Hu","sequence":"first","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"},{"name":"State Key Laboratory of Engines, Tianjin University, Tianjin 300072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiaxi","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Yang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haitao","family":"Bai","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuang","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Youchang","family":"Sun","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoyu","family":"Yang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,9,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1109\/TIV.2016.2578706","article-title":"A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles","volume":"1","author":"Paden","year":"2016","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3508","DOI":"10.1109\/TITS.2015.2477556","article-title":"PROUD\u2014Public Road Urban Driverless-Car Test","volume":"16","author":"Broggi","year":"2015","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1109\/TIV.2016.2608003","article-title":"Intelligence Testing for Autonomous Vehicles: A New Approach","volume":"1","author":"Li","year":"2016","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Xu, Z., Wang, M., Zhang, F., Jin, S., Zhang, J., and Zhao, X. (2017). 
Patavtt: A hardware-in-the-loop scaled platform for testing autonomous vehicle trajectory tracking. J. Adv. Transp., 1\u201311.","DOI":"10.1155\/2017\/9203251"},{"key":"ref_5","unstructured":"(2019, September 01). From the Lab to the Street: Solving the Challenge of Accelerating Automated Vehicle Testing. Available online: http:\/\/www.hitachi.com\/rev\/archive\/2018\/r2018_01\/trends2\/index.html\/."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ruz, M.L., Garrido, J., Vazquez, F., and Morilla, F. (2018). Interactive Tuning Tool of Proportional-Integral Controllers for First Order Plus Time Delay Processes. Symmetry, 10.","DOI":"10.3390\/sym10110569"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Liu, X., Shi, Y., and Xu, J. (2017). Parameters Tuning Approach for Proportion Integration Differentiation Controller of Magnetorheological Fluids Brake Based on Improved Fruit Fly Optimization Algorithm. Symmetry, 9.","DOI":"10.3390\/sym9070109"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1109\/TITB.2003.821326","article-title":"Expert PID Control System for Blood Glucose Control in Critically Ill Patients","volume":"7","author":"Chee","year":"2003","journal-title":"IEEE Trans. Inf. Technol. Biomed."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2658","DOI":"10.1016\/j.asoc.2012.11.021","article-title":"A multivariable predictive fuzzy PID control system","volume":"13","author":"Savran","year":"2013","journal-title":"Appl. Soft Comput."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Lopez_Franco, C., Gomez-Avila, J., Alanis, A.Y., Arana-Daniel, N., and Villase\u00f1or, C. (2017). Visual Servoing for an Autonomous Hexarotor Using a Neural Network Based PID Controller. Sensors, 17.","DOI":"10.3390\/s17081865"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Moriyama, K., Nakase, K., Mutoh, A., and Inuzuka, N. (2017, January 6\u20139). 
The Resilience of Cooperation in a Dilemma Game Played by Reinforcement Learning Agents. Proceedings of the IEEE International Conference on Agents (ICA), Beijing, China.","DOI":"10.1109\/AGENTS.2017.8015297"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/s00521-013-1504-x","article-title":"Robots learn to dance through interaction with humans","volume":"24","author":"Meng","year":"2014","journal-title":"Neural Comput. Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1016\/j.cor.2011.07.019","article-title":"Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning","volume":"39","author":"Zhang","year":"2012","journal-title":"Comput. Oper. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1074","DOI":"10.1016\/j.neunet.2011.05.002","article-title":"An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning","volume":"24","author":"Iwata","year":"2011","journal-title":"Neural Netw."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.matcom.2016.05.008","article-title":"Simulation-based optimization of radiotherapy: Agent-based modelling and reinforcement learning","volume":"133","author":"Jalalimanesh","year":"2017","journal-title":"Math. Comput. Simul."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1016\/j.ins.2012.12.021","article-title":"Undesired state-action prediction in multi-Agent reinforcement learning for linked multi-component robotic system control","volume":"232","author":"Marques","year":"2013","journal-title":"Inf. Sci."},{"key":"ref_17","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. 
[2nd ed.]."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"7243","DOI":"10.3390\/en8077243","article-title":"Reinforcement Learning\u2013Based Energy Management Strategy for a Hybrid Electric Tracked Vehicle","volume":"8","author":"Liu","year":"2015","journal-title":"Energies"},{"key":"ref_19","first-page":"13","article-title":"Decreasing Induction Motor Loss Using Reinforcement Learning","volume":"4","author":"Sistani","year":"2016","journal-title":"J. Autom. Control Eng."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2442087.2442095","article-title":"Achieving Autonomous Power Management Using Reinforcement Learning","volume":"18","author":"Shen","year":"2013","journal-title":"ACM Trans. Des. Autom. Electron. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1681","DOI":"10.1109\/TSTE.2016.2568754","article-title":"Control of a Point Absorber using Reinforcement Learning","volume":"7","author":"Anderlini","year":"2016","journal-title":"IEEE Trans. Sustain Energy"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Sun, J., Huang, G., Sun, G., Yu, H., Sangaiah, A.K., and Chang, V. (2018). A Q-Learning-Based Approach for Deploying Dynamic Service Function Chains. Symmetry, 10.","DOI":"10.3390\/sym10110646"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1016\/j.engappai.2009.01.014","article-title":"Dynamic scheduling of maintenance tasks in the petroleum industry: A reinforcement approach","volume":"22","author":"Aissani","year":"2009","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Habib, A., Khan, M.I., and Uddin, J. (2016, January 18\u201320). Optimal Route Selection in Complex Multi-stage Supply Chain Networks using SARSA(\u03bb). 
Proceedings of the 19th International Conference on Computer and Information Technology, North South University, Dhaka, Bangladesh.","DOI":"10.1109\/ICCITECHN.2016.7860190"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Li, Z., Lu, Y., Shi, Y., Wang, Z., Qiao, W., and Liu, Y. (2019). A Dyna-Q-Based Solution for UAV Networks Against Smart Jamming Attacks. Symmetry, 11.","DOI":"10.3390\/sym11050617"},{"key":"ref_26","unstructured":"(2019, April 28). Mit-Racecar. Available online: http\/\/www.Github.com\/mit-racecar\/."},{"key":"ref_27","unstructured":"(2019, April 28). Berkeley Autonomous Race Car. Available online: http\/\/www.barc-project.com\/."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the Game of Go without Human Knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"ref_29","first-page":"1","article-title":"Reinforcement Learning by Comparing Immediate Reward","volume":"8","author":"Pandey","year":"2010","journal-title":"Int. J. Comput. Sci. Inf. Secur."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1497","DOI":"10.1109\/TMECH.2017.2707338","article-title":"Reinforcement Learning Optimized Look-Ahead Energy Management of a Parallel Hybrid Electric Vehicle","volume":"22","author":"Liu","year":"2017","journal-title":"IEEE\/ASME Trans. 
Mechatron."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/9\/1139\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:17:35Z","timestamp":1760188655000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/9\/1139"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,7]]},"references-count":30,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2019,9]]}},"alternative-id":["sym11091139"],"URL":"https:\/\/doi.org\/10.3390\/sym11091139","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,9,7]]}}}