{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T08:55:47Z","timestamp":1775120147246,"version":"3.50.1"},"reference-count":55,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T00:00:00Z","timestamp":1684022400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Advancements in artificial intelligence are leading researchers to find use cases that were not as straightforward to solve in the past. The use case of simulated autonomous driving has been known as a notoriously difficult task to automate, but advancements in the field of reinforcement learning have made it possible to reach satisfactory results. In this paper, we explore the use of the Unity ML-Agents toolkit to train intelligent agents to navigate a racing track in a simulated environment using RL algorithms. The paper compares the performance of several different RL algorithms and configurations on the task of training kart agents to successfully traverse a racing track and identifies the most effective approach for training kart agents to navigate a racing track and avoid obstacles in that track. The best results, value loss of 0.0013 and a cumulative reward of 0.761, were yielded using the Proximal Policy Optimization algorithm. After successfully choosing a model and algorithm that can traverse the track with ease, different objects were added to the track and another model (which used behavioral cloning as a pre-training option) was trained to avoid such obstacles. The aforementioned model resulted in a value loss of 0.001 and a cumulative reward of 0.068, proving that behavioral cloning can help achieve satisfactory results where the in game agents are able to avoid obstacles more efficiently and complete the track with human-like performance, allowing for a deployment of intelligent agents in racing simulators.<\/jats:p>","DOI":"10.3390\/info14050290","type":"journal-article","created":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T02:02:11Z","timestamp":1684116131000},"page":"290","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Simulated Autonomous Driving Using Reinforcement Learning: A Comparative Study on Unity\u2019s ML-Agents Framework"],"prefix":"10.3390","volume":"14","author":[{"given":"Yusef","family":"Savid","sequence":"first","affiliation":[{"name":"Department of Multimedia Engineering, Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7451-4387","authenticated-orcid":false,"given":"Reza","family":"Mahmoudi","sequence":"additional","affiliation":[{"name":"Department of Multimedia Engineering, Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2809-2213","authenticated-orcid":false,"given":"Rytis","family":"Maskeli\u016bnas","sequence":"additional","affiliation":[{"name":"Center of Excellence Forest 4.0, Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9990-1084","authenticated-orcid":false,"given":"Robertas","family":"Dama\u0161evi\u010dius","sequence":"additional","affiliation":[{"name":"Center of Excellence Forest 4.0, Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/MSP.2017.2743240","article-title":"Deep reinforcement learning: A brief survey","volume":"34","author":"Arulkumaran","year":"2017","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"102517","DOI":"10.1016\/j.rcim.2022.102517","article-title":"A review on reinforcement learning for contact-rich robotic manipulation tasks","volume":"81","author":"Chrysostomou","year":"2023","journal-title":"Robot. Comput.-Integr. Manuf."},{"key":"ref_3","unstructured":"Malleret, T., and Schwab, K. (2021). Great Narrative (The Great Reset Book 2), World Economic Forum."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1007\/s42452-020-2560-3","article-title":"Reinforcement learning applied to games","volume":"2","author":"Crespo","year":"2020","journal-title":"SN Appl. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1142\/S2301385023310027","article-title":"Reinforcement Learning Applications in Unmanned Vehicle Control: A Comprehensive Overview","volume":"11","author":"Liu","year":"2022","journal-title":"Unmanned Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"e7403","DOI":"10.1002\/cpe.7403","article-title":"An IoT enabled smart healthcare system using deep reinforcement learning","volume":"34","author":"Jagannath","year":"2022","journal-title":"Concurr. Comput. Pract. Exp."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Shuvo, S.S., Symum, H., Ahmed, M.R., Yilmaz, Y., and Zayas-Castro, J.L. (2022). Multi-Objective Reinforcement Learning Based Healthcare Expansion Planning Considering Pandemic Events. IEEE J. Biomed. Health Inform., 1\u201311.","DOI":"10.1109\/JBHI.2022.3187950"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Faria, R.D.R., Capron, B.D.O., Secchi, A.R., and de Souza, M.B. (2022). Where Reinforcement Learning Meets Process Control: Review and Guidelines. Processes, 10.","DOI":"10.3390\/pr10112311"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"106886","DOI":"10.1016\/j.compchemeng.2020.106886","article-title":"A review On reinforcement learning: Introduction and applications in industrial process control","volume":"139","author":"Nian","year":"2020","journal-title":"Comput. Chem. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shaqour, A., and Hagishima, A. (2022). Systematic Review on Deep Reinforcement Learning-Based Energy Management for Different Building Types. Energies, 15.","DOI":"10.3390\/en15228663"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"118926","DOI":"10.1016\/j.eswa.2022.118926","article-title":"REDRL: A review-enhanced Deep Reinforcement Learning model for interactive recommendation","volume":"213","author":"Liu","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_12","first-page":"589","article-title":"Deep Reinforcement Learning in the Advanced Cybersecurity Threat Detection and Protection","volume":"25","author":"Sewak","year":"2022","journal-title":"Inf. Syst. Front."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"7262","DOI":"10.1109\/LRA.2021.3097345","article-title":"Vision-Based Autonomous Car Racing Using Deep Imitative Reinforcement Learning","volume":"6","author":"Cai","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_14","first-page":"33","article-title":"Threading the Needle\u2014Overtaking Framework for Multi-agent Autonomous Racing","volume":"5","author":"Behl","year":"2022","journal-title":"SAE Int. J. Connect. Autom. Veh."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1143","DOI":"10.1109\/LRA.2020.2966414","article-title":"Learning Robust Control Policies for End-to-End Autonomous Driving from Data-Driven Simulation","volume":"5","author":"Amini","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Walker, V., Vanegas, F., and Gonzalez, F. (2022). NanoMap: A GPU-Accelerated OpenVDB-Based Mapping and Simulation Package for Robotic Agents. Remote Sens., 14.","DOI":"10.3390\/rs14215463"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"117798","DOI":"10.1016\/j.eswa.2022.117798","article-title":"Driving support by type-2 fuzzy logic control model","volume":"207","author":"Zielonka","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"957","DOI":"10.53106\/160792642021092205002","article-title":"Design and implementation of autonomous path planning for intelligent vehicle","volume":"22","author":"Wei","year":"2021","journal-title":"J. Internet Technol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"357","DOI":"10.5755\/j01.itc.50.2.28234","article-title":"Cloud-based multi-robot path planning in complex and crowded environment using fuzzy logic and online learning","volume":"50","author":"Zagradjanin","year":"2021","journal-title":"Inf. Technol. Control"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"507","DOI":"10.5755\/j01.itc.50.3.25979","article-title":"Application of deep reinforcement learning tracking control of 3wd omnidirectional mobile robot","volume":"50","author":"Mehmood","year":"2021","journal-title":"Inf. Technol. Control"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"588","DOI":"10.5755\/j01.itc.50.3.25905","article-title":"Distributed iterative learning formation control for nonholonomic multiple wheeled mobile robots with channel noise","volume":"50","author":"Xuhui","year":"2021","journal-title":"Inf. Technol. Control"},{"key":"ref_22","first-page":"7632892","article-title":"Autonomous Vehicles and Intelligent Automation: Applications, Challenges and Opportunities","volume":"2022","author":"Bathla","year":"2022","journal-title":"Mob. Inf. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"499","DOI":"10.5755\/j01.itc.51.3.30016","article-title":"A Fuzzy Logic Path Planning Algorithm Based on Geometric Landmarks and Kinetic Constraints","volume":"51","author":"Wang","year":"2022","journal-title":"Inf. Technol. Control"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"645","DOI":"10.24425\/mms.2019.130562","article-title":"Energy-efficient walking over irregular terrain: A case of hexapod robot","volume":"26","author":"Luneckas","year":"2019","journal-title":"Metrol. Meas. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1007\/s11370-020-00340-9","article-title":"A hybrid tactile sensor-based obstacle overcoming method for hexapod walking robots","volume":"14","author":"Luneckas","year":"2021","journal-title":"Intell. Serv. Robot."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"179","DOI":"10.5755\/j01.itc.48.2.21390","article-title":"Optimized RRT-A* path planning method for mobile robots in partially known environment","volume":"48","author":"Ayawli","year":"2019","journal-title":"Inf. Technol. Control"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"172988141668711","DOI":"10.1177\/1729881416687111","article-title":"Test bed for applications of heterogeneous unmanned vehicles","volume":"14","author":"Palacios","year":"2017","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Herman, J., Francis, J., Ganju, S., Chen, B., Koul, A., Gupta, A., Skabelkin, A., Zhukov, I., Kumskoy, M., and Nyberg, E. (2021, January 11\u201317). Learn-to-Race: A Multimodal Control Environment for Autonomous Racing. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00965"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Alm\u00f3n-Manzano, L., Pastor-Vargas, R., and Troncoso, J.M.C. (2022). Deep Reinforcement Learning in Agents\u2019 Training: Unity ML-Agents, Springer. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).","DOI":"10.1007\/978-3-031-06527-9_39"},{"key":"ref_30","first-page":"353","article-title":"Game engine (Unity, Unreal Engine)","volume":"71","author":"Yasufuku","year":"2017","journal-title":"Kyokai Joho Imeji Zasshi\/J. Inst. Image Inf. Telev. Eng."},{"key":"ref_31","unstructured":"\u015eerban, G. (2005). Advances in Soft Computing, Springer."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Ramezani Dooraki, A., and Lee, D.J. (2018). An end-to-end deep reinforcement learning-based intelligent agent capable of autonomous exploration in unknown environments. Sensors, 18.","DOI":"10.3390\/s18103575"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Urrea, C., Garrido, F., and Kern, J. (2021). Design and implementation of intelligent agent training systems for virtual vehicles. Sensors, 21.","DOI":"10.3390\/s21020492"},{"key":"ref_34","unstructured":"Juliani, A., Berges, V.P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., and Mattar, M. (2018). Unity: A general platform for intelligent agents. arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_36","unstructured":"Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., and Zhang, J. (2016). End to End Learning for Self-Driving Cars. arXiv."},{"key":"ref_37","first-page":"6382","article-title":"Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments","volume":"Volume NIPS\u201917","author":"Lowe","year":"2017","journal-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Guckiran, K., and Bolat, B. (November, January 31). Autonomous Car Racing in Simulation Environment Using Deep Reinforcement Learning. Proceedings of the 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), Izmir, Turkey.","DOI":"10.1109\/ASYU48272.2019.8946332"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","article-title":"Neuronlike adaptive elements that can solve difficult learning control problems","volume":"SMC-13","author":"Barto","year":"1983","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Bhattacharyya, R.P., Phillips, D.J., Wulfe, B., Morton, J., Kuefler, A., and Kochenderfer, M.J. (2018, January 1\u20135). Multi-Agent Imitation Learning for Driving Simulation. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593758"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Palanisamy, P. (2020, January 19\u201324). Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9207663"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1111\/mice.12495","article-title":"A deep learning algorithm for simulating autonomous driving considering prior knowledge and temporal information","volume":"35","author":"Chen","year":"2019","journal-title":"Comput.-Aided Civ. Infrastruct. Eng."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Almasi, P., Moni, R., and Gyires-Toth, B. (2020, January 19\u201324). Robust Reinforcement Learning-based Autonomous Driving Agent for Simulation and Real World. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9207497"},{"key":"ref_44","first-page":"7169594","article-title":"Improving Model-Based Deep Reinforcement Learning with Learning Degree Networks and Its Application in Robot Control","volume":"2022","author":"Ma","year":"2022","journal-title":"J. Robot."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Onishi, T., Motoyoshi, T., Suga, Y., Mori, H., and Ogata, T. (2019, January 14\u201319). End-to-end Learning Method for Self-Driving Cars with Trajectory Recovery Using a Path-following Function. Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852322"},{"key":"ref_46","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv."},{"key":"ref_47","unstructured":"Cohen, A., Teng, E., Berges, V.P., Dong, R.P., Henry, H., Mattar, M., Zook, A., and Ganguly, S. (2021). On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning. arXiv."},{"key":"ref_48","unstructured":"Yu, C., Velu, A., Vinitsky, E., Gao, J., Wang, Y., Bayen, A., and Wu, Y. (2021). The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games. arXiv."},{"key":"ref_49","first-page":"366","article-title":"Online Parallel Boosting","volume":"Volume AAAI\u201904","author":"Reichler","year":"2004","journal-title":"Proceedings of the 19th National Conference on Artifical Intelligence"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Tang, Z., Luo, L., Xie, B., Zhu, Y., Zhao, R., Bi, L., and Lu, C. (2022). Automatic Sparse Connectivity Learning for Neural Networks. arXiv.","DOI":"10.1109\/TNNLS.2022.3141665"},{"key":"ref_51","unstructured":"Zhu, M., and Gupta, S. (2017). To prune or not to prune: Exploring the efficacy of pruning for model compression. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Hu, W., Che, Z., Liu, N., Li, M., Tang, J., Zhang, C., and Wang, J. (2023). CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization. IEEE Trans. Neural Netw. Learn. Syst., 1\u201313.","DOI":"10.1109\/TNNLS.2023.3262952"},{"key":"ref_53","unstructured":"Palacios, E., and Pel\u00e1ez, E. (2021, January 22\u201324). Towards training swarms for game AI. Proceedings of the 22nd International Conference on Intelligent Games and Simulation, GAME-ON 2021, Aveiro, Portugal."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Kovalsk\u00fd, K., and Palamas, G. (2021). Neuroevolution vs. Reinforcement Learning for Training Non Player Characters in Games: The Case of a Self Driving Car, Springer. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering.","DOI":"10.1007\/978-3-030-76426-5_13"},{"key":"ref_55","unstructured":"Laskin, M., Lee, K., Stooke, A., Pinto, L., Abbeel, P., and Srinivas, A. (2020). Reinforcement Learning with Augmented Data. arXiv."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/5\/290\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:34:50Z","timestamp":1760124890000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/5\/290"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,14]]},"references-count":55,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["info14050290"],"URL":"https:\/\/doi.org\/10.3390\/info14050290","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,14]]}}}