{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T00:54:22Z","timestamp":1760057662541,"version":"build-2065373602"},"reference-count":37,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,2,15]],"date-time":"2025-02-15T00:00:00Z","timestamp":1739577600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"A-MoVeR\u2014\u201cMobilizing Agenda for the Development of Products and Systems towards an Intelligent and Green Mobility\u201d","award":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"],"award-info":[{"award-number":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"]}]},{"name":"Mobilizing Agendas for Business Innovation","award":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"],"award-info":[{"award-number":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"]}]},{"name":"European funds provided to Portugal","award":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"],"award-info":[{"award-number":["02\/C05-i01.01\/2022.PC646908627-00000069","02\/C05-i01\/2022"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>In the field of gaming artificial intelligence, selecting the appropriate machine learning approach is essential for improving decision-making and automation. This paper examines the effectiveness of deep reinforcement learning (DRL) within interactive gaming environments, focusing on complex decision-making tasks. Utilizing the Unity engine, we conducted experiments to evaluate DRL methodologies in simulating realistic and adaptive agent behavior. A vehicle driving game is implemented, in which the goal is to reach a certain target within a small number of steps, while respecting the boundaries of the roads. Our study compares Proximal Policy Optimization (PPO) and Soft Actor\u2013Critic (SAC) in terms of learning efficiency, decision-making accuracy, and adaptability. The results demonstrate that PPO successfully learns to reach the target, achieving higher and more stable cumulative rewards. Conversely, SAC struggles to reach the target, displaying significant variability and lower performance. These findings highlight the effectiveness of PPO in this context and indicate the need for further development, adaptation, and tuning of SAC. This research contributes to developing innovative approaches in how ML can improve how player agents adapt and react to their environments, thereby enhancing realism and dynamics in gaming experiences. Additionally, this work emphasizes the utility of using games to evolve such models, preparing them for real-world applications, namely in the field of vehicles\u2019 autonomous driving and optimal route calculation.<\/jats:p>","DOI":"10.3390\/a18020106","type":"journal-article","created":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T03:41:47Z","timestamp":1739763707000},"page":"106","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Machine Learning for Decision Support and Automation in Games: A Study on Vehicle Optimal Path"],"prefix":"10.3390","volume":"18","author":[{"given":"Gon\u00e7alo","family":"Penelas","sequence":"first","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6478-6669","authenticated-orcid":false,"given":"Lu\u00eds","family":"Barbosa","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal"},{"name":"INESC-TEC UTAD Pole, 5000-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9818-7090","authenticated-orcid":false,"given":"Ars\u00e9nio","family":"Reis","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal"},{"name":"INESC-TEC UTAD Pole, 5000-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4847-5104","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Barroso","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal"},{"name":"INESC-TEC UTAD Pole, 5000-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8248-080X","authenticated-orcid":false,"given":"Tiago","family":"Pinto","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes and Alto Douro, 5000-801 Vila Real, Portugal"},{"name":"INESC-TEC UTAD Pole, 5000-801 Vila Real, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"111295","DOI":"10.1016\/j.asoc.2024.111295","article-title":"Dynamic task allocation in multi autonomous underwater vehicle confrontational games with multi-objective evaluation model and particle swarm optimization algorithm","volume":"153","author":"Sun","year":"2024","journal-title":"Appl. Soft Comput."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2193","DOI":"10.1007\/s10462-022-10224-2","article-title":"A survey on deep reinforcement learning for audio-based applications","volume":"56","author":"Latif","year":"2022","journal-title":"Artif. Intell. Rev."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pang, G., You, L., and Fu, L. (2024, January 21\u201324). An Efficient DRL-Based Link Adaptation for Cellular Networks with Low Overhead. Proceedings of the 2024 IEEE Wireless Communications and Networking Conference (WCNC), Dubai, United Arab Emirates.","DOI":"10.1109\/WCNC57260.2024.10570648"},{"key":"ref_4","first-page":"7366","article-title":"A Comprehensive Survey on the Application of Deep and Reinforcement Learning Approaches in Autonomous Driving","volume":"34","author":"Elallid","year":"2022","journal-title":"J. King Saud Univ.-Comput. Inf. Sci."},{"key":"ref_5","first-page":"100164","article-title":"Autonomous Driving Architectures: Insights of Machine Learning and Deep Learning Algorithms","volume":"6","author":"Bachute","year":"2021","journal-title":"Mach. Learn. Appl."},{"key":"ref_6","unstructured":"Hu, A., Corrado, G., Griffiths, N., Murez, Z., Gurau, C., Yeo, H., Kendall, A., Cipolla, R., and Shotton, J. (December, January 28). Model-Based Imitation Learning for Urban Driving. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), New Orleans, LA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lee, T., and Kang, Y. (2021). Performance Analysis of Deep Neural Network Controller for Autonomous Driving Learning from a Nonlinear Model Predictive Control Method. Electronics, 10.","DOI":"10.3390\/electronics10070767"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Li, Y., and Aghvami, A.H. (2022, January 16\u201320). Intelligent UAV Navigation: A DRL-QiER Solution. Proceedings of the ICC 2022\u2014IEEE International Conference on Communications, Seoul, Republic of Korea.","DOI":"10.1109\/ICC45855.2022.9838566"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"153","DOI":"10.17083\/ijsg.v10i4.638","article-title":"Implementing Deep Reinforcement Learning (DRL)-based Driving Styles for Non-Player Vehicles","volume":"10","author":"Forneris","year":"2023","journal-title":"Int. J. Serious Games"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Yannakakis, G.N., and Togelius, J. (2018). Artificial Intelligence and Games, Springer International Publishing. [1st ed.].","DOI":"10.1007\/978-3-319-63519-4"},{"key":"ref_11","unstructured":"Chen, Y., Ji, C., Cai, Y., Yan, T., and Su, B. (2024). Deep Reinforcement Learning in Autonomous Car Path Planning and Control: A Survey. arXiv."},{"key":"ref_12","unstructured":"(2024, November 29). Wondering What Unity Is? Find Out Who We Are, Where We\u2019ve Been and Where We\u2019re Going. Available online: https:\/\/unity.com\/our-company."},{"key":"ref_13","unstructured":"Technologies, U. (2024, May 22). ML-Agents Overview\u2014Unity ML-Agents Toolkit. Available online: https:\/\/unity-technologies.github.io\/ml-agents\/ML-Agents-Overview."},{"key":"ref_14","unstructured":"Unity-Technologies (2024, May 22). Ml-Agents Documentation. Available online: https:\/\/github.com\/Unity-Technologies\/ml-agents\/blob\/develop\/docs\/Training-ML-Agents.md."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Pearce, T., and Zhu, J. (2022, January 21\u201324). Counter-Strike Deathmatch with Large-Scale Behavioural Cloning. Proceedings of the 2022 IEEE Conference on Games (CoG), Beijing, China.","DOI":"10.1109\/CoG51982.2022.9893617"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2509","DOI":"10.1109\/TITS.2021.3119073","article-title":"A Generative Adversarial Imitation Learning Approach for Realistic Aircraft Taxi-Speed Modeling","volume":"23","author":"Pham","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"128068","DOI":"10.1016\/j.neucom.2024.128068","article-title":"A review of research on reinforcement learning algorithms for multi-agents","volume":"599","author":"Hu","year":"2024","journal-title":"Neurocomputing"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1007\/s12525-021-00475-2","article-title":"Machine learning and deep learning","volume":"31","author":"Janiesch","year":"2021","journal-title":"Electron. Mark."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4715","DOI":"10.1007\/s11831-021-09552-3","article-title":"Deep Reinforcement Learning Techniques in Diversified Domains: A Survey","volume":"28","author":"Gupta","year":"2021","journal-title":"Arch. Comput. Methods Eng."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3215","DOI":"10.1007\/s10462-020-09938-y","article-title":"A survey on multi-agent deep reinforcement learning: From the perspective of challenges and applications","volume":"54","author":"Du","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lukas, M., Tomicic, I., and Bernik, A. (2022). Anticheat System Based on Reinforcement Learning Agents in Unity. Information, 13.","DOI":"10.3390\/info13040173"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1016\/j.future.2022.06.015","article-title":"Target localization using Multi-Agent Deep Reinforcement Learning with Proximal Policy Optimization","volume":"136","author":"Alagha","year":"2022","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"116718","DOI":"10.1016\/j.eswa.2022.116718","article-title":"Application of advanced tree search and proximal policy optimization on formula-E race strategy development","volume":"197","author":"Liu","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_24","unstructured":"OpenAI (2024, July 16). Proximal Policy Optimization\u2014Spinning Up Documentation. Available online: https:\/\/spinningup.openai.com\/en\/latest\/algorithms\/ppo.html."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"26871","DOI":"10.1109\/ACCESS.2021.3056903","article-title":"Motion Planning for Dual-Arm Robot Based on Soft Actor-Critic","volume":"9","author":"Wong","year":"2021","journal-title":"IEEE Access"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"100101","DOI":"10.1016\/j.egyai.2021.100101","article-title":"Development of a Soft Actor Critic deep reinforcement learning approach for harnessing energy flexibility in a Large Office building","volume":"5","author":"Kathirgamanathan","year":"2021","journal-title":"Energy AI"},{"key":"ref_27","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. arXiv."},{"key":"ref_28","unstructured":"Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., and Abbeel, P. (2019). Soft Actor-Critic Algorithms and Applications. arXiv."},{"key":"ref_29","unstructured":"OpenAI (2024, July 16). Soft Actor-Critic\u2014Spinning Up Documentation. Available online: https:\/\/spinningup.openai.com\/en\/latest\/algorithms\/sac.html."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"231099","DOI":"10.1016\/j.jpowsour.2022.231099","article-title":"A soft actor-critic-based energy management strategy for electric vehicles with hybrid energy storage systems","volume":"524","author":"Xu","year":"2022","journal-title":"J. Power Sources"},{"key":"ref_31","unstructured":"Technologies, U. (2024, May 22). Unity Real-Time Development Platform | 3D, 2D, VR & AR Engine. Available online: https:\/\/unity.com."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Singh, S., and Kaur, A. (2022, January 18\u201319). Game Development using Unity Game Engine. Proceedings of the 2022 3rd International Conference on Computing, Analytics and Networks (ICAN), Rajpura, Punjab, India.","DOI":"10.1109\/ICAN56228.2022.10007155"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"804","DOI":"10.1177\/15554120221139218","article-title":"Designing the Future? The Metaverse, NFTs, & the Future as Defined by Unity Users","volume":"18","author":"Scheiding","year":"2023","journal-title":"Games Cult."},{"key":"ref_34","unstructured":"Hocking, J. (2022). Unity in Action, Manning Publications. [3rd ed.]."},{"key":"ref_35","unstructured":"Unity-Technologies (2024, July 16). Default PPO Training Configurations. Available online: https:\/\/github.com\/Unity-Technologies\/ml-agents\/blob\/develop\/config\/ppo\/FoodCollector.yaml."},{"key":"ref_36","unstructured":"Unity-Technologies (2024, July 16). Default SAC Training Configurations. Available online: https:\/\/github.com\/Unity-Technologies\/ml-agents\/blob\/develop\/config\/sac\/FoodCollector.yaml."},{"key":"ref_37","unstructured":"Unity-Technologies (2024, July 16). Ml-Agents Hyperparameters Documentation. Available online: https:\/\/github.com\/Unity-Technologies\/ml-agents\/blob\/develop\/docs\/Training-Configuration-File.md."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/2\/106\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:35:00Z","timestamp":1760027700000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/2\/106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,15]]},"references-count":37,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,2]]}},"alternative-id":["a18020106"],"URL":"https:\/\/doi.org\/10.3390\/a18020106","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2025,2,15]]}}}