{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T18:39:44Z","timestamp":1776451184839,"version":"3.51.2"},"reference-count":44,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,3,2]],"date-time":"2023-03-02T00:00:00Z","timestamp":1677715200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000781","name":"European Research Council","doi-asserted-by":"publisher","award":["833915"],"award-info":[{"award-number":["833915"]}],"id":[{"id":"10.13039\/501100000781","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems"],"abstract":"<jats:p>Lane-free traffic is a novel research domain, in which vehicles no longer adhere to the notion of lanes, and consider the whole lateral space within the road boundaries. This constitutes an entirely different problem domain for autonomous driving compared to lane-based traffic, as there is no leader vehicle or lane-changing operation. Therefore, the observations of the vehicles need to properly accommodate the lane-free environment without carrying over bias from lane-based approaches. The recent successes of deep reinforcement learning (DRL) for lane-based approaches, along with emerging work for lane-free traffic environments, render DRL for lane-free traffic an interesting endeavor to investigate. In this paper, we provide an extensive look at the DRL formulation, focusing on the reward function of a lane-free autonomous driving agent. Our main interest is designing an effective reward function, as the reward model is crucial in determining the overall efficiency of the resulting policy. Specifically, we construct different components of reward functions tied to the environment at various levels of information. 
Then, we combine and collate the aforementioned components, and focus on attaining a reward function that results in a policy that manages to both reduce the collisions among vehicles and address their requirement of maintaining a desired speed. Additionally, we employ two popular DRL algorithms\u2014namely, deep Q-networks (enhanced with some commonly used extensions), and deep deterministic policy gradient (DDPG), which results in better policies. Our experiments provide a thorough investigative study on the effectiveness of different combinations among the various reward components we propose, and confirm that our DRL-employing autonomous vehicle is able to gradually learn effective policies in environments with varying levels of difficulty, especially when all of the proposed reward components are properly combined.<\/jats:p>","DOI":"10.3390\/systems11030134","type":"journal-article","created":{"date-parts":[[2023,3,2]],"date-time":"2023-03-02T03:06:43Z","timestamp":1677726403000},"page":"134","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Deep Reinforcement Learning Reward Function Design for Autonomous Driving in Lane-Free Traffic"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0911-9692","authenticated-orcid":false,"given":"Athanasia","family":"Karalakou","sequence":"first","affiliation":[{"name":"School of Electrical and Computer Engineering, Technical University of Crete, 73100 Chania, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3228-3888","authenticated-orcid":false,"given":"Dimitrios","family":"Troullinos","sequence":"additional","affiliation":[{"name":"School of Production Engineering and Management, Technical University of Crete, 73100 Chania, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0716-2972","authenticated-orcid":false,"given":"Georgios","family":"Chalkiadakis","sequence":"additional","affiliation":[{"name":"School 
of Electrical and Computer Engineering, Technical University of Crete, 73100 Chania, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5821-4982","authenticated-orcid":false,"given":"Markos","family":"Papageorgiou","sequence":"additional","affiliation":[{"name":"School of Production Engineering and Management, Technical University of Crete, 73100 Chania, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1109\/TITS.2020.3024655","article-title":"Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles","volume":"23","author":"Aradi","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_2","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv."},{"key":"ref_3","unstructured":"Badia, A.P., Piot, B., Kapturowski, S., Sprechmann, P., Vitvitskyi, A., Guo, Z.D., and Blundell, C. (2020, January 13\u201318). Agent57: Outperforming the Atari Human Benchmark. Proceedings of the 37th International Conference on Machine Learning, Virtual Event."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"103008","DOI":"10.1016\/j.trc.2021.103008","article-title":"A survey on autonomous vehicle control in the era of mixed-autonomy: From physics-based to AI-guided driving policy learning","volume":"125","author":"Di","year":"2021","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"4909","DOI":"10.1109\/TITS.2021.3054625","article-title":"Deep Reinforcement Learning for Autonomous Driving: A Survey","volume":"23","author":"Kiran","year":"2021","journal-title":"IEEE Trans. Intell. Transp. 
Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Kendall, A., Hawke, J., Janz, D., Mazur, P., Reda, D., Allen, J.M., Lam, V.D., Bewley, A., and Shah, A. (2019, January 20\u201324). Learning to Drive in a Day. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793742"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1109\/JPROC.2020.3042681","article-title":"Lane-Free Artificial-Fluid Concept for Vehicular Traffic","volume":"109","author":"Papageorgiou","year":"2021","journal-title":"Proc. IEEE"},{"key":"ref_8","unstructured":"Troullinos, D., Chalkiadakis, G., Papamichail, I., and Papageorgiou, M. (2021, January 3\u20137). Collaborative Multiagent Decision Making for Lane-Free Autonomous Driving. Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS \u201921), Virtual."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yanumula, V.K., Typaldos, P., Troullinos, D., Malekzadeh, M., Papamichail, I., and Papageorgiou, M. (2021, January 19\u201322). Optimal Path Planning for Connected and Automated Vehicles in Lane-free Traffic. Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA.","DOI":"10.1109\/ITSC48978.2021.9564698"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Karafyllis, I., Theodosis, D., and Papageorgiou, M. (2021, January 14\u201317). Lyapunov-Based Two-Dimensional Cruise Control of Autonomous Vehicles on Lane-Free Roads. Proceedings of the 60th IEEE Conference on Decision and Control (CDC2021), Austin, TX, USA.","DOI":"10.1109\/CDC45484.2021.9682975"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Malekzadeh, M., Manolis, D., Papamichail, I., and Papageorgiou, M. (2022, January 8\u201312). Empirical Investigation of Properties of Lane-free Automated Vehicle Traffic. 
Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China.","DOI":"10.1109\/ITSC55140.2022.9921864"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Naderi, M., Papageorgiou, M., Karafyllis, I., and Papamichail, I. (2022, January 8\u201312). Automated vehicle driving on large lane-free roundabouts. Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China.","DOI":"10.1109\/ITSC55140.2022.9922249"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Karalakou, A., Troullinos, D., Chalkiadakis, G., and Papageorgiou, M. (2022, January 13\u201315). Deep RL reward function design for lane-free autonomous driving. Proceedings of the 20th International Conference on Practical Applications of Agents and Multi-Agent Systems, L\u2019Aquila, Italy.","DOI":"10.1007\/978-3-031-18192-4_21"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.ifacol.2022.07.576","article-title":"Driving Strategy for Vehicles in Lane-Free Traffic Environment Based on Deep Deterministic Policy Gradient and Artificial Forces","volume":"55","author":"Berahman","year":"2022","journal-title":"IFAC-PapersOnLine"},{"key":"ref_15","first-page":"679","article-title":"A Markovian Decision Process","volume":"6","author":"Bellman","year":"1957","journal-title":"J. Math. Mech."},{"key":"ref_16","unstructured":"Watkins, C.J.C.H. (1989). Learning from Delayed Rewards. [Ph.D. Thesis, King\u2019s College]."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/BF00992698","article-title":"Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_18","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. [2nd ed.]."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"van Hasselt, H., Guez, A., and Silver, D. 
(2016, January 12\u201317). Deep Reinforcement Learning with Double Q-Learning. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_20","unstructured":"Lafferty, J., Williams, C., Shawe-Taylor, J., Zemel, R., and Culotta, A. (2010). Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_21","unstructured":"Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2016, January 2\u20134). Prioritized Experience Replay. Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_22","first-page":"1995","article-title":"Dueling Network Architectures for Deep Reinforcement Learning","volume":"Volume 48","author":"Balcan","year":"2016","journal-title":"Proceedings of the 33rd International Conference on Machine Learning"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Baird, L.C. (1993). Advantage Updating, Wright Lab. Technical Report WL-TR-93-1146.","DOI":"10.21236\/ADA280862"},{"key":"ref_24","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2016, January 2\u20134). Continuous control with deep reinforcement learning. Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_25","first-page":"387","article-title":"Deterministic Policy Gradient Algorithms","volume":"Volume 32","author":"Xing","year":"2014","journal-title":"Proceedings of the 31st International Conference on Machine Learning"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Troullinos, D., Chalkiadakis, G., Samoladas, V., and Papageorgiou, M. (2022, January 23\u201329). Max-Sum with Quadtrees for Decentralized Coordination in Continuous Domains. 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Vienna, Austria.","DOI":"10.24963\/ijcai.2022\/74"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bai, Z., Shangguan, W., Cai, B., and Chai, L. (2019, January 27\u201330). Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic. Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China.","DOI":"10.23919\/ChiCC.2019.8866005"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Aradi, S., Becsi, T., and Gaspar, P. (2018, January 21\u201324). Policy Gradient Based Reinforcement Learning Approach for Autonomous Highway Driving. Proceedings of the 2018 IEEE Conference on Control Technology and Applications (CCTA), Copenhagen, Denmark.","DOI":"10.1109\/CCTA.2018.8511514"},{"key":"ref_29","unstructured":"Bacchiani, G., Molinari, D., and Patander, M. (2019, January 13\u201317). Microscopic Traffic Simulation by Cooperative Multi-Agent Deep Reinforcement Learning. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS \u201919), Montreal QC, Canada."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kalantari, R., Motro, M., Ghosh, J., and Bhat, C. (2016, January 1\u20134). A distributed, collective intelligence framework for collision-free navigation through busy intersections. Proceedings of the 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Rio de Janeiro, Brazil.","DOI":"10.1109\/ITSC.2016.7795737"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"103487","DOI":"10.1016\/j.trc.2021.103487","article-title":"Optimization-based path-planning for connected and non-connected automated vehicles","volume":"134","author":"Typaldos","year":"2022","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_32","unstructured":"Kingma, D., and Ba, J. (2014). 
Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_33","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). TensorFlow: A system for large-scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA."},{"key":"ref_34","unstructured":"(2022, February 15). Keras. Available online: https:\/\/keras.io."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1109\/TRO.2021.3087314","article-title":"Flow: A Modular Learning Framework for Mixed Autonomy Traffic","volume":"38","author":"Wu","year":"2021","journal-title":"IEEE Trans. Robot."},{"key":"ref_36","unstructured":"Plappert, M. (2022, February 15). keras-rl. Available online: https:\/\/github.com\/keras-rl\/keras-rl."},{"key":"ref_37","first-page":"245","article-title":"A Survey on the Explainability of Supervised Machine Learning","volume":"70","author":"Burkart","year":"2021","journal-title":"J. Artif. Int. Res."},{"key":"ref_38","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv."},{"key":"ref_39","first-page":"2829","article-title":"Continuous Deep Q-Learning with Model-based Acceleration","volume":"Volume 48","author":"Balcan","year":"2016","journal-title":"Proceedings of the 33rd International Conference on Machine Learning"},{"key":"ref_40","unstructured":"Li, C., and Czarnecki, K. (2019, January 13\u201317). Urban Driving with Multi-Objective Deep Reinforcement Learning. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS\u201919), Montreal QC, Canada."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Coulom, R. (2006, January 29\u201331). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. 
Proceedings of the Computers and Games, Turin, Italy.","DOI":"10.1007\/978-3-540-75538-8_7"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Baheri, A., Nageshrao, S., Tseng, H.E., Kolmanovsky, I., Girard, A., and Filev, D. (November, January 19). Deep Reinforcement Learning with Enhanced Safety for Autonomous Highway Driving. Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA.","DOI":"10.1109\/IV47402.2020.9304744"},{"key":"ref_43","unstructured":"Faust, A., Hsu, D., and Neumann, G. (2021, January 8\u201311). Safe Driving via Expert Guided Policy Optimization. Proceedings of the 5th Conference on Robot Learning, London, UK."},{"key":"ref_44","unstructured":"Shalev-Shwartz, S., Shammah, S., and Shashua, A. (2017). On a Formal Model of Safe and Scalable Self-driving Cars. arXiv."}],"container-title":["Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-8954\/11\/3\/134\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:45:51Z","timestamp":1760121951000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-8954\/11\/3\/134"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,2]]},"references-count":44,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["systems11030134"],"URL":"https:\/\/doi.org\/10.3390\/systems11030134","relation":{},"ISSN":["2079-8954"],"issn-type":[{"value":"2079-8954","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,2]]}}}