{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,9]],"date-time":"2026-07-09T05:07:31Z","timestamp":1783573651838,"version":"3.55.0"},"reference-count":34,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2024,7,31]],"date-time":"2024-07-31T00:00:00Z","timestamp":1722384000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Graduate Innovation Fund of Jilin University","award":["2022048"],"award-info":[{"award-number":["2022048"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The ability to make informed decisions in complex scenarios is crucial for intelligent automotive systems. Traditional expert rules and other methods often fall short in complex contexts. Recently, reinforcement learning has garnered significant attention due to its superior decision-making capabilities. However, there exists the phenomenon of inaccurate target network estimation, which limits its decision-making ability in complex scenarios. This paper mainly focuses on the study of the underestimation phenomenon, and proposes an end-to-end autonomous driving decision-making method based on an improved TD3 algorithm. This method employs a forward camera to capture data. By introducing a new critic network to form a triple-critic structure and combining it with the target maximization operation, the underestimation problem in the TD3 algorithm is solved. Subsequently, the multi-timestep averaging method is used to address the policy instability caused by the new single critic. In addition, this paper uses Carla platform to construct multi-vehicle unprotected left turn and congested lane-center driving scenarios and verifies the algorithm. The results demonstrate that our method surpasses baseline DDPG and TD3 algorithms in aspects such as convergence speed, estimation accuracy, and policy stability.<\/jats:p>","DOI":"10.3390\/s24154962","type":"journal-article","created":{"date-parts":[[2024,7,31]],"date-time":"2024-07-31T17:16:49Z","timestamp":1722446209000},"page":"4962","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["End-to-End Autonomous Driving Decision Method Based on Improved TD3 Algorithm in Complex Scenarios"],"prefix":"10.3390","volume":"24","author":[{"given":"Tao","family":"Xu","sequence":"first","affiliation":[{"name":"National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University, Changchun 130015, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhiwei","family":"Meng","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University, Changchun 130015, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0809-3590","authenticated-orcid":false,"given":"Weike","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Rail Transportation, Soochow University, Suzhou 215031, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhongwen","family":"Tong","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University, Changchun 130015, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3485767","article-title":"Level-5 autonomous driving\u2014Are we there yet? a review of research literature","volume":"55","author":"Khan","year":"2022","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"237","DOI":"10.26599\/JICV.2023.9210018","article-title":"Enhanced target tracking algorithm for autonomous driving based on visible and infrared image fusion","volume":"6","author":"Yuan","year":"2023","journal-title":"J. Intell. Connect. Veh."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Hu, Y., Yang, J., Chen, L., Li, K., Sima, C., Zhu, X., Chai, S., Du, S., Lin, T., and Wang, W. (2023, January 17\u201324). Planning-oriented autonomous driving. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01712"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"66031","DOI":"10.1109\/ACCESS.2024.3394869","article-title":"A Comprehensive Review on Deep Learning-Based Motion Planning and End-To-End Learning for Self-Driving Vehicle","volume":"12","author":"Ganesan","year":"2024","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Liu, H., Huang, Z., Wu, J., and Lv, C. (2022, January 5\u20139). Improved deep reinforcement learning with expert demonstrations for urban autonomous driving. Proceedings of the 2022 IEEE Intelligent Vehicles Symposium (IV), Aachen, Germany.","DOI":"10.1109\/IV51971.2022.9827073"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1007\/s42154-020-00113-1","article-title":"Deep reinforcement learning enabled decision-making for autonomous driving at intersections","volume":"3","author":"Li","year":"2020","journal-title":"Automot. Innov."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"102876","DOI":"10.1016\/j.trb.2023.102876","article-title":"Delay-throughput tradeoffs for signalized networks with finite queue capacity","volume":"180","author":"Cui","year":"2024","journal-title":"Transp. Res. Part B Methodol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"104462","DOI":"10.1016\/j.trc.2023.104462","article-title":"Observer-based event-triggered adaptive platooning control for autonomous vehicles with motion uncertainties","volume":"159","author":"Xue","year":"2024","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3017","DOI":"10.1109\/TMECH.2023.3236245","article-title":"Driver-centric lane-keeping assistance system design: A noncertainty-equivalent neuro-adaptive control approach","volume":"28","author":"Zhou","year":"2023","journal-title":"IEEE\/ASME Trans. Mechatron."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"136","DOI":"10.26599\/JICV.2023.9210013","article-title":"Evaluation of platooning configurations for connected and automated vehicles at an isolated roundabout in a mixed traffic environment","volume":"6","author":"Zhuo","year":"2023","journal-title":"J. Intell. Connect. Veh."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1109\/TIV.2019.2955905","article-title":"Combining planning and deep reinforcement learning in tactical decision making for autonomous driving","volume":"5","author":"Hoel","year":"2019","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, X., Liu, X., Li, X., and Wu, G. (2022). Lane Change Decision Algorithm Based on Deep Q Network for Autonomous Vehicles. SAE Technical Paper, SAE International.","DOI":"10.4271\/2022-01-0084"},{"key":"ref_13","unstructured":"Hauptmann, A., Yu, L., Liu, W., Qian, Y., Cheng, Z., and Gui, L. (2023). Robust Automatic Detection of Traffic Activity, Carnegie Mellon University."},{"key":"ref_14","first-page":"7812","article-title":"Reinforcement learning algorithms: A brief survey","volume":"231","author":"XShakya","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_15","unstructured":"Wang, F., Shi, D., Liu, T., and Tang, X. (2020). Decision-making at unsignalized intersection for autonomous vehicles: Left-turn maneuver with deep reinforcement learning. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1669","DOI":"10.1049\/itr2.12107","article-title":"Continuous decision-making for autonomous driving at intersections using deep deterministic policy gradient","volume":"16","author":"Li","year":"2022","journal-title":"IET Intell. Transp. Syst."},{"key":"ref_17","first-page":"3541","article-title":"Deep reinforcement learning for autonomous vehicles: Lane keep and overtaking scenarios with collision avoidance","volume":"15","author":"Ashwin","year":"2023","journal-title":"Int. J. Inf. Technol."},{"key":"ref_18","unstructured":"Fujimoto, S., Hoof, H., and Meger, D. (2018, January 10\u201315). Addressing function approximation error in actor-critic methods. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Saglam, B., Duran, E., Cicek, D.C., Mutlu, F.B., and Kozat, S.S. (2021, January 1\u20133). Estimation error correction in deep reinforcement learning for deterministic actor-critic methods. Proceedings of the 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI), Washington, DC, USA.","DOI":"10.1109\/ICTAI52525.2021.00027"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"106736","DOI":"10.1016\/j.knosys.2020.106736","article-title":"Regularly updated deterministic policy gradient algorithm","volume":"214","author":"Han","year":"2021","journal-title":"Knowl. Based Syst."},{"key":"ref_21","unstructured":"Sangoleye, F. (2023). Reinforcement Learning-Based Resilience and Decision Making in Cyber-Physical Systems, The University of New Mexico."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1109\/JAS.2021.1004395","article-title":"Highway lane change decision-making via attention-based deep reinforcement learning","volume":"9","author":"Wang","year":"2021","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lin, Y., Liu, Y., Lin, F., Zou, L., Wu, P., Zeng, W., Chen, H., and Miao, C. (2023). A survey on reinforcement learning for recommender systems. IEEE Trans. Neural Netw. Learn. Syst., 1\u201321.","DOI":"10.1109\/TNNLS.2023.3280161"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"7812","DOI":"10.1109\/TVT.2024.3360445","article-title":"Decision-Making for Autonomous Vehicles in Random Task Scenarios at Unsignalized Intersection Using Deep Reinforcement Learning","volume":"73","author":"Xiao","year":"2024","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"5064","DOI":"10.1109\/TNNLS.2022.3207346","article-title":"Deep reinforcement learning: A survey","volume":"35","author":"Wang","year":"2022","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"118926","DOI":"10.1016\/j.eswa.2022.118926","article-title":"REDRL: A review-enhanced Deep Reinforcement Learning model for interactive recommendation","volume":"213","author":"Liu","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1109\/TNNLS.2022.3177685","article-title":"Prioritized experience-based reinforcement learning with human guidance for autonomous driving","volume":"35","author":"Wu","year":"2022","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_28","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous Control with Deep Reinforcement Learning. arXiv."},{"key":"ref_29","unstructured":"Thrun, S., and Schwartz, A. Issues in using function approximation for reinforcement learning. Proceedings of the 1993 Connectionist Models Summer School."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"4933","DOI":"10.1109\/TNNLS.2019.2959129","article-title":"Reducing estimation bias via triplet-average deep deterministic policy gradient","volume":"31","author":"Wu","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"6826","DOI":"10.1109\/TWC.2023.3245820","article-title":"Energy harvesting reconfigurable intelligent surface for UAV based on robust deep reinforcement learning","volume":"22","author":"Peng","year":"2023","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"38017","DOI":"10.1109\/ACCESS.2024.3375083","article-title":"UAV path planning based on the average TD3 algorithm with prioritized experience replay","volume":"12","author":"Luo","year":"2024","journal-title":"IEEE Access"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1230","DOI":"10.23919\/cje.2022.00.093","article-title":"Towards V2I age-aware fairness access: A DQN based intelligent vehicular node training and test method","volume":"32","author":"Qiong","year":"2023","journal-title":"Chin. J. Electron."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/15\/4962\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:27:17Z","timestamp":1760110037000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/15\/4962"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,31]]},"references-count":34,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2024,8]]}},"alternative-id":["s24154962"],"URL":"https:\/\/doi.org\/10.3390\/s24154962","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,31]]}}}