{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T15:43:57Z","timestamp":1760888637110,"version":"build-2065373602"},"reference-count":23,"publisher":"Fuji Technology Press Ltd.","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JRM","J. Robot. Mechatron."],"published-print":{"date-parts":[[2025,10,20]]},"abstract":"<jats:p>This study investigates the effect of sensor-to-actuation delay in end-to-end autonomous driving using deep reinforcement learning (DRL). Although DRL-based methods have demonstrated success in tasks such as lane keeping and obstacle avoidance, numerous challenges remain in real-world applications. A key issue is that real-world latency can violate the assumptions of the Markov decision process (MDP), resulting in degraded performance. To address this problem, a method is introduced wherein past actions are appended to the current state, thereby preserving the MDP property even under delayed control signals. The efficacy of this approach was evaluated by comparing three scenarios in simulation: no delay, delay without compensation, and delay compensation by including past actions. The results revealed that the scenario without delay compensation failed to learn effectively. Subsequently, the trained policy was deployed on a 1\/10 scale experimental vehicle, demonstrating that explicitly modeling delay significantly enhances both stability and reliability in simulation and in physical trials. Moreover, when a longer delay was imposed, the learning process became slower and the action-value estimation was less stable, yet the simulated vehicle still performed successfully. 
Although experimental vehicle tests under extended delays exhibited some instability, it was confirmed that the approach accounted for such delays to a certain extent, thereby compensating effectively for latency in real-world environments.<\/jats:p>","DOI":"10.20965\/jrm.2025.p1024","type":"journal-article","created":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T15:02:06Z","timestamp":1760886126000},"page":"1024-1033","source":"Crossref","is-referenced-by-count":0,"title":["Application and Experimental Verification of a Delay-Aware Deep Reinforcement Learning Method in End-to-End Autonomous Driving Control"],"prefix":"10.20965","volume":"37","author":[{"given":"Kazuya","family":"Emura","sequence":"first","affiliation":[{"name":"Tokyo University of Science, 6-3-1 Niijuku, Katsushika-ku, Tokyo 125-8585, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ryuzo","family":"Hayashi","sequence":"additional","affiliation":[{"name":"Tokyo University of Science, 6-3-1 Niijuku, Katsushika-ku, Tokyo 125-8585, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"8550","published-online":{"date-parts":[[2025,10,20]]},"reference":[{"key":"key-10.20965\/jrm.2025.p1024-1","doi-asserted-by":"crossref","unstructured":"S. A. Bagloee, M. Tavana, M. Asadi, and T. Oliver, \u201cAutonomous vehicles: Challenges, opportunities, and future implications for transportation policies,\u201d J. Mod. Transp., Vol.24, No.4, pp. 284-303, 2016. https:\/\/doi.org\/10.1007\/s40534-016-0117-3","DOI":"10.1007\/s40534-016-0117-3"},{"key":"key-10.20965\/jrm.2025.p1024-2","doi-asserted-by":"crossref","unstructured":"D. J. Fagnant and K. Kockelman, \u201cPreparing a nation for autonomous vehicles: Opportunities, barriers and policy recommendations,\u201d Transp. Res. A: Policy Pract., Vol.77, pp. 167-181, 2015. 
https:\/\/doi.org\/10.1016\/j.tra.2015.04.003","DOI":"10.1016\/j.tra.2015.04.003"},{"key":"key-10.20965\/jrm.2025.p1024-3","doi-asserted-by":"crossref","unstructured":"Z. Chen and X. Huang, \u201cEnd-to-end learning for lane keeping of self-driving cars,\u201d 2017 IEEE Intell. Veh. Symp. (IV), pp. 1856-1860, 2017. https:\/\/doi.org\/10.1109\/IVS.2017.7995975","DOI":"10.1109\/IVS.2017.7995975"},{"key":"key-10.20965\/jrm.2025.p1024-4","doi-asserted-by":"crossref","unstructured":"E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, \u201cA survey of autonomous driving: Common practices and emerging technologies,\u201d IEEE Access, Vol.8, pp. 58443-58469, 2020. https:\/\/doi.org\/10.1109\/ACCESS.2020.2983149","DOI":"10.1109\/ACCESS.2020.2983149"},{"key":"key-10.20965\/jrm.2025.p1024-5","unstructured":"M. Bojarski et al., \u201cEnd to end learning for self-driving cars,\u201d arXiv:1604.07316, 2016. https:\/\/doi.org\/10.48550\/arXiv.1604.07316"},{"key":"key-10.20965\/jrm.2025.p1024-6","doi-asserted-by":"crossref","unstructured":"A. E. Sallab, M. Abdou, E. Perot, and S. Yogamani, \u201cDeep reinforcement learning framework for autonomous driving,\u201d Proc. IS&T Int. Symp. Electron. Imaging: Auton. Veh. Mach., pp. 70-76, 2017. https:\/\/doi.org\/10.2352\/ISSN.2470-1173.2017.19.AVM-023","DOI":"10.2352\/ISSN.2470-1173.2017.19.AVM-023"},{"key":"key-10.20965\/jrm.2025.p1024-7","doi-asserted-by":"crossref","unstructured":"A. Carballo et al., \u201cEnd-to-end autonomous mobile robot navigation with model-based system support,\u201d J. Robot. Mechatron., Vol.30, No.4, pp. 563-583, 2018. https:\/\/doi.org\/10.20965\/jrm.2018.p0563","DOI":"10.20965\/jrm.2018.p0563"},{"key":"key-10.20965\/jrm.2025.p1024-8","doi-asserted-by":"crossref","unstructured":"V. Mnih et al., \u201cHuman-level control through deep reinforcement learning,\u201d Nature, Vol.518, No.7540, pp. 529-533, 2015. 
https:\/\/doi.org\/10.1038\/nature14236","DOI":"10.1038\/nature14236"},{"key":"key-10.20965\/jrm.2025.p1024-9","doi-asserted-by":"crossref","unstructured":"D. Silver et al., \u201cMastering the game of Go with deep neural networks and tree search,\u201d Nature, Vol.529, No.7587, pp. 484-489, 2016. https:\/\/doi.org\/10.1038\/nature16961","DOI":"10.1038\/nature16961"},{"key":"key-10.20965\/jrm.2025.p1024-10","doi-asserted-by":"crossref","unstructured":"D. Silver et al., \u201cMastering the game of Go without human knowledge,\u201d Nature, Vol.550, No.7676, pp. 354-359, 2017. https:\/\/doi.org\/10.1038\/nature24270","DOI":"10.1038\/nature24270"},{"key":"key-10.20965\/jrm.2025.p1024-11","unstructured":"D. Silver et al., \u201cMastering chess and shogi by self-play with a general reinforcement learning algorithm,\u201d arXiv:1712.01815, 2017. https:\/\/doi.org\/10.48550\/arXiv.1712.01815"},{"key":"key-10.20965\/jrm.2025.p1024-12","doi-asserted-by":"crossref","unstructured":"P. Wolf et al., \u201cLearning how to drive in a real world simulation with deep Q-networks,\u201d 2017 IEEE Intell. Veh. Symp. (IV), pp. 244-250, 2017. https:\/\/doi.org\/10.1109\/IVS.2017.7995727","DOI":"10.1109\/IVS.2017.7995727"},{"key":"key-10.20965\/jrm.2025.p1024-13","doi-asserted-by":"crossref","unstructured":"C.-J. Hoel, K. Wolff, and L. Laine, \u201cAutomated speed and lane change decision making using deep reinforcement learning,\u201d 2018 21st Int. Conf. Intell. Transp. Syst. (ITSC), pp. 2148-2155, 2018. https:\/\/doi.org\/10.1109\/ITSC.2018.8569568","DOI":"10.1109\/ITSC.2018.8569568"},{"key":"key-10.20965\/jrm.2025.p1024-14","doi-asserted-by":"crossref","unstructured":"A. Kendall et al., \u201cLearning to drive in a day,\u201d 2019 Int. Conf. Robot. Autom. (ICRA), pp. 8248-8254, 2019. https:\/\/doi.org\/10.1109\/ICRA.2019.8793742","DOI":"10.1109\/ICRA.2019.8793742"},{"key":"key-10.20965\/jrm.2025.p1024-15","unstructured":"X. Xiong, J. Wang, F. Zhang, and K. 
Li, \u201cCombining deep reinforcement learning and safety based control for autonomous driving,\u201d arXiv:1612.00147, 2016. https:\/\/doi.org\/10.48550\/arXiv.1612.00147"},{"key":"key-10.20965\/jrm.2025.p1024-16","doi-asserted-by":"crossref","unstructured":"T. Suzuki et al., \u201cAcquisition of cooperative control of multiple vehicles through reinforcement learning utilizing vehicle-to-vehicle communication and map information,\u201d J. Robot. Mechatron., Vol.36, No.3, pp. 642-657, 2024. https:\/\/doi.org\/10.20965\/jrm.2024.p0642","DOI":"10.20965\/jrm.2024.p0642"},{"key":"key-10.20965\/jrm.2025.p1024-17","doi-asserted-by":"crossref","unstructured":"J. Tobin et al., \u201cDomain randomization for transferring deep neural networks from simulation to the real world,\u201d 2017 IEEE\/RSJ Int. Conf. Intell. Robots and Syst. (IROS), pp. 23-30, 2017. https:\/\/doi.org\/10.1109\/IROS.2017.8202133","DOI":"10.1109\/IROS.2017.8202133"},{"key":"key-10.20965\/jrm.2025.p1024-18","doi-asserted-by":"crossref","unstructured":"D. Kalaria, Q. Lin, and J. M. Dolan, \u201cDelay-aware robust control for safe autonomous driving,\u201d 2022 IEEE Intell. Veh. Symp. (IV), pp. 1565-1571, 2022. https:\/\/doi.org\/10.1109\/IV51971.2022.9827111","DOI":"10.1109\/IV51971.2022.9827111"},{"key":"key-10.20965\/jrm.2025.p1024-19","doi-asserted-by":"crossref","unstructured":"D. Kalaria, Q. Lin, and J. M. Dolan, \u201cDelay-aware robust control for safe autonomous driving and racing,\u201d IEEE Trans. Intell. Transp. Syst., Vol.25, No.7, pp. 7140-7150, 2024. https:\/\/doi.org\/10.1109\/TITS.2023.3339708","DOI":"10.1109\/TITS.2023.3339708"},{"key":"key-10.20965\/jrm.2025.p1024-20","doi-asserted-by":"crossref","unstructured":"F. Naseer, M. N. Khan, A. Rasool, and N. Ayub, \u201cA novel approach to compensate delay in communication by predicting teleoperator behaviour using deep learning and reinforcement learning to control telepresence robot,\u201d Electron. 
Lett., Vol.59, No.9, Article No.e12806, 2023. https:\/\/doi.org\/10.1049\/ell2.12806","DOI":"10.1049\/ell2.12806"},{"key":"key-10.20965\/jrm.2025.p1024-21","doi-asserted-by":"crossref","unstructured":"T. J. Walsh, A. Nouri, L. Li, and M. L. Littman, \u201cLearning and planning in environments with delayed feedback,\u201d Auton. Agents Multi-Agent Syst., Vol.18, No.1, pp. 83-105, 2009. https:\/\/doi.org\/10.1007\/s10458-008-9056-7","DOI":"10.1007\/s10458-008-9056-7"},{"key":"key-10.20965\/jrm.2025.p1024-22","doi-asserted-by":"crossref","unstructured":"M. Hirano et al., \u201cA transparent AI-based approach for controlling processes with time delays: With its experimental evaluation in a real-world plant operation,\u201d Trans. Jpn. Soc. Artif. Intell., Vol.39, No.6, pp. A-O53_1-9, 2024 (in Japanese). https:\/\/doi.org\/10.1527\/tjsai.39-6-A-O53","DOI":"10.1527\/tjsai.39-6-A-O53"},{"key":"key-10.20965\/jrm.2025.p1024-23","unstructured":"V. Mnih et al., \u201cPlaying Atari with deep reinforcement learning,\u201d arXiv:1312.5602, 2013. 
https:\/\/doi.org\/10.48550\/arXiv.1312.5602"}],"container-title":["Journal of Robotics and Mechatronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.fujipress.jp\/main\/wp-content\/themes\/Fujipress\/hyosetsu.php?ppno=robot003700050002","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T15:02:16Z","timestamp":1760886136000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.fujipress.jp\/jrm\/rb\/robot003700051024"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,20]]},"references-count":23,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,10,20]]},"published-print":{"date-parts":[[2025,10,20]]}},"URL":"https:\/\/doi.org\/10.20965\/jrm.2025.p1024","relation":{},"ISSN":["1883-8049","0915-3942"],"issn-type":[{"value":"1883-8049","type":"electronic"},{"value":"0915-3942","type":"print"}],"subject":[],"published":{"date-parts":[[2025,10,20]]}}}