{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T20:35:57Z","timestamp":1772570157334,"version":"3.50.1"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T00:00:00Z","timestamp":1634515200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T00:00:00Z","timestamp":1634515200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Autonomous driving is an important development direction of automobile technology, and driving strategy is the core of the autonomous driving system. Most works in this area focus on single-objective tasks, such as maximizing vehicle speed or lane-keeping, and rare attention has been paid to the quality of driving skills. Therefore, a multi-objective learning method is proposed for autonomous driving strategy based on deep Q-network, where two optimization objectives are involved, i.e., vehicle speed and passenger comfort. An end-to-end autonomous driving model is designed by using vehicle front camera images as inputs to the Q-network and makes decisions based on the output Q values. Considering the vehicle speed and passenger comfort, the reward function is designed for multi-objective optimization. To evaluate the effectiveness of the method, training and testing are performed in a simulator, and a single-objective strategy with the goal of maximizing speed is designed for comparison. The results show that the proposed multi-objective autonomous driving strategy can strike a balance between vehicle speed and passenger comfort. Compared with the single-objective strategy, the multi-objective strategy has a significant improvement in comfort, while the average speed is only slightly reduced.<\/jats:p>","DOI":"10.1007\/s44163-021-00011-3","type":"journal-article","created":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T22:54:07Z","timestamp":1634597647000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Multi-objective optimization for autonomous driving strategy based on Deep Q Network"],"prefix":"10.1007","volume":"1","author":[{"given":"Tianmeng","family":"Hu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Biao","family":"Luo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chunhua","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,10,18]]},"reference":[{"issue":"4","key":"11_CR1","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1109\/TITS.2015.2498841","volume":"17","author":"D Gonz\u00e1lez","year":"2015","unstructured":"Gonz\u00e1lez D, P\u00e9rez J, Milan\u00e9s V, Nashashibi F. A review of motion planning techniques for automated vehicles. IEEE Trans Intell Transp Syst. 2015;17(4):1135\u201345.","journal-title":"IEEE Trans Intell Transp Syst"},{"issue":"1","key":"11_CR2","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1109\/TIV.2016.2578706","volume":"1","author":"B Paden","year":"2016","unstructured":"Paden B, \u010c\u00e1p M, Yong SZ, Yershov D, Frazzoli E. A survey of motion planning and control techniques for self-driving urban vehicles. IEEE Trans Intell Veh. 2016;1(1):33\u201355.","journal-title":"IEEE Trans Intell Veh"},{"key":"11_CR3","unstructured":"Bojarski M, Del\u00a0Testa D, Dworakowski D, Firner B, Flepp B, Goyal P, Jackel LD, Monfort M, Muller U, Zhang J, et\u00a0al. End to end learning for self-driving cars. 2016. arXiv preprint arXiv:160407316"},{"key":"11_CR4","doi-asserted-by":"crossref","unstructured":"Chen C, Seff A, Kornhauser A, Xiao J. Deepdriving: Learning affordance for direct perception in autonomous driving. In: Proceedings of the IEEE international conference on computer vision. 2015. p. 2722\u20132730.","DOI":"10.1109\/ICCV.2015.312"},{"key":"11_CR5","doi-asserted-by":"crossref","unstructured":"Xu H, Gao Y, Yu F, Darrell T. End-to-end learning of driving models from large-scale video datasets. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 2174\u20132182.","DOI":"10.1109\/CVPR.2017.376"},{"key":"11_CR6","unstructured":"Sallab AE, Abdou M, Perot E, Yogamani S. End-to-end deep reinforcement learning for lane keeping assist. 2016. arXiv preprint arXiv:161204340"},{"issue":"7540","key":"11_CR7","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, et al. Human-level control through deep reinforcement learning. Nature. 2015;518(7540):529\u201333.","journal-title":"Nature"},{"key":"11_CR8","unstructured":"Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M. Playing Atari with deep reinforcement learning. 2013. arXiv preprint arXiv:13125602"},{"issue":"10","key":"11_CR9","doi-asserted-by":"publisher","first-page":"767","DOI":"10.1073\/pnas.42.10.767","volume":"42","author":"R Bellman","year":"1956","unstructured":"Bellman R. Dynamic programming and Lagrange multipliers. Proc Natl Acad Sci USA. 1956;42(10):767.","journal-title":"Proc Natl Acad Sci USA"},{"key":"11_CR10","first-page":"25","volume":"22","author":"P Werbos","year":"1977","unstructured":"Werbos P. Advanced forecasting methods for global crisis warning and models of intelligence. General Syst Yearb. 1977;22:25\u201338.","journal-title":"General Syst Yearb"},{"issue":"3\u20134","key":"11_CR11","first-page":"279","volume":"8","author":"CJ Watkins","year":"1992","unstructured":"Watkins CJ, Dayan P. Q-Learning. Mach Learn. 1992;8(3\u20134):279\u201392.","journal-title":"Mach Learn"},{"key":"11_CR12","unstructured":"Wang Z, Schaul T, Hessel M, Hasselt H, Lanctot M, Freitas N. Dueling network architectures for deep reinforcement learning. In: International conference on machine learning. 2016. p. 1995\u20132003."},{"key":"11_CR13","unstructured":"Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D Continuous control with deep reinforcement learning. In: International conference on learning representations (Poster). 2016."},{"key":"11_CR14","unstructured":"Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, Silver D, Kavukcuoglu K. Asynchronous methods for deep reinforcement learning. In: International conference on machine learning. 2016. p. 1928\u20131937."},{"key":"11_CR15","unstructured":"Schulman J, Levine S, Abbeel P, Jordan M, Moritz P. Trust region policy optimization. In: International conference on machine learning. 2015. p. 1889\u20131897 ."},{"key":"11_CR16","unstructured":"Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. 2017. arXiv preprint arXiv:170706347"},{"issue":"7587","key":"11_CR17","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1038\/nature16961","volume":"529","author":"D Silver","year":"2016","unstructured":"Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, et al. Mastering the game of Go with deep neural networks and tree search. Nature. 2016;529(7587):484\u20139.","journal-title":"Nature"},{"issue":"7676","key":"11_CR18","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1038\/nature24270","volume":"550","author":"D Silver","year":"2017","unstructured":"Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, et al. Mastering the game of Go without human knowledge. Nature. 2017;550(7676):354\u20139.","journal-title":"Nature"},{"key":"11_CR19","doi-asserted-by":"crossref","unstructured":"Finn C, Levine S. Deep visual foresight for planning robot motion. In: 2017 IEEE international conference on robotics and automation. 2017. p. 2786\u20132793.","DOI":"10.1109\/ICRA.2017.7989324"},{"key":"11_CR20","doi-asserted-by":"crossref","unstructured":"Zheng G, Zhang F, Zheng Z, Xiang Y, Yuan NJ, Xie X, Li Z. Drn: A deep reinforcement learning framework for news recommendation. In: Proceedings of the 2018 world wide web conference. 2018. p. 167\u2013176.","DOI":"10.1145\/3178876.3185994"},{"key":"11_CR21","doi-asserted-by":"crossref","unstructured":"Hu Y, Da Q, Zeng A, Yu Y, Xu Y. Reinforcement learning to rank in e-commerce search engine: Formalization, analysis, and application. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 2018. p. 368\u2013377.","DOI":"10.1145\/3219819.3219846"},{"key":"11_CR22","unstructured":"Ie E, Jain V, Wang J, Narvekar S, Agarwal R, Wu R, Cheng HT, Lustman M, Gatto V, Covington P, et\u00a0al. Reinforcement learning for slate-based recommender systems: a tractable decomposition and practical methodology. 2019. arXiv preprint arXiv:190512767"},{"key":"11_CR23","doi-asserted-by":"crossref","unstructured":"Chae H, Kang CM, Kim B, Kim J, Chung CC, Choi JW. Autonomous braking system via deep reinforcement learning. In: 2017 IEEE 20th International Conference on Intelligent Transportation Systems, 2017;1\u20136.","DOI":"10.1109\/ITSC.2017.8317839"},{"key":"11_CR24","doi-asserted-by":"crossref","unstructured":"Wolf P, Hubschneider C, Weber M, Bauer A, H\u00e4rtl J, D\u00fcrr F, Z\u00f6llner JM. Learning how to drive in a real world simulation with Deep Q-Networks. In: 2017 IEEE intelligent vehicles symposium. 2017. p. 244\u2013250.","DOI":"10.1109\/IVS.2017.7995727"},{"key":"11_CR25","doi-asserted-by":"crossref","unstructured":"Chen J, Yuan B, Tomizuka M. Model-free deep reinforcement learning for urban autonomous driving. In: 2019 IEEE intelligent transportation systems conference. 2019. p. 2765\u20132771.","DOI":"10.1109\/ITSC.2019.8917306"},{"key":"11_CR26","doi-asserted-by":"crossref","unstructured":"Kendall A, Hawke J, Janz D, Mazur P, Reda D, Allen JM, Lam VD, Bewley A, Shah A. Learning to drive in a day. In: 2019 international conference on robotics and automation, IEEE; 2019. p. 8248\u20138254.","DOI":"10.1109\/ICRA.2019.8793742"},{"key":"11_CR27","doi-asserted-by":"crossref","unstructured":"Min K, Kim H, Huh K. Deep Q learning based high level driving policy determination. In: 2018 IEEE intelligent vehicles symposium. 2018. p. 226\u2013231.","DOI":"10.1109\/IVS.2018.8500645"},{"key":"11_CR28","unstructured":"Li C, Czarnecki K. Urban driving with multi-objective deep reinforcement learning. In: Proceedings of the 18th international conference on autonomous agents and multiAgent systems. 2019. p. 359\u2013367."},{"issue":"5","key":"11_CR29","first-page":"679","volume":"6","author":"R Bellman","year":"1957","unstructured":"Bellman R. A markovian decision process. J Math Mech. 1957;6(5):679\u201384.","journal-title":"J Math Mech"},{"key":"11_CR30","unstructured":"International Organization for Standardization. Mechanical vibration and shock\u2014evaluation of human exposure to whole-body vibration\u2014part 1: general requirements, ISO 2631\u20131:1997 edn. Geneva: International Organization for Standardization; 1997."}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-021-00011-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-021-00011-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-021-00011-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,19]],"date-time":"2021-10-19T00:05:10Z","timestamp":1634601910000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-021-00011-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,18]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["11"],"URL":"https:\/\/doi.org\/10.1007\/s44163-021-00011-3","relation":{},"ISSN":["2731-0809"],"issn-type":[{"value":"2731-0809","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,18]]},"assertion":[{"value":"11 August 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 September 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 October 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"11"}}