{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T10:38:34Z","timestamp":1776335914863,"version":"3.51.2"},"publisher-location":"New York, NY, USA","reference-count":38,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T00:00:00Z","timestamp":1691107200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61960206008"],"award-info":[{"award-number":["61960206008"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,6]]},"DOI":"10.1145\/3580305.3599839","type":"proceedings-article","created":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T18:13:58Z","timestamp":1691172838000},"page":"4852-4861","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Hierarchical Reinforcement Learning for Dynamic Autonomous Vehicle Navigation at Intelligent Intersections"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2205-4566","authenticated-orcid":false,"given":"Qian","family":"Sun","sequence":"first","affiliation":[{"name":"The Hong Kong University of Science and Technology, Hong Kong SAR, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0894-9651","authenticated-orcid":false,"given":"Le","family":"Zhang","sequence":"additional","affiliation":[{"name":"Baidu Research, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9324-0200","authenticated-orcid":false,"given":"Huan","family":"Yu","sequence":"additional","affiliation":[{"name":"The Hong Kong University of Science and 
Technology(Guangzhou) & The Hong Kong University of Science and Technology, Guangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5085-5216","authenticated-orcid":false,"given":"Weijia","family":"Zhang","sequence":"additional","affiliation":[{"name":"The Hong Kong University of Science and Technology(Guangzhou), Guangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2620-3589","authenticated-orcid":false,"given":"Yu","family":"Mei","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6016-6465","authenticated-orcid":false,"given":"Hui","family":"Xiong","sequence":"additional","affiliation":[{"name":"The Hong Kong University of Science and Technology(Guangzhou) & The Hong Kong University of Science and Technology, Guangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2023,8,4]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1).","author":"Ault James","year":"2021","unstructured":"James Ault and Guni Sharon . 2021 . Reinforcement learning benchmarks for traffic signal control . In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1). James Ault and Guni Sharon. 2021. Reinforcement learning benchmarks for traffic signal control. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)."},{"key":"e_1_3_2_2_2_1","volume-title":"Recent advances in hierarchical reinforcement learning. Discrete event dynamic systems","author":"Barto Andrew G","year":"2003","unstructured":"Andrew G Barto and Sridhar Mahadevan . 2003. Recent advances in hierarchical reinforcement learning. Discrete event dynamic systems , Vol. 13 , 1--2 ( 2003 ), 41--77. Andrew G Barto and Sridhar Mahadevan. 2003. Recent advances in hierarchical reinforcement learning. 
Discrete event dynamic systems , Vol. 13, 1--2 (2003), 41--77."},{"key":"e_1_3_2_2_3_1","volume-title":"Nicholas Jing Yuan, and Enhong Chen","author":"Chen Liyi","year":"2022","unstructured":"Liyi Chen , Zhi Li , Weidong He , Gong Cheng , Tong Xu , Nicholas Jing Yuan, and Enhong Chen . 2022 . Entity summarization via exploiting description complementarity and salience. IEEE Transactions on Neural Networks and Learning Systems ( 2022). Liyi Chen, Zhi Li, Weidong He, Gong Cheng, Tong Xu, Nicholas Jing Yuan, and Enhong Chen. 2022. Entity summarization via exploiting description complementarity and salience. IEEE Transactions on Neural Networks and Learning Systems (2022)."},{"key":"e_1_3_2_2_4_1","volume-title":"Self-Organizing Traffic Lights: A Realistic Simulation","author":"Cools Seung-Bae","unstructured":"Seung-Bae Cools , Carlos Gershenson , and Bart D'Hooghe . [n.,d.]. Self-Organizing Traffic Lights: A Realistic Simulation . Springer London , London , 45--55. https:\/\/doi.org\/10.1007\/978--1--4471--5113--5_3 10.1007\/978--1--4471--5113--5_3 Seung-Bae Cools, Carlos Gershenson, and Bart D'Hooghe. [n.,d.]. Self-Organizing Traffic Lights: A Realistic Simulation. Springer London, London, 45--55. https:\/\/doi.org\/10.1007\/978--1--4471--5113--5_3"},{"key":"e_1_3_2_2_5_1","volume-title":"A survey on autonomous vehicle control in the era of mixed-autonomy: From physics-based to AI-guided driving policy learning. Transportation research part C: emerging technologies","author":"Di Xuan","year":"2021","unstructured":"Xuan Di and Rongye Shi . 2021. A survey on autonomous vehicle control in the era of mixed-autonomy: From physics-based to AI-guided driving policy learning. Transportation research part C: emerging technologies , Vol. 125 ( 2021 ), 103008. Xuan Di and Rongye Shi. 2021. A survey on autonomous vehicle control in the era of mixed-autonomy: From physics-based to AI-guided driving policy learning. Transportation research part C: emerging technologies , Vol. 
125 (2021), 103008."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01386390"},{"key":"e_1_3_2_2_7_1","volume-title":"An open-source framework for adaptive traffic signal control. arXiv preprint arXiv:1909.00395","author":"Genders Wade","year":"2019","unstructured":"Wade Genders and Saiedeh Razavi . 2019. An open-source framework for adaptive traffic signal control. arXiv preprint arXiv:1909.00395 ( 2019 ). Wade Genders and Saiedeh Razavi. 2019. An open-source framework for adaptive traffic signal control. arXiv preprint arXiv:1909.00395 (2019)."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCWorkshops50388.2021.9473555"},{"key":"e_1_3_2_2_9_1","volume-title":"CoTV: Cooperative Control for Traffic Light Signals and Connected Autonomous Vehicles using Deep Reinforcement Learning. arXiv preprint arXiv:2201.13143","author":"Guo Jiaying","year":"2022","unstructured":"Jiaying Guo , Long Cheng , and Shen Wang . 2022. CoTV: Cooperative Control for Traffic Light Signals and Connected Autonomous Vehicles using Deep Reinforcement Learning. arXiv preprint arXiv:2201.13143 ( 2022 ). Jiaying Guo, Long Cheng, and Shen Wang. 2022. CoTV: Cooperative Control for Traffic Light Signals and Connected Autonomous Vehicles using Deep Reinforcement Learning. arXiv preprint arXiv:2201.13143 (2022)."},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSSC.1968.300136"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.3390\/make4010009"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357978"},{"key":"e_1_3_2_2_13_1","volume-title":"Real-time deep reinforcement learning based vehicle navigation. Applied soft computing","author":"Koh Songsang","year":"2020","unstructured":"Songsang Koh , Bo Zhou , Hui Fang , Po Yang , Zaili Yang , Qiang Yang , Lin Guan , and Zhigang Ji. 2020. Real-time deep reinforcement learning based vehicle navigation. Applied soft computing , Vol. 
96 ( Nov 2020 ), 106694. https:\/\/doi.org\/10.1016\/j.asoc.2020.106694 10.1016\/j.asoc.2020.106694 Songsang Koh, Bo Zhou, Hui Fang, Po Yang, Zaili Yang, Qiang Yang, Lin Guan, and Zhigang Ji. 2020. Real-time deep reinforcement learning based vehicle navigation. Applied soft computing , Vol. 96 (Nov 2020), 106694. https:\/\/doi.org\/10.1016\/j.asoc.2020.106694"},{"key":"e_1_3_2_2_14_1","volume-title":"An optimal control approach of integrating traffic signals and cooperative vehicle trajectories at intersections. Transportmetrica B: transport dynamics","author":"Liu Meiqi","year":"2022","unstructured":"Meiqi Liu , J Zhao , SP Hoogendoorn , and M Wang . 2022a. An optimal control approach of integrating traffic signals and cooperative vehicle trajectories at intersections. Transportmetrica B: transport dynamics , Vol. 10 , 1 ( 2022 ), 971--987. Meiqi Liu, J Zhao, SP Hoogendoorn, and M Wang. 2022a. An optimal control approach of integrating traffic signals and cooperative vehicle trajectories at intersections. Transportmetrica B: transport dynamics , Vol. 10, 1 (2022), 971--987."},{"key":"e_1_3_2_2_15_1","volume-title":"A single-layer approach for joint optimization of traffic signals and cooperative vehicle trajectories at isolated intersections. Transportation research part C: emerging technologies","author":"Liu Meiqi","year":"2022","unstructured":"Meiqi Liu , Jing Zhao , Serge Hoogendoorn , and Meng Wang . 2022b. A single-layer approach for joint optimization of traffic signals and cooperative vehicle trajectories at isolated intersections. Transportation research part C: emerging technologies , Vol. 134 ( 2022 ), 103459. Meiqi Liu, Jing Zhao, Serge Hoogendoorn, and Meng Wang. 2022b. A single-layer approach for joint optimization of traffic signals and cooperative vehicle trajectories at isolated intersections. Transportation research part C: emerging technologies , Vol. 
134 (2022), 103459."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/3398761.3398858"},{"key":"e_1_3_2_2_17_1","volume-title":"Feudal Multi-Agent Reinforcement Learning with Adaptive Network Partition for Traffic Signal Control. (May 27","author":"Ma Jinming","year":"2022","unstructured":"Jinming Ma and Feng Wu. 2022. Feudal Multi-Agent Reinforcement Learning with Adaptive Network Partition for Traffic Signal Control. (May 27 , 2022 ). https:\/\/doi.org\/10.48550\/arxiv.2205.13836 10.48550\/arxiv.2205.13836 Jinming Ma and Feng Wu. 2022. Feudal Multi-Agent Reinforcement Learning with Adaptive Network Partition for Traffic Signal Control. (May 27, 2022). https:\/\/doi.org\/10.48550\/arxiv.2205.13836"},{"key":"e_1_3_2_2_18_1","volume-title":"LibSignal: An Open Library for Traffic Signal Control. arXiv preprint arXiv:2211.10649","author":"Mei Hao","year":"2022","unstructured":"Hao Mei , Xiaoliang Lei , Longchao Da , Bin Shi , and Hua Wei . 2022. LibSignal: An Open Library for Traffic Signal Control. arXiv preprint arXiv:2211.10649 ( 2022 ). Hao Mei, Xiaoliang Lei, Longchao Da, Bin Shi, and Hua Wei. 2022. LibSignal: An Open Library for Traffic Signal Control. arXiv preprint arXiv:2211.10649 (2022)."},{"key":"e_1_3_2_2_19_1","volume-title":"Honglak Lee, and Sergey Levine.","author":"Nachum Ofir","year":"2018","unstructured":"Ofir Nachum , Shixiang Shane Gu , Honglak Lee, and Sergey Levine. 2018 . Data-efficient hierarchical reinforcement learning. Advances in neural information processing systems , Vol. 31 (2018). Ofir Nachum, Shixiang Shane Gu, Honglak Lee, and Sergey Levine. 2018. Data-efficient hierarchical reinforcement learning. Advances in neural information processing systems , Vol. 31 (2018)."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453160"},{"key":"e_1_3_2_2_21_1","volume-title":"Reinforcement learning: An introduction","author":"Sutton Richard S","unstructured":"Richard S Sutton and Andrew G Barto . 2018. 
Reinforcement learning: An introduction . MIT press . Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT press."},{"key":"e_1_3_2_2_22_1","volume-title":"Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence","author":"Sutton Richard S","year":"1999","unstructured":"Richard S Sutton , Doina Precup , and Satinder Singh . 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence , Vol. 112 , 1--2 ( 1999 ), 181--211. Richard S Sutton, Doina Precup, and Satinder Singh. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence , Vol. 112, 1--2 (1999), 181--211."},{"key":"e_1_3_2_2_23_1","volume-title":"Emerging technologies","author":"Varaiya Pravin","year":"2013","unstructured":"Pravin Varaiya . 2013. Max pressure control of a network of signalized intersections. Transportation research. Part C , Emerging technologies , Vol. 36 ( Nov 2013 ), 177--195. https:\/\/doi.org\/10.1016\/j.trc.2013.08.014 10.1016\/j.trc.2013.08.014 Pravin Varaiya. 2013. Max pressure control of a network of signalized intersections. Transportation research. Part C, Emerging technologies , Vol. 36 (Nov 2013), 177--195. https:\/\/doi.org\/10.1016\/j.trc.2013.08.014"},{"key":"e_1_3_2_2_24_1","volume-title":"International Conference on Machine Learning. PMLR, 3540--3549","author":"Vezhnevets Alexander Sasha","year":"2017","unstructured":"Alexander Sasha Vezhnevets , Simon Osindero , Tom Schaul , Nicolas Heess , Max Jaderberg , David Silver , and Koray Kavukcuoglu . 2017 . Feudal networks for hierarchical reinforcement learning . In International Conference on Machine Learning. PMLR, 3540--3549 . Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, and Koray Kavukcuoglu. 2017. 
Feudal networks for hierarchical reinforcement learning. In International Conference on Machine Learning. PMLR, 3540--3549."},{"key":"e_1_3_2_2_25_1","volume-title":"Conference on robot learning. PMLR, 399--409","author":"Vinitsky Eugene","year":"2018","unstructured":"Eugene Vinitsky , Aboudy Kreidieh , Luc Le Flem , Nishant Kheterpal , Kathy Jang , Cathy Wu , Fangyu Wu , Richard Liaw , Eric Liang , and Alexandre M Bayen . 2018 . Benchmarks for reinforcement learning in mixed-autonomy traffic . In Conference on robot learning. PMLR, 399--409 . Eugene Vinitsky, Aboudy Kreidieh, Luc Le Flem, Nishant Kheterpal, Kathy Jang, Cathy Wu, Fangyu Wu, Richard Liaw, Eric Liang, and Alexandre M Bayen. 2018. Benchmarks for reinforcement learning in mixed-autonomy traffic. In Conference on robot learning. PMLR, 399--409."},{"key":"e_1_3_2_2_26_1","volume-title":"STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control. , 2228--2242 pages. https:\/\/doi.org\/10.1109\/TMC.2020.3033782","author":"Wang Yanan","year":"2022","unstructured":"Yanan Wang , Tong Xu , Xin Niu , Chang Tan , Enhong Chen , and Hui Xiong . 2022 . STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control. , 2228--2242 pages. https:\/\/doi.org\/10.1109\/TMC.2020.3033782 10.1109\/TMC.2020.3033782 Yanan Wang, Tong Xu, Xin Niu, Chang Tan, Enhong Chen, and Hui Xiong. 2022. STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control. , 2228--2242 pages. https:\/\/doi.org\/10.1109\/TMC.2020.3033782"},{"key":"e_1_3_2_2_27_1","volume-title":"XRouting: Explainable Vehicle Rerouting for Urban Road Congestion Avoidance using Deep Reinforcement Learning","author":"Wang Zheng","year":"2022","unstructured":"Zheng Wang and Shen Wang . Jan 01, 2022. XRouting: Explainable Vehicle Rerouting for Urban Road Congestion Avoidance using Deep Reinforcement Learning . 
The Institute of Electrical and Electronics Engineers, Inc. ( IEEE ), Piscataway. https:\/\/doi.org\/10.1109\/ISC255366. 2022 .9922404 10.1109\/ISC255366.2022.9922404 Zheng Wang and Shen Wang. Jan 01, 2022. XRouting: Explainable Vehicle Rerouting for Urban Road Congestion Avoidance using Deep Reinforcement Learning. The Institute of Electrical and Electronics Engineers, Inc. (IEEE), Piscataway. https:\/\/doi.org\/10.1109\/ISC255366.2022.9922404"},{"key":"e_1_3_2_2_28_1","unstructured":"F. V. Webster. 1958. Traffic Signal Settings.  F. V. Webster. 1958. Traffic Signal Settings."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330949"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357902"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447556.3447565"},{"key":"e_1_3_2_2_32_1","volume-title":"Flow: Architecture and benchmarking for reinforcement learning in traffic control. arXiv preprint arXiv:1710.05465","author":"Wu Cathy","year":"2017","unstructured":"Cathy Wu , Aboudy Kreidieh , Kanaad Parvate , Eugene Vinitsky , and Alexandre M Bayen . 2017 . Flow: Architecture and benchmarking for reinforcement learning in traffic control. arXiv preprint arXiv:1710.05465 , Vol. 10 (2017). Cathy Wu, Aboudy Kreidieh, Kanaad Parvate, Eugene Vinitsky, and Alexandre M Bayen. 2017. Flow: Architecture and benchmarking for reinforcement learning in traffic control. arXiv preprint arXiv:1710.05465 , Vol. 
10 (2017)."},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2019.00180"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC48978.2021.9565000"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467388"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3539416"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01138"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC55140.2022.9922535"}],"event":{"name":"KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Long Beach CA USA","acronym":"KDD '23","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"]},"container-title":["Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580305.3599839","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3580305.3599839","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:23Z","timestamp":1750182563000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580305.3599839"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,4]]},"references-count":38,"alternative-id":["10.1145\/3580305.3599839","10.1145\/3580305"],"URL":"https:\/\/doi.org\/10.1145\/3580305.3599839","relation":{},"subject":[],"published":{"date-parts":[[2023,8,4]]},"assertion":[{"value":"2023-08-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}