{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,8]],"date-time":"2026-07-08T17:38:57Z","timestamp":1783532337720,"version":"3.55.0"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2025,5,22]],"date-time":"2025-05-22T00:00:00Z","timestamp":1747872000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Research Council of Canada\u2019s Artificial Intelligence for Logistics Program"},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>Deep Reinforcement Learning (DRL)-based frameworks, featuring Transformer-style policy networks, have demonstrated their efficacy across various Vehicle Routing Problem (VRP) variants. However, the application of these methods to the Multi-Trip Time-Dependent Vehicle Routing Problem (MTTDVRP) with maximum working hours constraints\u2014a pivotal element of urban logistics\u2014remains largely unexplored. This article introduces a DRL-based method called the Simultaneous Encoder and Dual Decoder Attention Model (SED2AM), tailored for the MTTDVRP with maximum working hours constraints. The proposed method introduces a temporal locality inductive bias to the encoding module of the policy networks, enabling it to effectively account for the time dependency in travel distance\/time. 
The decoding module of SED2AM includes a vehicle selection decoder that selects a vehicle from the fleet, effectively associating trips with vehicles for functional multi-trip routing. Additionally, this decoding module is equipped with a trip construction decoder leveraged for constructing trips for the vehicles. This policy model is equipped with two classes of state representations, fleet state, and routing state, providing the information needed for effective route construction in the presence of maximum working hours constraints. Experimental results using real-world datasets from two major Canadian cities not only show that SED2AM outperforms the current state-of-the-art DRL-based and metaheuristic-based baselines but also demonstrate its generalizability to solve larger scale problems.<\/jats:p>","DOI":"10.1145\/3721983","type":"journal-article","created":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T16:09:27Z","timestamp":1741190967000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["SED2AM: Solving Multi-Trip Time-Dependent Vehicle Routing Problem Using Deep Reinforcement Learning"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9938-3560","authenticated-orcid":false,"given":"Arash","family":"Mozhdehi","sequence":"first","affiliation":[{"name":"University of Calgary, Calgary, Alberta, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2320-954X","authenticated-orcid":false,"given":"Yunli","family":"Wang","sequence":"additional","affiliation":[{"name":"National Research Council Canada, Ottawa, Ontario, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7870-9448","authenticated-orcid":false,"given":"Sun","family":"Sun","sequence":"additional","affiliation":[{"name":"National Research Council Canada, Waterloo, Ontario, 
Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3569-2126","authenticated-orcid":false,"given":"Xin","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Calgary, Calgary, Alberta, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,5,22]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Arash Ahmadian Chris Cremer Matthias Gall\u00e9 Marzieh Fadaee Julia Kreutzer Olivier Pietquin Ahmet \u00dcst\u00fcn and Sara Hooker. 2024. Back to basics: Revisiting reinforce style optimization for learning from human feedback in llms. arXiv:2402.14740. Retrieved from https:\/\/arxiv.org\/abs\/2402.14740","DOI":"10.18653\/v1\/2024.acl-long.662"},{"issue":"2","key":"e_1_3_2_3_2","doi-asserted-by":"crossref","first-page":"92","DOI":"10.21776\/ub.jeest.2017.003.02.4","article-title":"Optimization of multi-trip vehicle routing problem with time windows using genetic algorithm","volume":"3","author":"Anggodo Yusuf Priyo","year":"2017","unstructured":"Yusuf Priyo Anggodo, Amalia Kartika Ariyani, Muhammad Khaerul Ardi, and Wayan Firdaus Mahmudy. 2017. Optimization of multi-trip vehicle routing problem with time windows using genetic algorithm. Journal of Environmental Engineering and Sustainable Technology 3, 2 (2017), 92\u201397.","journal-title":"Journal of Environmental Engineering and Sustainable Technology"},{"key":"e_1_3_2_4_2","unstructured":"Thomas Anthony Zheng Tian and David Barber. 2017. Thinking fast and slow with deep learning and tree search. arXiv:1705.08439. Retrieved from https:\/\/arxiv.org\/abs\/1705.08439"},{"key":"e_1_3_2_5_2","unstructured":"Irwan Bello Hieu Pham Quoc V. Le Mohammad Norouzi and Samy Bengio. 2016. Neural combinatorial optimization with reinforcement learning. arXiv:1611.09940. 
Retrieved from https:\/\/arxiv.org\/abs\/1611.09940"},{"key":"e_1_3_2_6_2","unstructured":"Federico Berto Chuanbo Hua Junyoung Park Laurin Luttmann Yining Ma Fanchen Bu Jiarui Wang Haoran Ye Minsu Kim Sanghyeok Choi et al. 2023. Rl4co: An extensive reinforcement learning for combinatorial optimization benchmark. arXiv:2306.17100. Retrieved from https:\/\/arxiv.org\/abs\/2306.17100"},{"key":"e_1_3_2_7_2","first-page":"4247","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Kumar Bhunia Ayan","year":"2021","unstructured":"Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, and Yi-Zhe Song. 2021. More photos are all you need: Semi-supervised learning for fine-grained sketch based image retrieval. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 4247\u20134256."},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2017.05.004"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1057\/jors.2013.71"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3546952"},{"key":"e_1_3_2_11_2","unstructured":"Rewon Child Scott Gray Alec Radford and Ilya Sutskever. 2019. Generating long sequences with sparse transformers. arXiv:1904.10509. Retrieved from https:\/\/arxiv.org\/abs\/1904.10509"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2006.06.047"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403356"},{"key":"e_1_3_2_14_2","first-page":"399","volume-title":"Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing","author":"Gete Harritxu","year":"2023","unstructured":"Harritxu Gete and Thierry Etchegoyhen. 2023. An evaluation of source factors in concatenation-based context-aware neural machine translation. 
In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 399\u2013407."},{"key":"e_1_3_2_15_2","first-page":"1","volume-title":"Proceedings of the 15th ACM SIGSPATIAL International Workshop on Computational Transportation Science","author":"Han Jihee","year":"2022","unstructured":"Jihee Han, Arash Mozhdehi, Yunli Wang, Sun Sun, and Xin Wang. 2022. Solving a multi-trip VRP with real heterogeneous fleet and time windows based on ant colony optimization: An industrial case study. In Proceedings of the 15th ACM SIGSPATIAL International Workshop on Computational Transportation Science, 1\u20134."},{"key":"e_1_3_2_16_2","series-title":"Proceedings of Machine Learning Research","first-page":"448","volume-title":"Proceedings of the 32nd International Conference on Machine Learning","volume":"37","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning. Francis Bach and David Blei (Eds.), Proceedings of Machine Learning Research, Vol. 37, PMLR, Lille, France, 448\u2013456. Retrieved from https:\/\/proceedings.mlr.press\/v37\/ioffe15.html"},{"issue":"1","key":"e_1_3_2_17_2","doi-asserted-by":"crossref","first-page":"164","DOI":"10.3141\/1771-21","article-title":"Genetic algorithm for the time-dependent vehicle routing problem","volume":"1771","author":"Jung Soojung","year":"2001","unstructured":"Soojung Jung and Ali Haghani. 2001. Genetic algorithm for the time-dependent vehicle routing problem. 
Transportation Research Record 1771, 1 (2001), 164\u2013171.","journal-title":"Transportation Research Record"},{"key":"e_1_3_2_18_2","first-page":"5156","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Katharopoulos Angelos","year":"2020","unstructured":"Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and Fran\u00e7ois Fleuret. 2020. Transformers are rnns: Fast autoregressive transformers with linear attention. In Proceedings of the International Conference on Machine Learning. PMLR, 5156\u20135165."},{"key":"e_1_3_2_19_2","unstructured":"Wouter Kool Herke Van Hoof and Max Welling. 2018. Attention learn to solve routing problems! arXiv:1803.08475. Retrieved from https:\/\/arxiv.org\/abs\/1803.08475"},{"key":"e_1_3_2_20_2","first-page":"345","volume-title":"Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Part XVI","author":"Kwon Heeseung","year":"2020","unstructured":"Heeseung Kwon, Manjin Kim, Suha Kwak, and Minsu Cho. 2020. Motionsqueeze: Neural motion feature learning for video understanding. In Proceedings of the 16th European Conference on Computer Vision (ECCV \u201920), Part XVI. Springer, 345\u2013362."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.08.005"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.124514"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3056120"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2020.2977661"},{"key":"e_1_3_2_25_2","unstructured":"Mohammadreza Nazari Afshin Oroojlooy Lawrence V. Snyder and Martin Tak\u00e1\u010d. 2018. Reinforcement learning for solving the vehicle routing problem. arXiv:1802.04240. 
Retrieved from https:\/\/arxiv.org\/abs\/1802.04240"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2020.09.022"},{"key":"e_1_3_2_27_2","first-page":"7487","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Parisotto Emilio","year":"2020","unstructured":"Emilio Parisotto, Francis Song, Jack Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, et al. 2020. Stabilizing transformers for reinforcement learning. In Proceedings of the International Conference on Machine Learning. PMLR, 7487\u20137498."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.131"},{"key":"e_1_3_2_29_2","unstructured":"Legislative Services. 2021. Consolidated federal laws of Canada Commercial Vehicle Drivers Hours of Service Regulations. Commercial Vehicle Drivers Hours of Service Regulations. Retrieved March 2 2023 from https:\/\/laws-lois.justice.gc.ca\/eng\/regulations\/SOR-2005-313\/"},{"key":"e_1_3_2_30_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez Lukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. arXiv:1706.03762. Retrieved from https:\/\/arxiv.org\/abs\/1706.03762"},{"issue":"3","key":"e_1_3_2_31_2","doi-asserted-by":"crossref","first-page":"4779","DOI":"10.1109\/TNNLS.2024.3371781","article-title":"Deep reinforcement learning for solving vehicle routing problems with backhauls","volume":"36","author":"Wang Conghui","year":"2024","unstructured":"Conghui Wang, Zhiguang Cao, Yaoxin Wu, Long Teng, and Guohua Wu. 2024. Deep reinforcement learning for solving vehicle routing problems with backhauls. IEEE Transactions on Neural Networks and Learning Systems 36, 3 (2024), 4779\u20134793.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_2_32_2","unstructured":"Sinong Wang Belinda Z. 
Li Madian Khabsa Han Fang and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv:2006.04768. Retrieved from https:\/\/arxiv.org\/abs\/2006.04768"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992696"},{"issue":"10","key":"e_1_3_2_34_2","first-page":"11107","article-title":"Reinforcement learning with multiple relational attention for solving vehicle routing problems","volume":"52","author":"Xu Yunqiu","year":"2021","unstructured":"Yunqiu Xu, Meng Fang, Ling Chen, Gangyan Xu, Yali Du, and Chengqi Zhang. 2021. Reinforcement learning with multiple relational attention for solving vehicle routing problems. IEEE Transactions on Cybernetics 52, 10 (2021), 11107\u201311120.","journal-title":"IEEE Transactions on Cybernetics"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17300"},{"key":"e_1_3_2_36_2","first-page":"1","volume-title":"Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN)","author":"Zhang Yongxin","year":"2022","unstructured":"Yongxin Zhang, Jiahai Wang, and Zizhen Zhang. 2022. Edge-based formulation with graph attention network for practical vehicle routing problem with time windows. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 1\u20138."},{"issue":"10","key":"e_1_3_2_37_2","doi-asserted-by":"crossref","first-page":"7978","DOI":"10.1109\/TNNLS.2022.3148435","article-title":"Meta-learning-based deep reinforcement learning for multiobjective optimization problems","volume":"34","author":"Zhang Zizhen","year":"2022","unstructured":"Zizhen Zhang, Zhiyuan Wu, Hang Zhang, and Jiahai Wang. 2022. Meta-learning-based deep reinforcement learning for multiobjective optimization problems. 
IEEE Transactions on Neural Networks and Learning Systems 34, 10 (2022), 7978\u20137991.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2024.04.011"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3625232"},{"key":"e_1_3_2_40_2","first-page":"9980","volume-title":"Proceedings of the 36th AAAI Conference on Artificial Intelligence","author":"Zong Zefang","year":"2022","unstructured":"Zefang Zong, Meng Zheng, Yong Li, and Depeng Jin. 2022. Mapdp: Cooperative multi-agent reinforcement learning to solve pickup and delivery problems. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, 9980\u20139988."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3721983","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3721983","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:09:48Z","timestamp":1750295388000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3721983"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,22]]},"references-count":39,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3721983"],"URL":"https:\/\/doi.org\/10.1145\/3721983","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,22]]},"assertion":[{"value":"2024-03-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2025-02-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}