{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T13:53:51Z","timestamp":1774965231760,"version":"3.50.1"},"reference-count":82,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,6,16]],"date-time":"2021-06-16T00:00:00Z","timestamp":1623801600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,6,16]],"date-time":"2021-06-16T00:00:00Z","timestamp":1623801600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"The authors are grateful to CAPES, CNPq\/INERGE, FAPEMIG, UFSJ and UFRB"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The traveling salesman problem (TSP) is one of the best-known combinatorial optimization problems. Many methods derived from TSP have been applied to study autonomous vehicle route planning with fuel constraints. Nevertheless, less attention has been paid to reinforcement learning (RL) as a potential method to solve refueling problems. This paper employs RL to solve the traveling salesman problem With refueling (TSPWR). The technique proposes a model (actions, states, reinforcements) and RL-TSPWR algorithm. Focus is given on the analysis of RL parameters and on the refueling influence in route learning optimization of fuel cost. Two RL algorithms: Q-learning and SARSA are compared. In addition, RL parameter estimation is performed by Response Surface Methodology, Analysis of Variance and Tukey Test. 
The proposed method achieves the best solution in 15 out of 16 case studies.<\/jats:p>","DOI":"10.1007\/s40747-021-00444-4","type":"journal-article","created":{"date-parts":[[2021,6,16]],"date-time":"2021-06-16T21:02:43Z","timestamp":1623877363000},"page":"2001-2015","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":51,"title":["Reinforcement learning for the traveling salesman problem with refueling"],"prefix":"10.1007","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2136-9870","authenticated-orcid":false,"given":"Andr\u00e9 L. C.","family":"Ottoni","sequence":"first","affiliation":[]},{"given":"Erivelton G.","family":"Nepomuceno","sequence":"additional","affiliation":[]},{"given":"Marcos S. de","family":"Oliveira","sequence":"additional","affiliation":[]},{"given":"Daniela C. R. de","family":"Oliveira","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,6,16]]},"reference":[{"issue":"2","key":"444_CR1","doi-asserted-by":"publisher","first-page":"107","DOI":"10.3233\/MGS-150232","volume":"11","author":"MM Alipour","year":"2015","unstructured":"Alipour MM, Razavi SN (2015) A new multiagent reinforcement learning algorithm to solve the symmetric traveling salesman problem. Multiagent Grid Syst 11(2):107\u2013119","journal-title":"Multiagent Grid Syst"},{"issue":"9","key":"444_CR2","doi-asserted-by":"publisher","first-page":"2935","DOI":"10.1007\/s00521-017-2880-4","volume":"30","author":"MM Alipour","year":"2018","unstructured":"Alipour MM, Razavi SN, Derakhshi MRF, Balafar MA (2018) A hybrid algorithm using a genetic algorithm and multiagent reinforcement learning heuristic to solve the traveling salesman problem. 
Neural Comput Appl 30(9):2935\u20132951","journal-title":"Neural Comput Appl"},{"key":"444_CR3","volume-title":"The traveling salesman problem: a computational study","author":"D Applegate","year":"2011","unstructured":"Applegate D, Bixby R, Chv\u00e1tal V, Cook W (2011) The traveling salesman problem: a computational study. Princeton University Press, Princeton"},{"key":"444_CR4","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1016\/j.cie.2016.10.022","volume":"112","author":"A Arin","year":"2017","unstructured":"Arin A, Rabadi G (2017) Integrating estimation of distribution algorithms versus q-learning into meta-raps for solving the 0\u20131 multidimensional knapsack problem. Comp Ind Eng 112:706\u2013720","journal-title":"Comp Ind Eng"},{"issue":"2","key":"444_CR5","first-page":"43","volume":"1","author":"SJ Bal","year":"2014","unstructured":"Bal SJ, Mahalik NP (2014) A simulation study on reinforcement learning for navigation application. Artif Intell Appl 1(2):43\u201353","journal-title":"Artif Intell Appl"},{"key":"444_CR6","doi-asserted-by":"crossref","unstructured":"Barsce JC, Palombarini JA, Mart\u00ednez EC (2017) Towards autonomous reinforcement learning: automatic setting of hyper-parameters using bayesian optimization. In: 2017 XLIII Latin American Computer Conference (CLEI), pp 1\u20139","DOI":"10.1109\/CLEI.2017.8226439"},{"key":"444_CR7","unstructured":"Bello I, Pham H, Le Q, Norouzi M, Bengio S (2019) Neural combinatorial optimization with reinforcement learning. In: 5th International Conference on Learning Representations, ICLR 2017\u2014Workshop Track Proceedings (cited By 5)"},{"issue":"2","key":"444_CR8","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1007\/s10846-017-0731-2","volume":"91","author":"RA Bianchi","year":"2018","unstructured":"Bianchi RA, Santos PE, Da Silva IJ, Celiberto LA, de Mantaras RL (2018) Heuristically accelerated reinforcement learning by means of case-based reasoning and transfer learning. 
J Intell Robot Syst 91(2):301\u2013312","journal-title":"J Intell Robot Syst"},{"key":"444_CR9","unstructured":"Bianchi RAC, Ribeiro CHC, Costa AHR (2009) On the relation between ant colony optimization and heuristically accelerated reinforcement learning. In: 1st International Workshop on Hybrid Control of Autonomous System, pp 49\u201355"},{"issue":"2","key":"444_CR10","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1016\/0305-0548(83)90030-8","volume":"10","author":"L Bodin","year":"1983","unstructured":"Bodin L, Golden B, Assad A, Ball M (1983) Routing and scheduling of vehicles and crews\u2014the state of the art. Comp Oper Res 10(2):63\u2013211","journal-title":"Comp Oper Res"},{"issue":"3","key":"444_CR11","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1007\/s40747-020-00167-y","volume":"6","author":"G Budak","year":"2020","unstructured":"Budak G, Chen X (2020) Evaluation of the size of time windows for the travelling salesman problem in delivery operations. Complex Intell Syst 6(3):681\u2013695","journal-title":"Complex Intell Syst"},{"issue":"2","key":"444_CR12","doi-asserted-by":"publisher","first-page":"2007","DOI":"10.1109\/LRA.2019.2899918","volume":"4","author":"H-TL Chiang","year":"2019","unstructured":"Chiang H-TL, Faust A, Fiser M, Francis A (2019) Learning navigation behaviors end-to-end with autorl. IEEE Robot Autom Lett 4(2):2007\u20132014","journal-title":"IEEE Robot Autom Lett"},{"issue":"10","key":"444_CR13","doi-asserted-by":"publisher","first-page":"4351","DOI":"10.1109\/TLA.2016.7786315","volume":"14","author":"ML Costa","year":"2016","unstructured":"Costa ML, Padilha CAA, Melo JD, Neto ADD (2016) Hierarchical reinforcement learning and parallel computing applied to the k-server problem. 
IEEE Latin Am Trans 14(10):4351\u20134357","journal-title":"IEEE Latin Am Trans"},{"key":"444_CR14","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1007\/978-3-030-14347-3_34","volume-title":"Hybrid intelligent systems","author":"B Cunha","year":"2020","unstructured":"Cunha B, Madureira AM, Fonseca B, Coelho D (2020) Deep reinforcement learning as a job shop scheduling solver: a literature review. In: Madureira AM, Abraham A, Gandhi N, Varela ML (eds) Hybrid intelligent systems. Springer International Publishing, Cham, pp 350\u2013359"},{"issue":"3\u20134","key":"444_CR15","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1007\/s10846-014-0171-1","volume":"80","author":"J Cunha","year":"2015","unstructured":"Cunha J, Serra R, Lau N, Lopes L, Neves A (2015) Batch reinforcement learning for robotic soccer using the q-batch update-rule. J Intell Robot Syst Theory Appl 80(3\u20134):385\u2013399 cited by 4","journal-title":"J Intell Robot Syst Theory Appl"},{"issue":"1","key":"444_CR16","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1109\/4235.585892","volume":"1","author":"M Dorigo","year":"1997","unstructured":"Dorigo M, Gambardella LM (1997) Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evol Comput 1(1):53\u201366","journal-title":"IEEE Trans Evol Comput"},{"key":"444_CR17","first-page":"1","volume":"5","author":"E Even-Dar","year":"2003","unstructured":"Even-Dar E, Mansour Y (2003) Learning rates for Q-learning. J Mach Learn Res 5:1\u201325","journal-title":"J Mach Learn Res"},{"key":"444_CR18","doi-asserted-by":"crossref","unstructured":"Gambardella LM, Dorigo M (1995) Ant-Q: a reinforcement learning approach to the traveling salesman problem. In: Proceedings of the 12th International Conference on Machine Learning, pp 252\u2013260","DOI":"10.1016\/B978-1-55860-377-6.50039-6"},{"key":"444_CR19","doi-asserted-by":"crossref","unstructured":"Giardini G, Kalm\u00e1r-Nagy T (2011). 
Genetic algorithm for combinatorial path planning: the subtour problem. Math Probl Eng 2011","DOI":"10.1155\/2011\/483643"},{"issue":"2","key":"444_CR20","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1007\/s11370-017-0217-x","volume":"10","author":"S Haghzad Klidbary","year":"2017","unstructured":"Haghzad Klidbary S, Bagheri Shouraki S, Sheikhpour Kourabbaslou S (2017) Path planning of modular robots on various terrains using q-learning versus optimization algorithms. Intell Serv Robot 10(2):121\u2013136","journal-title":"Intell Serv Robot"},{"key":"444_CR21","doi-asserted-by":"crossref","unstructured":"Hamzehi S, Bogenberger K, Franeck P, Kaltenh\u00e4user B (2019) Combinatorial reinforcement learning of linear assignment problems. In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp 3314\u20133321","DOI":"10.1109\/ITSC.2019.8916920"},{"key":"444_CR22","doi-asserted-by":"publisher","first-page":"106244","DOI":"10.1016\/j.knosys.2020.106244","volume":"204","author":"Y Hu","year":"2020","unstructured":"Hu Y, Yao Y, Lee W (2020) A reinforcement learning approach for optimizing multiple traveling salesman problems over graphs. Knowl-Based Syst 204:106244","journal-title":"Knowl-Based Syst"},{"key":"444_CR23","unstructured":"Hutter F, Hoos H, Leyton-Brown K (2014) An efficient approach for assessing hyperparameter importance. In: Proceedings of International Conference on Machine Learning 2014 (ICML 2014), pp 754\u2013762"},{"key":"444_CR24","doi-asserted-by":"crossref","unstructured":"Hutter F, Kotthoff L, Vanschoren J, editors (2019) Automated machine learning: methods, systems, challenges. Springer. 
In press, http:\/\/automl.org\/book","DOI":"10.1007\/978-3-030-05318-5"},{"issue":"8","key":"444_CR25","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1080\/15568318.2017.1416505","volume":"12","author":"I-J Jeong","year":"2018","unstructured":"Jeong I-J, Illades Boy C (2018) Routing and refueling plans to minimize travel time in alternative-fuel vehicles. Int J Sustain Transp 12(8):583\u2013591","journal-title":"Int J Sustain Transp"},{"key":"444_CR26","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1613\/jair.301","volume":"4","author":"L Kaelbling","year":"1996","unstructured":"Kaelbling L, Littman M, Moore A (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237\u2013285","journal-title":"J Artif Intell Res"},{"key":"444_CR27","doi-asserted-by":"crossref","unstructured":"Khuller S, Malekian A, Mestre J (2007) To fill or not to fill: the gas station problem. In: European Symposium on Algorithms. Springer, pp 534\u2013545","DOI":"10.1007\/978-3-540-75520-3_48"},{"issue":"11","key":"444_CR28","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1177\/0278364913495721","volume":"32","author":"J Kober","year":"2013","unstructured":"Kober J, Bagnell JA, Peters J (2013) Reinforcement learning in robotics: a survey. Int J Robot Res 32(11):1238\u20131274","journal-title":"Int J Robot Res"},{"issue":"5","key":"444_CR29","doi-asserted-by":"publisher","first-page":"1141","DOI":"10.1109\/TSMCA.2012.2227719","volume":"43","author":"A Konar","year":"2013","unstructured":"Konar A, Chakraborty IG, Singh SJ, Jain LC, Nagar AK (2013) A deterministic improved q-learning for path planning of a mobile robot. 
IEEE Trans Syst Man Cybern Syst 43(5):1141\u20131153","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"issue":"3","key":"444_CR30","doi-asserted-by":"publisher","first-page":"122","DOI":"10.3390\/robotics2030122","volume":"2","author":"P Kormushev","year":"2013","unstructured":"Kormushev P, Calinon S, Caldwell D (2013) Reinforcement learning in robotics: applications and real-world challenges. Robotics 2(3):122\u2013148 cited By 50","journal-title":"Robotics"},{"key":"444_CR31","doi-asserted-by":"publisher","first-page":"225945","DOI":"10.1109\/ACCESS.2020.3045027","volume":"8","author":"PT Kyaw","year":"2020","unstructured":"Kyaw PT, Paing A, Thu TT, Mohan RE, Le AV, Veerajagadheswar P (2020) Coverage path planning for decomposition reconfigurable grid-maps using deep reinforcement learning based travelling salesman problem. IEEE Access 8:225945\u2013225956","journal-title":"IEEE Access"},{"issue":"2","key":"444_CR32","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1016\/0377-2217(92)90138-Y","volume":"59","author":"G Laporte","year":"1992","unstructured":"Laporte G (1992) The traveling salesman problem: an overview of exact and approximate algorithms. Eur J Oper Res 59(2):231\u2013247 cited By 484","journal-title":"Eur J Oper Res"},{"issue":"2","key":"444_CR33","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1023\/A:1006529012972","volume":"13","author":"P Larra\u00f1aga","year":"1999","unstructured":"Larra\u00f1aga P, Kuijpers C, Murga R, Inza I, Dizdarevic S (1999) Genetic algorithms for the travelling salesman problem: a review of representations and operators. Artif Intell Rev 13(2):129\u2013170","journal-title":"Artif Intell Rev"},{"issue":"7","key":"444_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v032.i07","volume":"32","author":"RV Lenth","year":"2009","unstructured":"Lenth RV (2009) Response-surface methods in R, using RSM. 
J Stat Softw 32(7):1\u201317","journal-title":"J Stat Softw"},{"key":"444_CR35","doi-asserted-by":"crossref","unstructured":"Levy D, Sundar K, Rathinam S (2014) Heuristics for routing heterogeneous unmanned vehicles with fuel constraints. Math Probl Eng 2014","DOI":"10.1155\/2014\/131450"},{"issue":"2","key":"444_CR36","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1016\/j.asr.2020.03.049","volume":"66","author":"C Li","year":"2020","unstructured":"Li C, Xu B (2020) Optimal scheduling of multiple sun-synchronous orbit satellites refueling. Adv Space Res 66(2):345\u2013358","journal-title":"Adv Space Res"},{"issue":"2","key":"444_CR37","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1109\/MCI.2019.2901089","volume":"14","author":"D Li","year":"2019","unstructured":"Li D, Zhao D, Zhang Q, Chen Y (2019) Reinforcement learning and deep learning based lateral control for autonomous driving [application notes]. IEEE Comput Intell Mag 14(2):83\u201398","journal-title":"IEEE Comput Intell Mag"},{"issue":"11","key":"444_CR38","doi-asserted-by":"publisher","first-page":"2390","DOI":"10.1109\/TCYB.2014.2371918","volume":"45","author":"J Li","year":"2015","unstructured":"Li J, Zhou M, Sun Q, Dai X, Yu X (2015) Colored traveling salesman problem. IEEE Trans Cybern 45(11):2390\u20132401","journal-title":"IEEE Trans Cybern"},{"key":"444_CR39","doi-asserted-by":"crossref","unstructured":"Li S, Xu X, Zuo L (2015) Dynamic path planning of a mobile robot with improved q-learning algorithm. In: Information and Automation, 2015 IEEE International Conference on, pp 409\u2013414. IEEE","DOI":"10.1109\/ICInfA.2015.7279322"},{"key":"444_CR40","doi-asserted-by":"crossref","unstructured":"Liessner R, Schmitt J, Dietermann A, B\u00e4ker B (2019) Hyperparameter optimization for deep reinforcement learning in vehicle energy management. 
In: 11th International Conference on Agents and Artificial Intelligence (ICAART 2019)","DOI":"10.5220\/0007364701340144"},{"key":"444_CR41","first-page":"213","volume-title":"Traveling salesman problem, theory and applications, chapter hybrid metaheuristics using reinforcement learning applied to salesman traveling problem","author":"FC Lima-J\u00fanior","year":"2010","unstructured":"Lima-J\u00fanior FC, Neto ADD, Melo JD (2010) Traveling salesman problem, theory and applications, chapter hybrid metaheuristics using reinforcement learning applied to salesman traveling problem. InTech, London, pp 213\u2013236"},{"key":"444_CR42","doi-asserted-by":"crossref","unstructured":"Lin SH (2008) Finding optimal refueling policies in transportation networks. Algorithmic Aspects in Information and Management, Finding Optimal Refueling Policies in Transportation Networks 5034:280\u2013291","DOI":"10.1007\/978-3-540-68880-8_27"},{"issue":"3","key":"444_CR43","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1016\/j.orl.2006.05.003","volume":"35","author":"SH Lin","year":"2007","unstructured":"Lin SH, Gertsch N, Russell J (2007) A linear-time algorithm for finding optimal vehicle refueling policies. Oper Res Lett 35(3):290\u2013296","journal-title":"Oper Res Lett"},{"key":"444_CR44","doi-asserted-by":"publisher","first-page":"212","DOI":"10.1016\/j.eswa.2019.06.015","volume":"135","author":"RAS Lins","year":"2019","unstructured":"Lins RAS, D\u00f3ria ADN, de Melo JD (2019) Deep reinforcement learning applied to the k-server problem. Expert Syst Appl 135:212\u2013218","journal-title":"Expert Syst Appl"},{"issue":"3","key":"444_CR45","doi-asserted-by":"publisher","first-page":"6995","DOI":"10.1016\/j.eswa.2008.08.026","volume":"36","author":"F Liu","year":"2009","unstructured":"Liu F, Zeng G (2009) Study of genetic algorithm with reinforcement learning to solve the TSP. 
Expert Syst Appl 36(3):6995\u20137001","journal-title":"Expert Syst Appl"},{"key":"444_CR46","first-page":"718","volume-title":"Kolmogorov\u2013Smirnov test","author":"RHC Lopes","year":"2011","unstructured":"Lopes RHC (2011) Kolmogorov\u2013Smirnov test. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 718\u2013720"},{"key":"444_CR47","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1016\/j.robot.2019.02.013","volume":"115","author":"ES Low","year":"2019","unstructured":"Low ES, Ong P, Cheah KC (2019) Solving the optimal path planning of a mobile robot using improved q-learning. Robot Auton Syst 115:143\u2013161","journal-title":"Robot Auton Syst"},{"issue":"12","key":"444_CR48","doi-asserted-by":"publisher","first-page":"1781","DOI":"10.1017\/S0263574718000735","volume":"36","author":"DG Macharet","year":"2018","unstructured":"Macharet DG, Campos MFM (2018) A survey on routing problems and robotic systems. Robotica 36(12):1781\u20131803","journal-title":"Robotica"},{"key":"444_CR49","volume-title":"Design and analysis of experiments","author":"DC Montgomery","year":"2017","unstructured":"Montgomery DC (2017) Design and analysis of experiments, 9th edn. Wiley, New York","edition":"9"},{"key":"444_CR50","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1016\/j.trc.2015.03.005","volume":"54","author":"C Murray","year":"2015","unstructured":"Murray C, Chu A (2015) The flying sidekick traveling salesman problem: optimization of drone-assisted parcel delivery. Transp Res Part C: Emerg Technol 54:86\u2013109","journal-title":"Transp Res Part C: Emerg Technol"},{"key":"444_CR51","volume-title":"Response surface methodology: process and product optimization using designed experiments","author":"R\u00a0H Myers","year":"2009","unstructured":"Myers R\u00a0H, Montgomery D\u00a0C, Anderson-Cook C\u00a0M (2009) Response surface methodology: process and product optimization using designed experiments, 3rd edn. 
Wiley, London","edition":"3"},{"issue":"3","key":"444_CR52","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1007\/s40313-018-0374-y","volume":"29","author":"ALC Ottoni","year":"2018","unstructured":"Ottoni ALC, Nepomuceno EG, de Oliveira MS (2018) A response surface model approach to parameter estimation of reinforcement learning for the travelling salesman problem. J Control Autom Electr Syst 29(3):350\u2013359","journal-title":"J Control Autom Electr Syst"},{"issue":"01","key":"444_CR53","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1109\/TLA.2020.9049466","volume":"18","author":"ALC Ottoni","year":"2020","unstructured":"Ottoni ALC, Nepomuceno EG, de Oliveira MS (2020) Development of a pedagogical graphical interface for the reinforcement learning. IEEE Latin Am Trans 18(01):92\u2013101","journal-title":"IEEE Latin Am Trans"},{"issue":"6","key":"444_CR54","doi-asserted-by":"publisher","first-page":"4441","DOI":"10.1007\/s00500-019-04206-w","volume":"24","author":"ALC Ottoni","year":"2020","unstructured":"Ottoni ALC, Nepomuceno EG, de Oliveira MS, de Oliveira DCR (2020) Tuning of reinforcement learning parameters applied to sop using the Scott-Knott method. Soft Comp 24(6):4441\u20134453","journal-title":"Soft Comp"},{"issue":"7\u20138","key":"444_CR55","doi-asserted-by":"publisher","first-page":"1659","DOI":"10.1007\/s00521-013-1402-2","volume":"24","author":"A Ouaarab","year":"2014","unstructured":"Ouaarab A, Ahiod B, Yang X-S (2014) Discrete cuckoo search algorithm for the travelling salesman problem. Neural Comp Appl 24(7\u20138):1659\u20131669","journal-title":"Neural Comp Appl"},{"key":"444_CR56","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1016\/j.ipl.2017.11.009","volume":"131","author":"K Papadopoulos","year":"2018","unstructured":"Papadopoulos K, Christofides D (2018) A fast algorithm for the gas station problem. 
Inform Process Lett 131:55\u201359","journal-title":"Inform Process Lett"},{"key":"444_CR57","doi-asserted-by":"crossref","unstructured":"Polychronis G, Lalis S (2019) Dynamic vehicle routing under uncertain travel costs and refueling opportunities. In: Proceedings of the 5th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS 2019), pp 52\u201363","DOI":"10.5220\/0007673900002179"},{"key":"444_CR58","volume-title":"R: a language and environment for statistical computing","author":"R Core Team","year":"2018","unstructured":"R Core Team (2018) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna"},{"issue":"4","key":"444_CR59","doi-asserted-by":"publisher","first-page":"814","DOI":"10.1109\/TSMCA.2012.2226024","volume":"43","author":"P Rakshit","year":"2013","unstructured":"Rakshit P, Konar A, Bhowmik P, Goswami I, Das S, Jain LC, Nagar AK (2013) Realization of an adaptive memetic algorithm using differential evolution and q-learning: a case study in multirobot path planning. IEEE Trans Syst Man Cybern Syst 43(4):814\u2013831","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"issue":"4","key":"444_CR60","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1590\/S2238-10312013000400002","volume":"7","author":"AD Rodrigues Junior","year":"2013","unstructured":"Rodrigues Junior AD, Cruz MMC (2013) A generic decision model of refueling policies: a case study of a Brazilian motor carrier. J Transp Lit 7(4):8\u201322","journal-title":"J Transp Lit"},{"key":"444_CR61","unstructured":"Russell SJ, Norvig P (2013) Artificial intelligence. Campus, 3rd ed"},{"issue":"2","key":"444_CR62","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1287\/trsc.2018.0836","volume":"53","author":"M Schiffer","year":"2019","unstructured":"Schiffer M, Schneider M, Walther G, Laporte G (2019) Vehicle routing and location routing with intermediate stops: a review. 
Transp Sci 53(2):319\u2013343","journal-title":"Transp Sci"},{"issue":"1","key":"444_CR63","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1016\/S0893-6080(02)00228-9","volume":"16","author":"N Schweighofer","year":"2003","unstructured":"Schweighofer N, Doya K (2003) Meta-learning in reinforcement learning. Neural Netw 16(1):5\u20139","journal-title":"Neural Netw"},{"key":"444_CR64","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1016\/j.eswa.2019.04.056","volume":"131","author":"MAL Silva","year":"2019","unstructured":"Silva MAL, de Souza SR, Souza MJF, Bazzan ALC (2019) A reinforcement learning-based multi-agent framework applied for solving routing and scheduling problems. Expert Syst Appl 131:148\u2013171","journal-title":"Expert Syst Appl"},{"issue":"4","key":"444_CR65","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1016\/j.robot.2007.09.011","volume":"56","author":"A Sipahioglu","year":"2008","unstructured":"Sipahioglu A, Yazici A, Parlaktuna O, Gurel U (2008) Real-time tour construction for a mobile robot in a dynamic environment. Robot Auton Syst 56(4):289\u2013295","journal-title":"Robot Auton Syst"},{"key":"444_CR66","unstructured":"Sun R, Tatsumi S, Zhao G (2001) Multiagent reinforcement learning method with an improved ant colony system. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics 3:1612\u20131617"},{"issue":"1","key":"444_CR67","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1109\/TASE.2013.2279544","volume":"11","author":"K Sundar","year":"2014","unstructured":"Sundar K, Rathinam S (2014) Algorithms for routing an unmanned aerial vehicle in the presence of refueling depots. 
IEEE Trans Autom Sci Eng 11(1):287\u2013294","journal-title":"IEEE Trans Autom Sci Eng"},{"key":"444_CR68","volume-title":"Reinforcement learning: an introduction","author":"R Sutton","year":"2018","unstructured":"Sutton R, Barto A (2018) Reinforcement learning: an introduction, 2nd edn. MIT Press, Cambridge","edition":"2"},{"issue":"8","key":"444_CR69","doi-asserted-by":"publisher","first-page":"737","DOI":"10.1002\/nav.20317","volume":"55","author":"Y Suzuki","year":"2008","unstructured":"Suzuki Y (2008) A generic model of motor-carrier fuel optimization. Naval Res Logist 55(8):737\u2013746","journal-title":"Naval Res Logist"},{"issue":"2","key":"444_CR70","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1016\/j.dss.2008.09.005","volume":"46","author":"Y Suzuki","year":"2009","unstructured":"Suzuki Y (2009) A decision support system of dynamic vehicle refueling. Decis Support Syst 46(2):522\u2013531","journal-title":"Decis Support Syst"},{"issue":"1","key":"444_CR71","doi-asserted-by":"publisher","first-page":"758","DOI":"10.1016\/j.dss.2012.09.004","volume":"54","author":"Y Suzuki","year":"2012","unstructured":"Suzuki Y (2012) A decision support system of vehicle routing and refueling for motor carriers with time-sensitive demands. Decis Support Syst 54(1):758\u2013767","journal-title":"Decis Support Syst"},{"key":"444_CR72","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1016\/j.ijpe.2016.03.008","volume":"176","author":"Y Suzuki","year":"2016","unstructured":"Suzuki Y (2016) A dual-objective metaheuristic approach to solve practical pollution routing problem. Int J Prod Econ 176:143\u2013153","journal-title":"Int J Prod Econ"},{"key":"444_CR73","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1016\/j.ijpe.2018.05.007","volume":"202","author":"Y Suzuki","year":"2018","unstructured":"Suzuki Y, Lan B (2018) Cutting fuel consumption of truckload carriers by using new enhanced refueling policies. 
Int J Prod Econ 202:69\u201380","journal-title":"Int J Prod Econ"},{"issue":"3","key":"444_CR74","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF00992698","volume":"8","author":"CJ Watkins","year":"1992","unstructured":"Watkins CJ, Dayan P (1992) Technical note Q-learning. Mach Learn 8(3):279\u2013292","journal-title":"Mach Learn"},{"issue":"10","key":"444_CR75","doi-asserted-by":"publisher","first-page":"4933","DOI":"10.1007\/s12206-018-0941-y","volume":"32","author":"MH Woo","year":"2018","unstructured":"Woo MH, Lee S-H, Cha HM (2018) A study on the optimal route design considering time of mobile robot using recurrent neural network and reinforcement learning. J Mech Sci Technol 32(10):4933\u20134939","journal-title":"J Mech Sci Technol"},{"key":"444_CR76","doi-asserted-by":"crossref","unstructured":"Yan C, Xiang X (2018) A path planning algorithm for UAV based on improved q-learning. In: 2018 2nd International Conference on Robotics and Automation Sciences (ICRAS), pp 1\u20135","DOI":"10.1109\/ICRAS.2018.8443226"},{"key":"444_CR77","doi-asserted-by":"crossref","unstructured":"Yavuz M, \u00c7apar I (2017) Alternative-fuel vehicle adoption in service fleets: Impact evaluation through optimization modeling. Transp Sci 51(2):480\u2013493 cited By 5","DOI":"10.1287\/trsc.2016.0697"},{"issue":"5","key":"444_CR78","doi-asserted-by":"publisher","first-page":"438","DOI":"10.1177\/0278364915595278","volume":"35","author":"C Yoo","year":"2016","unstructured":"Yoo C, Fitch R, Sukkarieh S (2016) Online task planning and control for fuel-constrained aerial robots in wind fields. Int J Robot Res 35(5):438\u2013453","journal-title":"Int J Robot Res"},{"issue":"10","key":"444_CR79","doi-asserted-by":"publisher","first-page":"3806","DOI":"10.1109\/TITS.2019.2909109","volume":"20","author":"JJQ Yu","year":"2019","unstructured":"Yu JJQ, Yu W, Gu J (2019) Online vehicle routing with neural combinatorial optimization and deep reinforcement learning. 
IEEE Trans Intell Transp Syst 20(10):3806\u20133817","journal-title":"IEEE Trans Intell Transp Syst"},{"key":"444_CR80","unstructured":"Yu Z, Jinhai L, Guochang G, Rubo Z, Haiyan Y (2002) An implementation of evolutionary computation for path planning of cooperative mobile robots. In: Intelligent Control and Automation, 2002. Proceedings of the 4th World Congress on, vol 3, pages 1798\u20131802. IEEE"},{"key":"444_CR81","doi-asserted-by":"crossref","unstructured":"Zhang R, Prokhorchuk A, Dauwels J (2020) Deep reinforcement learning for traveling salesman problem with time windows and rejections. In: Proceedings of the International Joint Conference on Neural Networks, pp 1\u20138","DOI":"10.1109\/IJCNN48605.2020.9207026"},{"key":"444_CR82","doi-asserted-by":"crossref","unstructured":"Zhang T-J, Yang Y-K, Wang B-H, Li Z, Shen H-X, Li H-N (2019) Optimal scheduling for location geosynchronous satellites refueling problem. Acta Astronautica","DOI":"10.1016\/j.actaastro.2019.01.024"}],"container-title":["Complex &amp; Intelligent 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00444-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-021-00444-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00444-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,4]],"date-time":"2023-02-04T01:31:33Z","timestamp":1675474293000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-021-00444-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,16]]},"references-count":82,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["444"],"URL":"https:\/\/doi.org\/10.1007\/s40747-021-00444-4","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,16]]},"assertion":[{"value":"1 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 June 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 June 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors listed in this article declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"This article does not contain any studies with human 
participants or animals performed by any of the authors.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}}]}}