{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T08:57:28Z","timestamp":1772701048830,"version":"3.50.1"},"reference-count":45,"publisher":"American Institute of Aeronautics and Astronautics (AIAA)","issue":"3","funder":[{"DOI":"10.13039\/501100007694","name":"Korea Agency for Infrastructure Technology Advancement","doi-asserted-by":"publisher","award":["RS-2023-00262688"],"award-info":[{"award-number":["RS-2023-00262688"]}],"id":[{"id":"10.13039\/501100007694","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["arc.aiaa.org"],"crossmark-restriction":true},"short-container-title":["Journal of Aerospace Information Systems"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:p>This study addresses a drone delivery routing optimization problem with battery constraints, a variant of a vehicle routing problem (VRP), where drones deliver goods to multiple customer nodes while adhering to operational limitations such as battery and payload capacities. Our study proposes Adaptive-Policy Optimization with Multiple Optima (A-POMO), an enhanced framework based on POMO, which is a reinforcement-learning-based method for solving combinatorial optimization problems like VRP without relying on labeled data. A-POMO for battery-constrained drone delivery routing improves solution quality and computational efficiency through a constraint-guided probability distribution and an adaptive elite reward mechanism. Especially, A-POMO adjusts reward weights adaptively and employs a refined baseline for elite solutions to strike a balance between exploration and exploitation. Numerical experiments with 20, 50, and 100 customer nodes demonstrate that A-POMO achieves near-optimal solutions compared to MILP, with significant time efficiency for small-scale instances. It also outperforms the original POMO in large-scale instances for constraint-guided route optimization.<\/jats:p>","DOI":"10.2514\/1.i011709","type":"journal-article","created":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T06:44:31Z","timestamp":1763534671000},"page":"257-270","update-policy":"https:\/\/doi.org\/10.2514\/aiaa_crossmarkpolicy","source":"Crossref","is-referenced-by-count":0,"title":["Adaptive Policy Optimization for Battery-Constrained Drone Delivery Routing"],"prefix":"10.2514","volume":"23","author":[{"given":"Gangsan","family":"Kim","sequence":"first","affiliation":[{"name":"Korea Aerospace University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0145-5985","authenticated-orcid":false,"given":"Sang Hyun","family":"Kim","sequence":"additional","affiliation":[{"name":"Korea Aerospace University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1387","reference":[{"key":"r1","doi-asserted-by":"publisher","DOI":"10.1016\/j.trd.2023.103831"},{"key":"r2","doi-asserted-by":"publisher","DOI":"10.1016\/j.techsoc.2016.02.009"},{"key":"r3","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-72322-4_200-1"},{"key":"r4","doi-asserted-by":"publisher","DOI":"10.2514\/6.2022-4028"},{"key":"r5","doi-asserted-by":"publisher","DOI":"10.1016\/j.cie.2018.02.005"},{"key":"r6","doi-asserted-by":"publisher","DOI":"10.3390\/s20020515"},{"key":"r7","doi-asserted-by":"publisher","DOI":"10.1109\/TASE.2022.3213254"},{"key":"r8","doi-asserted-by":"publisher","DOI":"10.1016\/j.cor.2022.106112"},{"key":"r9","doi-asserted-by":"publisher","DOI":"10.1109\/TASE.2022.3175565"},{"key":"r10","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.2016.2582745"},{"key":"r11","doi-asserted-by":"publisher","DOI":"10.1007\/s10846-019-01034-w"},{"key":"r12","doi-asserted-by":"publisher","DOI":"10.1111\/itor.12783"},{"key":"r13","doi-asserted-by":"publisher","DOI":"10.1016\/S0305-0548(02)00051-5"},{"key":"r14","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2004.07.001"},{"key":"r15","unstructured":"BelloI.PhamH.LeQ. V.NorouziM.BengioS. \u201cNeural Combinatorial Optimization with Reinforcement Learning,\u201d Proceedings of the International Conference on Learning Representations (ICLR), 2017. 10.48550\/arXiv.1611.09940"},{"key":"r16","unstructured":"KwonY.D.ChooJ.KimB.YoonI.GwonY.MinS. \u201cPOMO: Policy Optimization with Multiple Optima for Reinforcement Learning,\u201d Advances in Neural Information Processing Systems, 2020. 10.48550\/arXiv.2010.16011"},{"key":"r17","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973594.fm"},{"key":"r18","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-005-0644-x"},{"key":"r19","doi-asserted-by":"publisher","DOI":"10.3390\/sym13101923"},{"key":"r20","doi-asserted-by":"publisher","DOI":"10.1016\/j.trb.2023.03.003"},{"key":"r21","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.116264"},{"key":"r22","doi-asserted-by":"publisher","DOI":"10.1287\/opre.2019.1874"},{"key":"r23","doi-asserted-by":"publisher","DOI":"10.1016\/j.cie.2024.110330"},{"key":"r24","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898718515.ch9"},{"key":"r25","doi-asserted-by":"publisher","DOI":"10.1111\/itor.12796"},{"key":"r26","doi-asserted-by":"publisher","DOI":"10.1016\/j.cie.2023.109340"},{"key":"r27","doi-asserted-by":"publisher","DOI":"10.1016\/j.cie.2022.108389"},{"key":"r28","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-77778-8_1"},{"key":"r29","doi-asserted-by":"publisher","DOI":"10.1016\/0305-0548(84)90007-8"},{"key":"r30","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2022.12.011"},{"key":"r31","unstructured":"ChoiY.SchonfeldP. M. \u201cOptimization of Multi-Package Drone Deliveries Considering Battery Capacity,\u201d Proceedings of the 96th Annual Meeting of the Transportation Research Board, Transportation Research Board, Washington, D.C., 2017, pp.\u00a08\u201312."},{"key":"r32","doi-asserted-by":"publisher","DOI":"10.1007\/PL00011424"},{"key":"r33","doi-asserted-by":"publisher","DOI":"10.1002\/net.3230110211"},{"key":"r34","unstructured":"PaulusA.Rol\u00ednekM.MusilV.AmosB.MartiusG. \u201cComboptnet: Fit the Right Np-Hard Problem by Learning Integer Programming Constraints,\u201d Proceedings of the 38th International Conference on Machine Learning, Vol.\u00a0139, Journal of Machine Learning Research, Inc., Cambridge, MA, 2021, pp.\u00a08443\u20138453. 10.48550\/arXiv.2105.02343"},{"key":"r35","doi-asserted-by":"publisher","DOI":"10.1016\/j.cie.2023.109179"},{"key":"r36","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2021.3049635"},{"key":"r37","doi-asserted-by":"crossref","unstructured":"VieA.KleinnijenhuisA. M.FarmerD. J. \u201cQualities, Challenges and Future of Genetic Algorithms,\u201d Available at SSRN 3726035, Social Science Research Network, 2020. 10.2139\/ssrn.3726035","DOI":"10.2139\/ssrn.3726035"},{"key":"r38","doi-asserted-by":"publisher","DOI":"10.1016\/j.apm.2011.08.010"},{"key":"r39","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-022-03555-8"},{"key":"r40","unstructured":"VinyalsO.FortunatoM.JaitlyN. \u201cPointer Networks,\u201d Advances in Neural Information Processing Systems, Vol.\u00a028, 2015. 10.48550\/arXiv.1506.03134"},{"key":"r41","unstructured":"MaY.LiJ.CaoZ.SongW.ZhangL.ChenZ.TangJ. \u201cLearning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer,\u201d Advances in Neural Information Processing Systems, Vol.\u00a034, Curran Assoc., Inc., Red Hook, NY, 2021, pp.\u00a011,096\u201311,107. 10.48550\/arXiv.2110.02544"},{"key":"r42","doi-asserted-by":"crossref","unstructured":"LuoF.LinX.LiuF.ZhangQ.WangZ. \u201cNeural Combinatorial Optimization with Heavy Decoder: Toward Large Scale Generalization,\u201d Advances in Neural Information Processing Systems, Vol.\u00a036, 2023, pp.\u00a08845\u20138864. 10.48550\/arXiv.2310.07985","DOI":"10.52202\/075280-0387"},{"key":"r43","unstructured":"VaswaniA.ShazeerN.ParmarN.UszkoreitJ.JonesL.GomezA. N.Kaiser\u0141.PolosukhinI. \u201cAttention Is All You Need,\u201d Advances in Neural Information Processing Systems, Vol.\u00a030, Curran Assoc., Inc., Red Hook, NY, 2017. 10.48550\/arXiv.1706.03762"},{"key":"r44","unstructured":"KoolW.Van HoofH.WellingM. \u201cAttention, Learn to Solve Routing Problems!\u201d Proceedings of the 7th International Conference on Learning Representations (ICLR), 2019, pp.\u00a01\u201325. 10.48550\/arXiv.1803.08475"},{"key":"r45","unstructured":"WangC.ChengP.LiJ.SunW. \u201cLeader Reward for POMO-Based Neural Combinatorial Optimization,\u201d arXiv preprint arXiv: 2405.13947, 2024, not yet published. 10.48550\/arXiv.2405.13947"}],"container-title":["Journal of Aerospace Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/arc.aiaa.org\/doi\/pdf\/10.2514\/1.I011709","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T08:05:32Z","timestamp":1772697932000},"score":1,"resource":{"primary":{"URL":"https:\/\/arc.aiaa.org\/doi\/10.2514\/1.I011709"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3]]},"references-count":45,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["10.2514\/1.I011709"],"URL":"https:\/\/doi.org\/10.2514\/1.i011709","relation":{},"ISSN":["1940-3151","2327-3097"],"issn-type":[{"value":"1940-3151","type":"print"},{"value":"2327-3097","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3]]},"assertion":[{"value":"2025-06-10","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-08","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-11-18","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}