{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:56:59Z","timestamp":1760144219045,"version":"build-2065373602"},"reference-count":33,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2024,3,31]],"date-time":"2024-03-31T00:00:00Z","timestamp":1711843200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Culture and Innovation of Hungary from the National Research, Development and Innovation Fund","award":["TKP2021-NVA-10"],"award-info":[{"award-number":["TKP2021-NVA-10"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>We analyzed a special class of graph traversal problems, where the distances are stochastic, and the agent is restricted to take a limited range in one go. We showed that both constrained shortest Hamiltonian pathfinding problems and disassembly line balancing problems belong to the class of constrained shortest pathfinding problems, which can be represented as mixed-integer optimization problems. Reinforcement learning (RL) methods have proven their efficiency in multiple complex problems. However, researchers concluded that the learning time increases radically by growing the state- and action spaces. In continuous cases, approximation techniques are used, but these methods have several limitations in mixed-integer searching spaces. We present the Q-table compression method as a multistep method with dimension reduction, state fusion, and space compression techniques that project a mixed-integer optimization problem into a discrete one. The RL agent is then trained using an extended Q-value-based method to deliver a human-interpretable model for optimal action selection. Our approach was tested in selected constrained stochastic graph traversal use cases, and comparative results are shown to the simple grid-based discretization method.<\/jats:p>","DOI":"10.3390\/info15040193","type":"journal-article","created":{"date-parts":[[2024,3,31]],"date-time":"2024-03-31T13:32:56Z","timestamp":1711891976000},"page":"193","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Generally Applicable Q-Table Compression Method and Its Application for Constrained Stochastic Graph Traversal Optimization Problems"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9003-7776","authenticated-orcid":false,"given":"Tam\u00e1s","family":"Kegyes","sequence":"first","affiliation":[{"name":"HUN-REN-PE Complex Systems Monitoring Research Group, University of Pannonia, Egyetem Utca 10, 8200 Veszpr\u00e9m, Hungary"},{"name":"Department of Computer Science and Systems Technology, University of Pannonia, Egyetem Utca 10, 8200 Veszpr\u00e9m, Hungary"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6550-5101","authenticated-orcid":false,"given":"Alex","family":"Kummer","sequence":"additional","affiliation":[{"name":"HUN-REN-PE Complex Systems Monitoring Research Group, University of Pannonia, Egyetem Utca 10, 8200 Veszpr\u00e9m, Hungary"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5589-2355","authenticated-orcid":false,"given":"Zolt\u00e1n","family":"S\u00fcle","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Systems Technology, University of Pannonia, Egyetem Utca 10, 8200 Veszpr\u00e9m, Hungary"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8593-1493","authenticated-orcid":false,"given":"J\u00e1nos","family":"Abonyi","sequence":"additional","affiliation":[{"name":"HUN-REN-PE Complex Systems Monitoring Research Group, University of Pannonia, Egyetem Utca 10, 8200 Veszpr\u00e9m, Hungary"}]}],"member":"1968","published-online":{"date-parts":[[2024,3,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1007\/s40747-020-00143-6","article-title":"An algorithmic approach for finding the fuzzy constrained shortest paths in a fuzzy graph","volume":"7","author":"Liao","year":"2021","journal-title":"Complex Intell. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1007\/s42524-021-0157-1","article-title":"A review on the electric vehicle routing problems: Variants and algorithms","volume":"8","author":"Qin","year":"2021","journal-title":"Front. Eng. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.trb.2021.04.002","article-title":"Scheduling and shortest path for trucks with working hours and parking availability constraints","volume":"148","author":"Vital","year":"2021","journal-title":"Transp. Res. Part B Methodol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1007\/s00450-014-0287-3","article-title":"Towards route planning algorithms for electric vehicles with realistic constraints","volume":"31","author":"Baum","year":"2016","journal-title":"Comput.-Sci.-Res. Dev."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1287\/trsc.2018.0889","article-title":"Shortest feasible paths with charging stops for battery electric vehicles","volume":"53","author":"Baum","year":"2019","journal-title":"Transp. Sci."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1007\/s11067-013-9221-7","article-title":"The electric vehicle shortest-walk problem with battery exchanges","volume":"16","author":"Adler","year":"2016","journal-title":"Netw. Spat. Econ."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"109014","DOI":"10.1016\/j.ijpe.2023.109014","article-title":"Integrating distributed disassembly line balancing and vehicle routing problem in supply chain: Integer programming, constraint programming, and heuristic algorithms","volume":"265","author":"Kenger","year":"2023","journal-title":"Int. J. Prod. Econ."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"100008","DOI":"10.1016\/j.ejtl.2020.100008","article-title":"On modeling stochastic dynamic vehicle routing problems","volume":"9","author":"Ulmer","year":"2020","journal-title":"EURO J. Transp. Logist."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1016\/j.ifacol.2019.11.225","article-title":"Disassembly scheduling problem: Literature review and future research directions","volume":"52","author":"Slama","year":"2019","journal-title":"IFAC-PapersOnLine"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1016\/j.cor.2016.04.002","article-title":"The constrained shortest path tour problem","volume":"74","author":"Ferone","year":"2016","journal-title":"Comput. Oper. Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1016\/S0007-8506(07)62136-2","article-title":"Parallel disassembly sequencing with sequence-dependent operation times","volume":"50","author":"Kang","year":"2001","journal-title":"CIRP Ann."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"012077","DOI":"10.1088\/1757-899X\/917\/1\/012077","article-title":"Comparative analysis between Dijkstra and Bellman-Ford algorithms in shortest path optimization","volume":"Volume 917","author":"AbuSalim","year":"2020","journal-title":"IOP Conference Series: Materials Science and Engineering"},{"key":"ref_13","unstructured":"Toroslu, I.H. (2021). Improving the floyd-warshall all pairs shortest paths algorithm. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1080\/10556788.2018.1548015","article-title":"An efficient exact approach for the constrained shortest path tour problem","volume":"35","author":"Ferone","year":"2020","journal-title":"Optim. Methods Softw."},{"key":"ref_15","first-page":"257","article-title":"A new formulation to the shortest path problem with time windows and capacity constraints","volume":"42","author":"Dondo","year":"2012","journal-title":"Lat. Am. Appl. Res."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1016\/j.neucom.2022.11.024","article-title":"A survey for solving mixed integer programming via machine learning","volume":"519","author":"Zhang","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_17","first-page":"99","article-title":"A review and evaluations of shortest path algorithms","volume":"2","author":"Magzhan","year":"2013","journal-title":"Int. J. Sci. Technol. Res"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"106071","DOI":"10.1016\/j.cor.2022.106071","article-title":"Opportunities for reinforcement learning in stochastic dynamic vehicle routing","volume":"150","author":"Hildebrandt","year":"2023","journal-title":"Comput. Oper. Res."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, S.E. (2023). Reinforcement Learning for Sequential Decision and Optimal Control, Springer.","DOI":"10.1007\/978-981-19-7784-8"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Dong, W., Zhang, W., and Yang, W. (2016, January 6\u201310). Node constraint routing algorithm based on reinforcement learning. Proceedings of the 2016 IEEE 13th International Conference on Signal Processing (ICSP), Chengdu, China.","DOI":"10.1109\/ICSP.2016.7878128"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1007\/s10479-005-3976-2","article-title":"Solving planning and design problems in the process industry using mixed integer and global optimization","volume":"140","author":"Kallrath","year":"2005","journal-title":"Ann. Oper. Res."},{"key":"ref_22","unstructured":"Qi, M., Wang, M., and Shen, Z.J. (2021). Smart feasibility pump: Reinforcement learning for (mixed) integer programming. arXiv."},{"key":"ref_23","unstructured":"Tang, Y., Agrawal, S., and Faenza, Y. (2020, January 13\u201318). Reinforcement learning for integer programming: Learning to cut. Proceedings of the 37th International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"5219","DOI":"10.1016\/j.ifacol.2020.12.1196","article-title":"Reinforcement learning for mixed-integer problems based on mpc","volume":"53","author":"Gros","year":"2020","journal-title":"IFAC-PapersOnLine"},{"key":"ref_25","first-page":"30075","article-title":"Learning large neighborhood search policy for integer programming","volume":"34","author":"Wu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Cappart, Q., Moisan, T., Rousseau, L.M., Pr\u00e9mont-Schwarz, I., and Cire, A.A. (2021, January 2\u20139). Combining reinforcement learning and constraint programming for combinatorial optimization. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual.","DOI":"10.1609\/aaai.v35i5.16484"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"157807","DOI":"10.1109\/ACCESS.2019.2950055","article-title":"Reinforcement learning based stochastic shortest path finding in wireless sensor networks","volume":"7","author":"Xia","year":"2019","journal-title":"IEEE Access"},{"key":"ref_28","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. [2nd ed.]."},{"key":"ref_29","unstructured":"Arts, L., Heskes, T., and de Vries, A.P. (2024, February 02). Comparing Discretization Methods for Applying Q-Learning in Continuous State-Action Space. Available online: https:\/\/www.cs.ru.nl\/bachelors-theses\/2017\/Luuk_Arts___4396863___Comparing_Discretization_Methods_for_Applying_Q-learning_in_Continuous_State-Action_Space.pdf."},{"key":"ref_30","unstructured":"Sinclair, S.R., Banerjee, S., and Yu, C.L. (2020, January 8\u201312). Adaptive discretization for episodic reinforcement learning in metric spaces. Proceedings of the ACM on Measurement and Analysis of Computing Systems, Boston, MA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1754-1611-3-11","article-title":"Solving a Hamiltonian Path Problem with a bacterial computer","volume":"3","author":"Baumgardner","year":"2009","journal-title":"J. Biol. Eng."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1007\/s10845-012-0711-0","article-title":"Solving large scale disassembly line balancing problem with uncertainty using reinforcement learning","volume":"25","author":"Tuncel","year":"2014","journal-title":"J. Intell. Manuf."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Lambert, A., and Gupta, S. (2004). Disassembly Modeling for Assembly, Maintenance, Reuse and Recycling, CRC Press.","DOI":"10.1201\/9780203487174"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/4\/193\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:21:49Z","timestamp":1760106109000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/4\/193"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,31]]},"references-count":33,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["info15040193"],"URL":"https:\/\/doi.org\/10.3390\/info15040193","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2024,3,31]]}}}