{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T17:43:54Z","timestamp":1776879834556,"version":"3.51.2"},"reference-count":49,"publisher":"EDP Sciences","issue":"3","license":[{"start":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T00:00:00Z","timestamp":1750377600000},"content-version":"vor","delay-in-days":50,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51975231"],"award-info":[{"award-number":["51975231"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["RAIRO-Oper. Res."],"accepted":{"date-parts":[[2025,3,24]]},"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:p>This paper examines the value-based reinforcement learning method applied to the optimization of the rectangular strip packing problem with three different approaches to define the state and the action. The episode in the reinforcement learning is defined as a round of placement of all the given pieces. We analyze the drawbacks of two previously designed approaches and propose that the state is defined by the stage along with the selected piece. We also record the fitness value of the placement during the packing and design a fitness-based reward. The three methods are evaluated on a group of random packing problems in terms of time consumption by searching for the particular state, memory consumption by recording the past state, and the efficiency of packing optimization. The results show that the proposed reinforcement learning with fitness-based reward delivers a good comprehensive performance. 
The proposed method is also tested on a few well-known benchmark problems, and the results indicate that the proposed method could be an effective tool. We discuss the similarities and differences between the reinforcement learning method and the local search method.<\/jats:p>","DOI":"10.1051\/ro\/2025034","type":"journal-article","created":{"date-parts":[[2025,3,26]],"date-time":"2025-03-26T08:57:07Z","timestamp":1742979427000},"page":"1551-1568","source":"Crossref","is-referenced-by-count":1,"title":["Solving rectangular strip packing problem with reinforcement learning: a comparative case study"],"prefix":"10.1051","volume":"59","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9624-0356","authenticated-orcid":false,"given":"Xusheng","family":"Zhao","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7752-3502","authenticated-orcid":false,"given":"Yunqing","family":"Rao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-3836-0695","authenticated-orcid":false,"given":"Sirui","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4380-7635","authenticated-orcid":false,"given":"Nai","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"250","published-online":{"date-parts":[[2025,6,20]]},"reference":[{"key":"R1","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1016\/j.cor.2006.07.004","volume":"35","author":"Alvarez-Vald\u00e9s","year":"2008","journal-title":"Comput. Oper. Res."},{"key":"R2","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1057\/jors.1985.51","volume":"36","author":"Beasley","year":"1985","journal-title":"J. Oper. Res. 
Soc."},{"key":"R3","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1287\/opre.33.1.49","volume":"33","author":"Beasley","year":"1985","journal-title":"Oper. Res."},{"key":"R4","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1093\/comjnl\/25.3.353","volume":"25","author":"Bengtsson","year":"1982","journal-title":"Comput. J."},{"key":"R5","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1016\/j.ejor.2006.11.038","volume":"184","author":"Bennell","year":"2008","journal-title":"Eur. J. Oper. Res."},{"key":"R6","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1016\/j.ijpe.2013.04.040","volume":"145","author":"Bennell","year":"2013","journal-title":"Int. J. Prod. Econ."},{"key":"R7","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1057\/jors.1987.70","volume":"38","author":"Berkey","year":"1987","journal-title":"J. Oper. Res. Soc."},{"key":"R8","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1287\/opre.1040.0109","volume":"52","author":"Burke","year":"2004","journal-title":"Oper. Res."},{"key":"R9","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1287\/ijoc.1080.0306","volume":"21","author":"Burke","year":"2009","journal-title":"Inf. J. Comput."},{"key":"R10","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1287\/opre.25.1.30","volume":"25","author":"Christofides","year":"1977","journal-title":"Oper. Res."},{"key":"R11","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1287\/opre.2013.1248","volume":"62","author":"C\u00f4t\u00e9","year":"2014","journal-title":"Oper. Res."},{"key":"R12","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1287\/ijoc.1070.0250","volume":"20","author":"Crainic","year":"2008","journal-title":"Inf. J. Comput."},{"key":"R13","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1287\/mnsc.3.3.279","volume":"3","author":"Eisemann","year":"1957","journal-title":"Manage. 
Sci."},{"key":"R14","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1016\/j.ejor.2015.04.029","volume":"246","author":"Gon\u00e7alves","year":"2015","journal-title":"Eur. J. Oper. Res."},{"key":"R15","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1016\/S0360-8352(99)00097-2","volume":"37","author":"Hopper","year":"1999","journal-title":"Comput. Ind. Eng."},{"key":"R16","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/S0377-2217(99)00357-4","volume":"128","author":"Hopper","year":"2001","journal-title":"Eur. J. Oper. Res."},{"key":"R17","doi-asserted-by":"crossref","first-page":"104781","DOI":"10.1016\/j.cor.2019.104781","volume":"113","author":"Hottung","year":"2020","journal-title":"Comput. Oper. Res."},{"key":"R18","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1016\/j.ejor.2020.06.050","volume":"289","author":"Iori","year":"2021","journal-title":"Eur. J. Oper. Res."},{"key":"R19","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1007\/s11590-021-01808-y","volume":"16","author":"Iori","year":"2022","journal-title":"Optim. Lett."},{"key":"R20","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/0377-2217(94)00166-9","volume":"88","author":"Jakobs","year":"1996","journal-title":"Eur. J. Oper. Res."},{"key":"R21","first-page":"1287","volume":"219","author":"Kang","year":"2012","journal-title":"Appl. Math. Comput."},{"key":"R22","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.ejor.2008.08.020","volume":"198","author":"Kenmochi","year":"2009","journal-title":"Eur. J. Oper. Res."},{"key":"R23","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1177\/0278364907087426","volume":"27","author":"Kollar","year":"2008","journal-title":"Int. J. Rob. Res."},{"key":"R24","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1016\/j.ejor.2019.04.045","volume":"282","author":"Leao","year":"2020","journal-title":"Eur. J. Oper. 
Res."},{"key":"R25","doi-asserted-by":"crossref","first-page":"13032","DOI":"10.1016\/j.eswa.2011.04.105","volume":"38","author":"Leung","year":"2011","journal-title":"Expert Syst. App."},{"key":"R26","doi-asserted-by":"crossref","first-page":"4273","DOI":"10.1109\/JIOT.2018.2846694","volume":"5","author":"Liu","year":"2018","journal-title":"IEEE Int. Things J."},{"key":"R27","doi-asserted-by":"crossref","first-page":"6876","DOI":"10.1016\/j.eswa.2014.04.043","volume":"41","author":"L\u00f3pez-Camacho","year":"2014","journal-title":"Expert Syst. App."},{"key":"R28","doi-asserted-by":"crossref","first-page":"1129","DOI":"10.1002\/oca.2832","volume":"44","author":"Lu","year":"2023","journal-title":"Opt. Control App. Methods"},{"key":"R29","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1016\/j.cor.2009.08.005","volume":"37","author":"Macedo","year":"2010","journal-title":"Comput. Oper. Res."},{"key":"R30","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1287\/mnsc.44.3.388","volume":"44","author":"Martello","year":"1998","journal-title":"Manage. Sci."},{"key":"R31","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1016\/j.future.2021.09.023","volume":"138","author":"Mishra","year":"2023","journal-title":"Future Gener. Comput. Syst."},{"key":"R32","doi-asserted-by":"crossref","first-page":"3698","DOI":"10.1109\/TSG.2018.2834219","volume":"10","author":"Mocanu","year":"2018","journal-title":"IEEE Trans. Smart Grid"},{"key":"R33","doi-asserted-by":"crossref","first-page":"3238","DOI":"10.1111\/itor.13236","volume":"30","author":"Oliveira","year":"2023","journal-title":"Int. Trans. Oper. Res."},{"key":"R34","doi-asserted-by":"crossref","first-page":"106649","DOI":"10.1016\/j.compchemeng.2019.106649","volume":"133","author":"Petsagkourakis","year":"2020","journal-title":"Comput. Chem. 
Eng."},{"key":"R35","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/j.ijpe.2013.04.031","volume":"145","author":"Russo","year":"2013","journal-title":"Int. J. Prod. Econ."},{"key":"R36","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1016\/j.ejor.2014.02.059","volume":"237","author":"Silva","year":"2014","journal-title":"Eur. J. Oper. Res."},{"key":"R37","doi-asserted-by":"crossref","first-page":"110080","DOI":"10.1016\/j.jcp.2020.110080","volume":"428","author":"Viquerat","year":"2021","journal-title":"J. Comput. Phys."},{"key":"R38","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/j.engappai.2016.01.001","volume":"52","author":"Walraven","year":"2016","journal-title":"Eng. App. Artif. Intell."},{"key":"R39","doi-asserted-by":"crossref","first-page":"3297","DOI":"10.1016\/j.eswa.2014.12.021","volume":"42","author":"Wang","year":"2015","journal-title":"Expert Syst. App."},{"key":"R40","doi-asserted-by":"crossref","first-page":"107526","DOI":"10.1016\/j.knosys.2021.107526","volume":"233","author":"Wang","year":"2021","journal-title":"Knowl.-Based Syst."},{"key":"R41","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1016\/j.ejor.2005.12.047","volume":"183","author":"W\u00e4scher","year":"2007","journal-title":"Eur. J. Oper. Res."},{"key":"R42","doi-asserted-by":"crossref","first-page":"2662","DOI":"10.1016\/j.cor.2013.05.017","volume":"40","author":"Wauters","year":"2013","journal-title":"Comput. Oper. Res."},{"key":"R43","first-page":"337","volume":"215","author":"Wei","year":"2011","journal-title":"Eur. J. Oper. Res."},{"key":"R44","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1111\/itor.12138","volume":"23","author":"Wei","year":"2016","journal-title":"Int. Trans. Oper. Res."},{"key":"R45","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.cor.2012.05.001","volume":"40","author":"Yang","year":"2013","journal-title":"Comput. Oper. 
Res."},{"key":"R46","doi-asserted-by":"crossref","unstructured":"Zhao H., She Q., Zhu C., Yang Y. and Xu K., Online 3D bin packing with constrained deep reinforcement learning, in Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. PKP Publishing Services Network (2021) 741\u2013749.","DOI":"10.1609\/aaai.v35i1.16155"},{"key":"R47","doi-asserted-by":"crossref","first-page":"012002","DOI":"10.1088\/1742-6596\/2181\/1\/012002","volume":"2181","author":"Zhao","year":"2022","journal-title":"J. Phys. Conf. Ser."},{"key":"R48","doi-asserted-by":"crossref","first-page":"12057","DOI":"10.1007\/s00500-023-08381-9","volume":"27","author":"Zhao","year":"2023","journal-title":"Soft Comput."},{"key":"R49","doi-asserted-by":"crossref","first-page":"1337","DOI":"10.1021\/acscentsci.7b00492","volume":"3","author":"Zhou","year":"2017","journal-title":"ACS Cent. Sci."}],"container-title":["RAIRO - Operations Research"],"original-title":[],"link":[{"URL":"https:\/\/www.rairo-ro.org\/10.1051\/ro\/2025034\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T07:49:43Z","timestamp":1750405783000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.rairo-ro.org\/10.1051\/ro\/2025034"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5]]},"references-count":49,"journal-issue":{"issue":"3"},"alternative-id":["ro230815"],"URL":"https:\/\/doi.org\/10.1051\/ro\/2025034","relation":{},"ISSN":["0399-0559","2804-7303"],"issn-type":[{"value":"0399-0559","type":"print"},{"value":"2804-7303","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5]]}}}