{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T13:04:33Z","timestamp":1775739873350,"version":"3.50.1"},"reference-count":68,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T00:00:00Z","timestamp":1709596800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T00:00:00Z","timestamp":1709596800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100018222","name":"Universit\u00e4t Siegen","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100018222","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Oper. Res. Forum"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Reinforcement learning (RL) algorithms have proven to be useful tools for combinatorial optimisation. However, they are still underutilised in facility layout problems (FLPs). At the same time, RL research relies on standardised benchmarks such as the Arcade Learning Environment. To address these issues, we present an open-source Python package (gym-flp) that utilises the OpenAI Gym toolkit, specifically designed for developing and comparing RL algorithms. The package offers one discrete and three continuous problem representation environments with customisable state and action spaces. In addition, the package provides 138 discrete and 61 continuous problems commonly used in FLP literature and supports submitting custom problem sets. The user can choose between numerical and visual output of observations, depending on the RL approach being used. The package aims to facilitate experimentation with different algorithms in a reproducible manner and advance RL use in factory planning.<\/jats:p>","DOI":"10.1007\/s43069-024-00301-3","type":"journal-article","created":{"date-parts":[[2024,3,5]],"date-time":"2024-03-05T08:03:22Z","timestamp":1709625802000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["gym-flp: A Python Package for Training Reinforcement Learning Algorithms on Facility Layout Problems"],"prefix":"10.1007","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2552-2855","authenticated-orcid":false,"given":"Benjamin","family":"Heinbach","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Burggr\u00e4f","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Johannes","family":"Wagner","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,3,5]]},"reference":[{"issue":"2","key":"301_CR1","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1016\/j.arcontrol.2007.04.001","volume":"31","author":"A Drira","year":"2007","unstructured":"Drira A, Pierreval H, Hajri-Gabouj S (2007) Facility layout problems: a survey. Annu Rev Control 31(2):255\u2013267. https:\/\/doi.org\/10.1016\/j.arcontrol.2007.04.001","journal-title":"Annu Rev Control"},{"issue":"1\u20134","key":"301_CR2","doi-asserted-by":"publisher","first-page":"957","DOI":"10.1007\/s00170-017-0895-8","volume":"94","author":"H Hosseini-Nasab","year":"2018","unstructured":"Hosseini-Nasab H, Fereidouni S, Ghomi Fatemi, Taghi Seyyed Mohammad et al (2018) Classification of facility layout problems: a review study. Int J Adv Manuf Technol 94(1\u20134):957\u2013977. https:\/\/doi.org\/10.1007\/s00170-017-0895-8","journal-title":"Int J Adv Manuf Technol"},{"issue":"12","key":"301_CR3","doi-asserted-by":"publisher","first-page":"3777","DOI":"10.1080\/00207543.2021.1897176","volume":"59","author":"P P\u00e9rez-Gosende","year":"2021","unstructured":"P\u00e9rez-Gosende P, Mula J, D\u00edaz-Madro\u00f1ero M (2021) Facility layout planning. An extended literature review. Int J Prod Res 59(12):3777\u20133816. https:\/\/doi.org\/10.1080\/00207543.2021.1897176","journal-title":"Int J Prod Res"},{"key":"301_CR4","doi-asserted-by":"publisher","first-page":"22569","DOI":"10.1109\/ACCESS.2021.3054563","volume":"9","author":"P Burggr\u00e4f","year":"2021","unstructured":"Burggr\u00e4f P, Wagner J, Heinbach B (2021) Bibliometric study on the use of machine learning as resolution technique for facility layout problems. IEEE Access 9:22569\u201322586. https:\/\/doi.org\/10.1109\/ACCESS.2021.3054563","journal-title":"IEEE Access"},{"key":"301_CR5","doi-asserted-by":"publisher","unstructured":"Burggr\u00e4f P, Wagner J, Koke B (2018) Artificial intelligence in production management: a review of the current state of affairs and research trends in Academia. 2018 International Conference on Information Management and Processing (ICIMP 2018): Jan. 12\u201314, 2018, London, UK. IEEE, Piscataway, NJ, pp 82\u201388. https:\/\/doi.org\/10.1109\/ICIMP1.2018.8325846","DOI":"10.1109\/ICIMP1.2018.8325846"},{"key":"301_CR6","doi-asserted-by":"publisher","first-page":"891","DOI":"10.1016\/j.procir.2020.03.047","volume":"93","author":"P Burggr\u00e4f","year":"2020","unstructured":"Burggr\u00e4f P, Wagner J, Koke B et al (2020) Performance assessment methodology for AI-supported decision-making in production management. Procedia CIRP 93:891\u2013896. https:\/\/doi.org\/10.1016\/j.procir.2020.03.047","journal-title":"Procedia CIRP"},{"key":"301_CR7","doi-asserted-by":"publisher","first-page":"142434","DOI":"10.1109\/ACCESS.2020.3010050","volume":"8","author":"P Burggr\u00e4f","year":"2020","unstructured":"Burggr\u00e4f P, Wagner J, Koke B et al (2020) Approaches for the prediction of lead times in an engineer to order environment\u2013a systematic review. IEEE Access 8:142434\u2013142445. https:\/\/doi.org\/10.1109\/ACCESS.2020.3010050","journal-title":"IEEE Access"},{"key":"301_CR8","unstructured":"Burggr\u00e4f P, Wagner J, Koke B et\u00a0al (2019) Sensor retrofit for a coffee machine as condition monitoring and predictive maintenance use case. Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019): Tagungsband. Universit\u00e4tsbibliothek Siegen, pp 62\u201366. https:\/\/aisel.aisnet.org\/wi2019\/track01\/papers\/5\/"},{"issue":"1","key":"301_CR9","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1080\/21693277.2016.1192517","volume":"4","author":"T Wuest","year":"2016","unstructured":"Wuest T, Weimer D, Irgens C et al (2016) Machine learning in manufacturing: advantages, challenges, and applications. Prod Manuf Res 4(1):23\u201345. https:\/\/doi.org\/10.1080\/21693277.2016.1192517","journal-title":"Prod Manuf Res"},{"key":"301_CR10","doi-asserted-by":"publisher","unstructured":"Dogan A, Birant D (2021) Machine learning and data mining in manufacturing. Expert Syst Appl 166:114060. https:\/\/doi.org\/10.1016\/j.eswa.2020.114060. https:\/\/www.sciencedirect.com\/science\/article\/abs\/pii\/S095741742030823X","DOI":"10.1016\/j.eswa.2020.114060"},{"key":"301_CR11","doi-asserted-by":"publisher","unstructured":"Bahrpeyma F, Reichelt D (2022) A review of the applications of multi-agent reinforcement learning in smart factories. Front Robot AI 9:1027340. https:\/\/doi.org\/10.3389\/frobt.2022.1027340. https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2022.1027340\/full","DOI":"10.3389\/frobt.2022.1027340"},{"issue":"13","key":"301_CR12","doi-asserted-by":"publisher","first-page":"4316","DOI":"10.1080\/00207543.2021.1973138","volume":"60","author":"M Panzer","year":"2022","unstructured":"Panzer M, Bender B (2022) Deep reinforcement learning in production systems: a systematic literature review. Int J Prod Res 60(13):4316\u20134341. https:\/\/doi.org\/10.1080\/00207543.2021.1973138","journal-title":"Int J Prod Res"},{"issue":"20","key":"301_CR13","doi-asserted-by":"publisher","first-page":"7151","DOI":"10.1080\/00207543.2022.2140221","volume":"61","author":"B Rolf","year":"2023","unstructured":"Rolf B, Jackson I, M\u00fcller M et al (2023) A review on reinforcement learning algorithms and applications in supply chain management. Int J Prod Res 61(20):7151\u20137179. https:\/\/doi.org\/10.1080\/00207543.2022.2140221","journal-title":"Int J Prod Res"},{"key":"301_CR14","doi-asserted-by":"publisher","unstructured":"Del Real Torres A, Andreiana DS, Ojeda Rold\u00e1n \u00c1 et al (2022) A review of deep reinforcement learning approaches for smart manufacturing in industry 4.0 and 5.0 framework. Appl Sci 12(23):12377. https:\/\/doi.org\/10.3390\/app122312377. https:\/\/www.mdpi.com\/2076-3417\/12\/23\/12377","DOI":"10.3390\/app122312377"},{"issue":"2","key":"301_CR15","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1016\/j.ejor.2020.07.063","volume":"290","author":"Y Bengio","year":"2021","unstructured":"Bengio Y, Lodi A, Prouvost A (2021) Machine learning for combinatorial optimization: A methodological tour d\u2019horizon. Eur J Oper Res 290(2):405\u2013421. https:\/\/doi.org\/10.1016\/j.ejor.2020.07.063","journal-title":"Eur J Oper Res"},{"key":"301_CR16","doi-asserted-by":"publisher","first-page":"163818","DOI":"10.1109\/ACCESS.2020.3021753","volume":"8","author":"ER Zuniga","year":"2020","unstructured":"Zuniga ER, Moris MU, Syberfeldt A et al (2020) A simulation-based optimization methodology for facility layout design in manufacturing. IEEE Access 8:163818\u2013163828. https:\/\/doi.org\/10.1109\/ACCESS.2020.3021753","journal-title":"IEEE Access"},{"key":"301_CR17","volume-title":"Facilities planning","author":"J Tompkins","year":"2010","unstructured":"Tompkins J, White JA, Bozer YA (2010) Facilities planning, 4th edn. Wiley, Hoboken, NJ","edition":"4"},{"issue":"1","key":"301_CR18","doi-asserted-by":"publisher","first-page":"53","DOI":"10.2307\/1907742","volume":"25","author":"TC Koopmans","year":"1957","unstructured":"Koopmans TC, Beckmann M (1957) Assignment problems and the location of economic activities. Econometrica 25(1):53. https:\/\/doi.org\/10.2307\/1907742","journal-title":"Econometrica"},{"key":"301_CR19","unstructured":"Tong X (1991) SECOT: a sequential construction technique for facility design. University of Pittsburgh, Pittsburgh, PA. https:\/\/elibrary.ru\/item.asp?id=5805928. Accessed 16 Feb 2024"},{"issue":"6","key":"301_CR20","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1016\/j.orl.2005.09.009","volume":"34","author":"A Konak","year":"2006","unstructured":"Konak A, Kulturel-Konak S, Norman BA et al (2006) A new mixed integer programming formulation for facility layout design using flexible bays. Oper Res Lett 34(6):660\u2013672. https:\/\/doi.org\/10.1016\/j.orl.2005.09.009","journal-title":"Oper Res Lett"},{"issue":"5","key":"301_CR21","doi-asserted-by":"publisher","first-page":"5384","DOI":"10.1016\/j.eswa.2011.11.046","volume":"39","author":"B Haktanirlar Ulutas","year":"2012","unstructured":"Haktanirlar Ulutas B, Kulturel-Konak S (2012) An artificial immune system based algorithm to solve unequal area facility layout problem. Expert Syst Appl 39(5):5384\u20135395. https:\/\/doi.org\/10.1016\/j.eswa.2011.11.046","journal-title":"Expert Syst Appl"},{"issue":"1","key":"301_CR22","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1080\/00207549208942878","volume":"30","author":"KYR Tam","year":"1992","unstructured":"Tam KYR (1992) A simulated annealing algorithm for allocating space to manufacturing cells. Int J Prod Res 30(1):63\u201387. https:\/\/doi.org\/10.1080\/00207549208942878","journal-title":"Int J Prod Res"},{"key":"301_CR23","doi-asserted-by":"publisher","unstructured":"Niroomand S, Hadi-Vencheh A, \u015eahin R et al (2015) Modified migrating birds optimization algorithm for closed loop layout with exact distances in flexible manufacturing systems. Expert Syst Appl 42(19):6586\u20136597. https:\/\/doi.org\/10.1016\/j.eswa.2015.04.040. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s0957417415002821?casa_token=iavfj2vewwsaaaaa:qispvydrz7gpionc_mjuodgdxjif3ge1ijutuhrqdqpnc5mrfhj7bxkcpwcqalzeecruopmy","DOI":"10.1016\/j.eswa.2015.04.040"},{"key":"301_CR24","doi-asserted-by":"publisher","unstructured":"Yang T, Su CT, Hsu YR (2000) Systematic layout planning: a study on semiconductor wafer fabrication facilities. Int J Oper Prod Manag 20(11):1359\u20131371. https:\/\/doi.org\/10.1108\/01443570010348299. https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/01443570010348299\/full\/","DOI":"10.1108\/01443570010348299"},{"key":"301_CR25","doi-asserted-by":"publisher","unstructured":"Anjos MF, Vieira MV (2021) Facility layout: mathematical optimization techniques and engineering applications, 1st edn. EURO Advanced Tutorials on Operational Research, Springer International Publishing and Imprint Springer, Cham. https:\/\/doi.org\/10.1007\/978-3-030-70990-7","DOI":"10.1007\/978-3-030-70990-7"},{"key":"301_CR26","doi-asserted-by":"publisher","unstructured":"Ueda K, Fujii N, Hatono I et al (2002) Facility layout planning using self-organization method. CIRP Ann 51(1):399\u2013402. https:\/\/doi.org\/10.1016\/S0007-8506(07)61546-7. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s0007850607615467","DOI":"10.1016\/S0007-8506(07)61546-7"},{"key":"301_CR27","doi-asserted-by":"publisher","unstructured":"Tsuchiya K, Bharitkar S, Takefuji Y (1996) A neural network approach to facility layout problems. Eur J Oper Res 89(3):556\u2013563. https:\/\/doi.org\/10.1016\/0377-2217(95)00051-8. https:\/\/www.sciencedirect.com\/science\/article\/pii\/0377221795000518","DOI":"10.1016\/0377-2217(95)00051-8"},{"key":"301_CR28","doi-asserted-by":"publisher","unstructured":"Garc\u00eda-Hern\u00e1ndez L, P\u00e9rez-Ortiz M, Ara\u00fazo-Azofra A et al (2014) An evolutionary neural system for incorporating expert knowledge into the UA-FLP. Neurocomputing 135:69\u201378. https:\/\/doi.org\/10.1016\/j.neucom.2013.01.068. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s0925231213011430","DOI":"10.1016\/j.neucom.2013.01.068"},{"issue":"2","key":"301_CR29","doi-asserted-by":"publisher","first-page":"582","DOI":"10.1016\/j.ejor.2017.06.052","volume":"264","author":"T Weitzel","year":"2018","unstructured":"Weitzel T, Glock CH (2018) Energy management for stationary electric energy storage systems: a systematic literature review. Eur J Oper Res 264(2):582\u2013606. https:\/\/doi.org\/10.1016\/j.ejor.2017.06.052","journal-title":"Eur J Oper Res"},{"issue":"11","key":"301_CR30","doi-asserted-by":"publisher","first-page":"3362","DOI":"10.1080\/00207543.2020.1717008","volume":"58","author":"D Shi","year":"2020","unstructured":"Shi D, Fan W, Xiao Y et al (2020) Intelligent scheduling of discrete automated production line via deep reinforcement learning. Int J Prod Res 58(11):3362\u20133380. https:\/\/doi.org\/10.1080\/00207543.2020.1717008","journal-title":"Int J Prod Res"},{"key":"301_CR31","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1016\/j.procir.2019.02.101","volume":"79","author":"A Kuhnle","year":"2019","unstructured":"Kuhnle A, R\u00f6hrig N, Lanza G (2019) Autonomous order dispatching in the semiconductor industry using reinforcement learning. Procedia CIRP 79:391\u2013396. https:\/\/doi.org\/10.1016\/j.procir.2019.02.101","journal-title":"Procedia CIRP"},{"issue":"1","key":"301_CR32","doi-asserted-by":"publisher","first-page":"397","DOI":"10.1016\/j.cirp.2020.04.001","volume":"69","author":"A Malus","year":"2020","unstructured":"Malus A, Kozjek D, Vrabi\u010d R (2020) Real-time order dispatching for a fleet of autonomous mobile robots using multi-agent reinforcement learning. CIRP Ann 69(1):397\u2013400. https:\/\/doi.org\/10.1016\/j.cirp.2020.04.001","journal-title":"CIRP Ann"},{"key":"301_CR33","unstructured":"Khalil E, Dai H, Zhang Y et\u00a0al (2017) Learning combinatorial optimization algorithms over graphs. In: Guyon I, Von Luxburg U, Bengio S et\u00a0al (eds) Advances in Neural Information Processing Systems, vol\u00a030. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper\/2017\/file\/d9896106ca98d3d05b8cbdf4fd8b13a1-Paper.pdf"},{"key":"301_CR34","doi-asserted-by":"publisher","unstructured":"Unger H, B\u00f6rner F (2021) Reinforcement learning for layout planning \u2013 modelling the layout problem as MDP. In: Dolgui A, Bernard A, Lemoine D et\u00a0al (eds) Advances in production management systems, IFIP Advances in Information and Communication Technology, vol 632. Springer, Cham, pp 471\u2013479. https:\/\/doi.org\/10.1007\/978-3-030-85906-0_52","DOI":"10.1007\/978-3-030-85906-0_52"},{"key":"301_CR35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.mfglet.2021.08.003","volume":"30","author":"M Klar","year":"2021","unstructured":"Klar M, Glatt M, Aurich JC (2021) An implementation of a reinforcement learning based algorithm for factory layout planning. Manuf Lett 30:1\u20134. https:\/\/doi.org\/10.1016\/j.mfglet.2021.08.003","journal-title":"Manuf Lett"},{"key":"301_CR36","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1016\/j.procir.2022.04.027","volume":"107","author":"M Klar","year":"2022","unstructured":"Klar M, Hussong M, Ruediger-Flore P et al (2022) Scalability investigation of double deep q learning for factory layout planning. Procedia CIRP 107:161\u2013166. https:\/\/doi.org\/10.1016\/j.procir.2022.04.027","journal-title":"Procedia CIRP"},{"key":"301_CR37","doi-asserted-by":"publisher","unstructured":"Klar M, Glatt M, Aurich JC (2023) Performance comparison of reinforcement learning and metaheuristics for factory layout planning. CIRP J Manuf Sci Technol 45:10\u201325. https:\/\/doi.org\/10.1016\/j.cirpj.2023.05.008. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s1755581723000718","DOI":"10.1016\/j.cirpj.2023.05.008"},{"key":"301_CR38","doi-asserted-by":"publisher","unstructured":"Ikeda H, Nakagawa H, Tsuchiya T (2022) Towards automatic facility layout design using reinforcement learning. Communication Papers of the 17th Conference on Computer Science and Intelligence Systems, vol\u00a032. PTI, pp 11\u201320. https:\/\/doi.org\/10.15439\/2022f25","DOI":"10.15439\/2022f25"},{"key":"301_CR39","doi-asserted-by":"publisher","unstructured":"Unger H, B\u00f6rner F, Fischer D (2024) Reinforcement learning for layout planning \u2013 automated pathway generation for arbitrary factory layouts. In: Silva FJG, Ferreira LP, S\u00e1 JC, et\u00a0al (eds) Flexible Automation and Intelligent Manufacturing: Establishing Bridges for More Sustainable Manufacturing Systems. Springer Nature Switzerland and Imprint Springer, Cham, Lecture Notes in Mechanical Engineering, pp 1031\u20131039. https:\/\/doi.org\/10.1007\/978-3-031-38165-2_118. https:\/\/link.springer.com\/chapter\/10.1007\/978-3-031-38165-2_118","DOI":"10.1007\/978-3-031-38165-2_118"},{"key":"301_CR40","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1016\/j.mfglet.2023.09.007","volume":"38","author":"B Heinbach","year":"2023","unstructured":"Heinbach B, Burggr\u00e4f P, Wagner J (2023) Deep reinforcement learning for layout planning - an MDP-based approach for the facility layout problem. Manuf Lett 38:40\u201343. https:\/\/doi.org\/10.1016\/j.mfglet.2023.09.007","journal-title":"Manuf Lett"},{"key":"301_CR41","doi-asserted-by":"publisher","unstructured":"Mnih V, Kavukcuoglu K, Silver D et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529\u2013533. https:\/\/doi.org\/10.1038\/nature14236. https:\/\/www.nature.com\/articles\/nature14236?wm=book_wap_0005","DOI":"10.1038\/nature14236"},{"key":"301_CR42","doi-asserted-by":"publisher","unstructured":"Silver D, Huang A, Maddison CJ et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484\u2013489. https:\/\/doi.org\/10.1038\/nature16961. https:\/\/www.nature.com\/articles\/nature16961?mrk_cmpg_source=sm_tw_pp","DOI":"10.1038\/nature16961"},{"key":"301_CR43","series-title":"Adaptive Computation and Machine Learning","volume-title":"Reinforcement learning: an introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton RS, Barto AG (2018) Reinforcement learning: an introduction, 2nd edn. Adaptive Computation and Machine Learning. The MIT Press, Cambridge, Massachusetts","edition":"2"},{"key":"301_CR44","doi-asserted-by":"publisher","unstructured":"Dong H, Ding Z, Zhang S (2020) Deep reinforcement learning: fundamentals, research and applications, 1st edn. Springer eBook Collection, Springer Singapore and Imprint Springer, Singapore. https:\/\/doi.org\/10.1007\/978-981-15-4095-0","DOI":"10.1007\/978-981-15-4095-0"},{"key":"301_CR45","unstructured":"Mnih V, Kavukcuoglu K, Silver D et\u00a0al (2013) Playing Atari with deep reinforcement learning. Preprint at http:\/\/arxiv.org\/abs\/1312.5602"},{"key":"301_CR46","volume-title":"Methoden zur optimalen Maschinenanordnung","author":"H Schmigalla","year":"1970","unstructured":"Schmigalla H (1970) Methoden zur optimalen Maschinenanordnung. VEB Verlag Technik"},{"key":"301_CR47","doi-asserted-by":"publisher","unstructured":"Patel D, Hazan H, Saunders DJ et al (2019) Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to Atari breakout game. Neural Networks: the Official Journal of the International Neural Network Society 120:108\u2013115. https:\/\/doi.org\/10.1016\/j.neunet.2019.08.009. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s0893608019302266","DOI":"10.1016\/j.neunet.2019.08.009"},{"key":"301_CR48","doi-asserted-by":"publisher","unstructured":"Li C, Zheng P, Yin Y et al (2023) Deep reinforcement learning in smart manufacturing: a review and prospects. CIRP J Manuf Sci Technol 40:75\u2013101. https:\/\/doi.org\/10.1016\/j.cirpj.2022.11.003. https:\/\/www.sciencedirect.com\/science\/article\/pii\/s1755581722001717","DOI":"10.1016\/j.cirpj.2022.11.003"},{"key":"301_CR49","unstructured":"Brockman G, Cheung V, Pettersson L et\u00a0al (2016) OpenAI gym. Preprint at http:\/\/arxiv.org\/abs\/1606.01540"},{"key":"301_CR50","doi-asserted-by":"publisher","unstructured":"Hein D, Depeweg S, Tokic M et al (2018) A benchmark environment motivated by industrial control problems. 2017 SSCI proceedings: 2017 IEEE SSCI, Honolulu, Hawaii, UA. IEEE, Piscataway, NJ. https:\/\/doi.org\/10.1109\/ssci.2017.8280935","DOI":"10.1109\/ssci.2017.8280935"},{"issue":"3","key":"301_CR51","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1007\/s43069-020-00024-1","volume":"1","author":"T Serra","year":"2020","unstructured":"Serra T, O\u2019Neil RJ (2020) MIPLIBing: seamless benchmarking of mathematical optimization problems and metadata extensions. SN Operations Research Forum 1(3):14. https:\/\/doi.org\/10.1007\/s43069-020-00024-1","journal-title":"SN Operations Research Forum"},{"key":"301_CR52","doi-asserted-by":"publisher","unstructured":"Li F, Du Y (2018) From AlphaGo to power system AI: what engineers can learn from solving the most complex board game. IEEE Power Energ Mag 16(2):76\u201384. https:\/\/doi.org\/10.1109\/mpe.2017.2779554","DOI":"10.1109\/mpe.2017.2779554"},{"key":"301_CR53","doi-asserted-by":"publisher","unstructured":"V\u00e1zquez-Canteli JR, K\u00e4mpf J, Henze G et\u00a0al (2019) Citylearn v1.0. In: Zhang M (ed) BuildSys \u201919: Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation: November 13-14, 2019, New York, NY, USA. The Association for Computing Machinery, New York, New York, pp 356\u2013357. https:\/\/doi.org\/10.1145\/3360322.3360998","DOI":"10.1145\/3360322.3360998"},{"key":"301_CR54","unstructured":"Spangher L, Gokul A, Palakapilly J et\u00a0al (2020) Officelearn: an OpenAI Gym environment for reinforcement learning on occupant-level building\u2019s energy demand response. https:\/\/www.climatechange.ai\/papers\/neurips2020\/56\/paper.pdf. Accessed 16 Feb 2024"},{"key":"301_CR55","doi-asserted-by":"publisher","first-page":"1264","DOI":"10.1016\/j.procir.2018.03.212","volume":"72","author":"B Waschneck","year":"2018","unstructured":"Waschneck B, Reichstaller A, Belzner L et al (2018) Optimization of global production scheduling with deep reinforcement learning. Procedia CIRP 72:1264\u20131269. https:\/\/doi.org\/10.1016\/j.procir.2018.03.212","journal-title":"Procedia CIRP"},{"key":"301_CR56","unstructured":"Zamora I, Lopez NG, Vilches VM et\u00a0al (2016) Extending the OpenAI Gym for robotics: a toolkit for reinforcement learning using ROS and gazebo. Preprint at https:\/\/arxiv.org\/pdf\/1608.05742.pdf"},{"key":"301_CR57","unstructured":"Gaw\u0142owicz P, Zubow A (2018) ns3-gym: extending OpenAI Gym for networking research. Preprint at http:\/\/arxiv.org\/pdf\/1810.03943v2"},{"key":"301_CR58","doi-asserted-by":"publisher","unstructured":"Hubbs CD, Perez HD, Sarwar O et\u00a0al (2020) OR-Gym: a reinforcement learning library for operations research problems. https:\/\/doi.org\/10.48550\/arXiv.2008.06319. Accessed 16 Feb 2024","DOI":"10.48550\/arXiv.2008.06319"},{"key":"301_CR59","unstructured":"Raffin A, Hill A, Ernestus M et\u00a0al (2019) Stable baselines3. https:\/\/github.com\/DLR-RM\/stable-baselines3. Accessed 16 Feb 2024"},{"key":"301_CR60","unstructured":"Philipp Moritz, Robert Nishihara, Stephanie Wang et\u00a0al (2018) Ray: a distributed framework for emerging AI applications. 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pp 561\u2013577. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/moritz"},{"key":"301_CR61","unstructured":"Schulman J, Wolski F, Dhariwal P et\u00a0al (2017) Proximal policy optimization algorithms. Preprint at http:\/\/arxiv.org\/abs\/1707.06347"},{"key":"301_CR62","unstructured":"Mnih V, Badia AP, Mirza M et\u00a0al (2016) Asynchronous methods for deep reinforcement learning. In: Balcan MF, Weinberger KQ (eds) Proceedings of The 33rd International Conference on Machine Learning, Proceedings of Machine Learning Research, vol\u00a048. PMLR, pp 1928\u20131937. https:\/\/arxiv.org\/pdf\/1602.01783"},{"key":"301_CR63","unstructured":"Lillicrap TP, Hunt JJ, Pritzel A et\u00a0al (2015) Continuous control with deep reinforcement learning. Preprint at http:\/\/arxiv.org\/abs\/1509.02971"},{"key":"301_CR64","unstructured":"Fujimoto S, van Hoof H, Meger D (2018) Addressing function approximation error in actor-critic methods. In: Dy J, Krause A (eds) Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol\u00a080. PMLR, pp 1587\u20131596. https:\/\/proceedings.mlr.press\/v80\/fujimoto18a.html"},{"key":"301_CR65","unstructured":"Haarnoja T, Zhou A, Abbeel P et\u00a0al (2018) Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Dy J, Krause A (eds) Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol\u00a080. PMLR, pp 1861\u20131870. https:\/\/proceedings.mlr.press\/v80\/haarnoja18b.html"},{"issue":"4","key":"301_CR66","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1023\/A:1008293323270","volume":"10","author":"RE Burkard","year":"1997","unstructured":"Burkard RE, Karisch SE, Rendl F (1997) QAPLIB - a quadratic assignment problem library. J Global Optim 10(4):391\u2013403. https:\/\/doi.org\/10.1023\/A:1008293323270","journal-title":"J Global Optim"},{"issue":"4","key":"301_CR67","doi-asserted-by":"publisher","first-page":"453","DOI":"10.5267\/j.ijiec.2019.5.001","volume":"10","author":"G La Scalia","year":"2019","unstructured":"La Scalia G, Micale R, Enea M (2019) Facility layout problem: bibliometric and benchmarking analysis. Int J Ind Eng Comput 10(4):453\u2013472. https:\/\/doi.org\/10.5267\/j.ijiec.2019.5.001","journal-title":"Int J Ind Eng Comput"},{"key":"301_CR68","doi-asserted-by":"publisher","first-page":"2623","DOI":"10.1145\/3292500.3330701","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"T Akiba","year":"2019","unstructured":"Akiba T, Sano S, Yanase T et al (2019) Optuna. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, New York, NY, USA, p 2623. https:\/\/doi.org\/10.1145\/3292500.3330701"}],"container-title":["Operations Research Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s43069-024-00301-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s43069-024-00301-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s43069-024-00301-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T02:24:02Z","timestamp":1711419842000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s43069-024-00301-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,5]]},"references-count":68,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["301"],"URL":"https:\/\/doi.org\/10.1007\/s43069-024-00301-3","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-371586\/v1","asserted-by":"object"}]},"ISSN":["2662-2556"],"issn-type":[{"value":"2662-2556","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,5]]},"assertion":[{"value":"27 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 March 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}],"article-number":"20"}}