{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T13:49:08Z","timestamp":1768830548543,"version":"3.49.0"},"reference-count":49,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,9,22]],"date-time":"2021-09-22T00:00:00Z","timestamp":1632268800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,9,22]],"date-time":"2021-09-22T00:00:00Z","timestamp":1632268800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Applying machine learning methods to improve the efficiency of complex manufacturing processes, such as material handling, can be challenging. The interconnectedness of the multiple components that make up real-world manufacturing processes and the typically very large number of variables required to specify procedures and plans within them combine to make it very difficult to map the details of such processes to a formal mathematical representation suitable for conventional optimization methods. Instead, in this work reinforcement learning was applied to produce increasingly efficient plans for material handling in representative manufacturing facilities. Doing so included defining a formal representation of a realistically complex material handling plan, specifying a set of suitable plan change operators as reinforcement learning actions, implementing a simulation-based multi-objective reward function that considers multiple components of material handling costs, and abstracting the many possible material handling plans into a state set small enough to enable reinforcement learning. Experimentation with multiple material handling plans on two separate factory layouts indicated that reinforcement learning could consistently reduce the cost of material handling. This work demonstrates the applicability of reinforcement learning with a multi-objective reward function to realistically complex material handling processes.<\/jats:p>","DOI":"10.1007\/s44163-021-00003-3","type":"journal-article","created":{"date-parts":[[2021,9,27]],"date-time":"2021-09-27T06:56:27Z","timestamp":1632725787000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Applying reinforcement learning to plan manufacturing material handling"],"prefix":"10.1007","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3289-0234","authenticated-orcid":false,"given":"Swetha","family":"Govindaiah","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9838-2016","authenticated-orcid":false,"given":"Mikel D.","family":"Petty","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,9,22]]},"reference":[{"issue":"4","key":"3_CR1","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1007\/BF00186471","volume":"2","author":"AK Sethi","year":"1990","unstructured":"Sethi AK, Sethi SP. Flexibility in manufacturing: a survey. Int J Flex Manuf Syst. 1990;2(4):289\u2013328.","journal-title":"Int J Flex Manuf Syst"},{"issue":"7","key":"3_CR2","doi-asserted-by":"publisher","first-page":"767","DOI":"10.1108\/MRR-08-2013-0194","volume":"38","author":"A Tiwari","year":"2015","unstructured":"Tiwari A, Tiwari A, Samuel C. Supply chain flexibility: a comprehensive review. Manag Res Rev. 2015;38(7):767\u201392.","journal-title":"Manag Res Rev"},{"issue":"10","key":"3_CR3","doi-asserted-by":"publisher","first-page":"3133","DOI":"10.1080\/00207543.2016.1138151","volume":"54","author":"M P\u00e9rez P\u00e9rez","year":"2016","unstructured":"P\u00e9rez P\u00e9rez M, Serrano Bedia A, L\u00f3pez FM. Int J Prod Res. 2016;54(10):3133\u201348.","journal-title":"Int J Prod Res"},{"issue":"8","key":"3_CR4","doi-asserted-by":"publisher","first-page":"2252","DOI":"10.1080\/00207543.2011.575095","volume":"50","author":"IN Pujawan","year":"2012","unstructured":"Pujawan IN, Smart AU. Factors affecting schedule instability in manufacturing companies. Int J Prod Res. 2012;50(8):2252\u201366. https:\/\/doi.org\/10.1080\/00207543.2011.575095.","journal-title":"Int J Prod Res"},{"issue":"6","key":"3_CR5","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1007\/s10845-013-0852-9","volume":"26","author":"C Chen","year":"2015","unstructured":"Chen C, Xia B, Zhou BH, Xi L. A reinforcement learning based approach for a multiple-load carrier scheduling problem. J Intell Manuf. 2015;26(6):1233\u201345.","journal-title":"J Intell Manuf"},{"key":"3_CR6","volume-title":"Reinforcement learning applied to manufacturing material handling","author":"S Govindaiah","year":"2019","unstructured":"Govindaiah S. Reinforcement learning applied to manufacturing material handling. Huntsville: Ph.D. Dissertation, University of Alabama in Huntsville; 2019."},{"key":"3_CR7","unstructured":"Govindaiah S, Petty MD. A discrete event simulation-based multi-objective reinforcement learning reward function for optimizing manufacturing material handling. Proceedings of the 2019 Simulation Innovation Workshop. Orlando FL, February 11\u201315 2019."},{"key":"3_CR8","doi-asserted-by":"publisher","unstructured":"Govindaiah S, Petty MD. Applying reinforcement learning to plan manufacturing material handling, part 1: Background and formal problem specification. Proceedings of the 2019 ACM Southeast Conference. Kennesaw GA, April 18\u201320 2019, 168\u2013171; https:\/\/doi.org\/10.1145\/3299815.3314451.","DOI":"10.1145\/3299815.3314451"},{"key":"3_CR9","doi-asserted-by":"publisher","unstructured":"Govindaiah S, Petty MD. Applying reinforcement learning to plan manufacturing material handling, part 2: Experimentation and results. Proceedings of the 2019 ACM Southeast Conference. Kennesaw GA, April 18\u201320 2019, 16\u201323; https:\/\/doi.org\/10.1145\/3299815.3314427.","DOI":"10.1145\/3299815.3314427"},{"key":"3_CR10","volume-title":"Management of Business Logistics","author":"JJ Coyle","year":"1992","unstructured":"Coyle JJ. Management of Business Logistics. Mason: South-Western; 1992."},{"key":"3_CR11","doi-asserted-by":"publisher","unstructured":"White JA. Material handling research: Needs and opportunities. In: Material Handling \u201990, Progress in Material Handling and Logistics, Vol 2. Berlin: Springer, 1991; https:\/\/doi.org\/10.1007\/978-3-642-84356-3_1.","DOI":"10.1007\/978-3-642-84356-3_1"},{"key":"3_CR12","volume-title":"Manufacturing facilities design and material handling","author":"M Stephens","year":"2013","unstructured":"Stephens M, Meyers F. Manufacturing facilities design and material handling. West Lafayette: Purdue University Press; 2013."},{"issue":"19","key":"3_CR13","doi-asserted-by":"publisher","first-page":"5946","DOI":"10.1080\/00207543.2013.824627","volume":"51","author":"A Jain","year":"2013","unstructured":"Jain A, Jain P, Chan F, Singh S. A review on manufacturing flexibility. Int J Prod Res. 2013;51(19):5946\u201370.","journal-title":"Int J Prod Res"},{"key":"3_CR14","volume-title":"Pattern recognition and machine learning","author":"C Bishop","year":"2006","unstructured":"Bishop C. Pattern recognition and machine learning. New York: Springer; 2006."},{"key":"3_CR15","volume-title":"Artificial intelligence: a modern approach","author":"S Russell","year":"2021","unstructured":"Russell S, Norvig P. Artificial intelligence: a modern approach. 4th ed. Hoboken: Pearson Education Limited; 2021.","edition":"4"},{"key":"3_CR16","volume-title":"Reinforcement learning: an introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton RS, Barto AG. Reinforcement learning: an introduction. 2nd ed. Cambridge: MIT Press; 2018.","edition":"2"},{"key":"3_CR17","volume-title":"Introduction to machine learning","author":"E Alpaydin","year":"2020","unstructured":"Alpaydin E. Introduction to machine learning. 4th ed. Cambridge: The MIT Press; 2020.","edition":"4"},{"issue":"2","key":"3_CR18","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1007\/s13748-014-0057-2","volume":"3","author":"SE Barbosa","year":"2015","unstructured":"Barbosa SE, Petty MD. Exploiting spatio-temporal patterns using partial-state reinforcement learning in a synthetically-augmented environment. Prog Artif Intell. 2015;3(2):55\u201371.","journal-title":"Prog Artif Intell"},{"key":"3_CR19","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2020.101738","author":"JA Bland","year":"2020","unstructured":"Bland JA, Petty MD, Whitaker TS, Maxwell KP, Cantrell WA. Machine learning cyberattack and defense strategies. Comput Secur. 2020. https:\/\/doi.org\/10.1016\/j.cose.2020.101738.","journal-title":"Comput Secur."},{"key":"3_CR20","volume-title":"Dynamic programming","author":"R Bellman","year":"1957","unstructured":"Bellman R. Dynamic programming. Princeton: Princeton University Press; 1957."},{"issue":"2\u20133","key":"3_CR21","doi-asserted-by":"publisher","first-page":"169","DOI":"10.1016\/S0921-8890(00)00087-7","volume":"33","author":"ME Aydin","year":"2000","unstructured":"Aydin ME, \u00d6ztemel E. Dynamic job-shop scheduling using reinforcement learning agents. Robot Auton Syst. 2000;33(2\u20133):169\u201378.","journal-title":"Robot Auton Syst"},{"issue":"3\u20134","key":"3_CR22","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1007\/s00170-006-0465-y","volume":"33","author":"YC Wang","year":"2007","unstructured":"Wang YC, Usher JM. A reinforcement learning approach for developing routing policies in multi-agent production scheduling. The Int J Adv Manuf Technol. 2007;33(3\u20134):323\u201333.","journal-title":"The Int J Adv Manuf Technol"},{"issue":"5","key":"3_CR23","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1016\/j.simpat.2004.12.003","volume":"13","author":"CD Paternina-Arboleda","year":"2005","unstructured":"Paternina-Arboleda CD, Das TK. A multi-agent reinforcement learning approach to obtaining dynamic control policies for stochastic lot scheduling problem. Simul Model Pract Theory. 2005;13(5):389\u2013406.","journal-title":"Simul Model Pract Theory"},{"issue":"4","key":"3_CR24","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1080\/10429247.2010.11431878","volume":"22","author":"Z Sui","year":"2010","unstructured":"Sui Z, Gosavi A, Lin L. A reinforcement learning approach for inventory replenishment in vendor-managed inventory systems with consignment inventory. Eng Manag J. 2010;22(4):44\u201353.","journal-title":"Eng Manag J"},{"issue":"1","key":"3_CR25","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1016\/j.engappai.2004.08.018","volume":"18","author":"YC Wang","year":"2005","unstructured":"Wang YC, Usher JM. Application of reinforcement learning for agent-based production scheduling. Eng Appl Artif Intell. 2005;18(1):73\u201382.","journal-title":"Eng Appl Artif Intell"},{"key":"3_CR26","doi-asserted-by":"crossref","unstructured":"Gabel T, Riedmiller M. Adaptive reactive job-shop scheduling with reinforcement learning agents. 2008. http:\/\/ml.informatik.uni-freiburg.de\/former\/_media\/publications\/gr07.pdf. Accessed 10 May 2021.","DOI":"10.1109\/SCIS.2007.367699"},{"issue":"7","key":"3_CR27","doi-asserted-by":"publisher","first-page":"1089","DOI":"10.1016\/j.engappai.2009.01.014","volume":"22","author":"N Aissani","year":"2009","unstructured":"Aissani N, Beldjilali B, Trentesaux D. Dynamic scheduling of maintenance tasks in the petroleum industry: a reinforcement approach. Eng Appl Artif Intell. 2009;22(7):1089\u2013103.","journal-title":"Eng Appl Artif Intell"},{"key":"3_CR28","doi-asserted-by":"publisher","DOI":"10.1155\/2015\/597956","author":"J Dou","year":"2015","unstructured":"Dou J, Chen C, Yang P. Genetic scheduling and reinforcement learning in multirobot systems for intelligent warehouses. Math Probl Eng. 2015. https:\/\/doi.org\/10.1155\/2015\/597956.","journal-title":"Math Probl Eng"},{"issue":"6","key":"3_CR29","doi-asserted-by":"publisher","first-page":"1299","DOI":"10.1080\/00207540110118640","volume":"40","author":"P Pontrandolfo","year":"2002","unstructured":"Pontrandolfo P, Gosavi A, Okogbaa OG, Das TK. Global supply chain management: a reinforcement learning approach. Int J Prod Res. 2002;40(6):1299\u2013317.","journal-title":"Int J Prod Res"},{"issue":"3\u20134","key":"3_CR30","first-page":"279","volume":"8","author":"CJ Watkins","year":"1992","unstructured":"Watkins CJ, Dayan P. Q-learning. Mach Learn. 1992;8(3\u20134):279\u201392.","journal-title":"Mach Learn"},{"key":"3_CR31","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-45511-7","volume-title":"Multiple objective decision making\u2014methods and applications","author":"CL Hwang","year":"1979","unstructured":"Hwang CL, Masud AM. Multiple objective decision making\u2014methods and applications. Berlin: Springer; 1979."},{"issue":"1\u20132","key":"3_CR32","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1007\/s10994-010-5232-5","volume":"84","author":"P Vamplew","year":"2011","unstructured":"Vamplew P, Dazeley R, Berry A, Issabekov R, Dekker E. Empirical evaluation methods for multiobjective reinforcement learning algorithms. Mach Learn. 2011;84(1\u20132):51\u201380.","journal-title":"Mach Learn"},{"key":"3_CR33","doi-asserted-by":"crossref","unstructured":"Natarajan S, Tadepalli, P. Dynamic preferences in multi-criteria reinforcement learning. Proceedings of the 22nd international Conference on Machine Learning, 601\u2013608. Bonn: ACM; 2005.","DOI":"10.1145\/1102351.1102427"},{"key":"3_CR34","doi-asserted-by":"crossref","unstructured":"Van Moffaert K, Drugan MM, Now\u00e9 A. Scalarized multi-objective reinforcement learning: Novel design techniques. IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Singapore: IEEE Press; 2013, 191\u2013199.","DOI":"10.1109\/ADPRL.2013.6615007"},{"key":"3_CR35","unstructured":"G\u00e1bor Z, Kalm\u00e1r Z, Szepesv\u00e1ri C. Multi-criteria reinforcement learning. Proceedings of the 15th International Conference on Machine Learning. San Francisco: ACM; 1998, 197\u2013205."},{"key":"3_CR36","doi-asserted-by":"crossref","unstructured":"Van Moffaert K, Drugan MM, Now\u00e9 A. Hypervolume-based multi-objective reinforcement learning. Proceedings of the International Conference on Evolutionary Multi-Criterion Optimization. Berlin: Springer; 2013, 352\u2013366.","DOI":"10.1007\/978-3-642-37140-0_28"},{"key":"3_CR37","doi-asserted-by":"crossref","unstructured":"Barrett L, Narayanan S. Learning all optimal policies with multiple criteria. Proceedings of the 25th International Conference on Machine Learning. Helsinki: ACM; 2008, 41\u201347.","DOI":"10.1145\/1390156.1390162"},{"issue":"1","key":"3_CR38","first-page":"3483","volume":"15","author":"K Van Moffaert","year":"2014","unstructured":"Van Moffaert K, Now\u00e9 A. Multi-objective reinforcement learning using sets of pareto dominating policies. J Mach Learn Res. 2014;15(1):3483\u2013512.","journal-title":"J Mach Learn Res"},{"key":"3_CR39","unstructured":"Mossalam H, Assael YM, Roijers DM, Whiteson, S. Multi-objective deep reinforcement learning. 2016. https:\/\/arxiv.org\/ftp\/arxiv\/papers\/1803\/1803.02965.pdf. Accessed 10 May 2021."},{"issue":"4","key":"3_CR40","doi-asserted-by":"publisher","first-page":"3698","DOI":"10.1109\/TSG.2018.2834219","volume":"10","author":"E Mocanu","year":"2018","unstructured":"Mocanu E, Mocanu DC, Nguyen PH, Liotta A, Webber ME, Gibescu M, Slootweg JG. On-line building energy optimization using deep reinforcement learning. IEEE Trans Smart Grid. 2018;10(4):3698\u2013708.","journal-title":"IEEE Trans Smart Grid"},{"key":"3_CR41","first-page":"325","volume":"5","author":"S Mannor","year":"2004","unstructured":"Mannor S, Shimkin N. A geometric approach to multi-criterion reinforcement learning. J Mach Learn Res. 2004;5:325\u201360.","journal-title":"J Mach Learn Res"},{"key":"3_CR42","unstructured":"Zhang W, Dietterich TG. A reinforcement learning approach to job-shop scheduling. Proceedings of the 14th international joint conference on Artificial intelligence. Montreal: Morgan Kaufmann; 1995, 1114\u20131120."},{"key":"3_CR43","first-page":"1","volume":"1","author":"W Zhang","year":"2000","unstructured":"Zhang W, Dietterich TG. Solving combinatorial optimization tasks by reinforcement learning: a general methodology applied to resource-constrained scheduling. J Artif Intell Res. 2000;1:1\u201338.","journal-title":"J Artif Intell Res"},{"issue":"6","key":"3_CR44","doi-asserted-by":"publisher","first-page":"1087","DOI":"10.1063\/1.1699114","volume":"21","author":"N Metropolis","year":"1953","unstructured":"Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21(6):1087\u201392.","journal-title":"J Chem Phys"},{"key":"3_CR45","unstructured":"Plehn C, Stein F, Reinhart G. Modeling factory systems using graphs \u2013 Ontology-based design of a domain specific modeling method. Proceedings of the 20th International Conference on Engineering Design, Volume 4: Design for X, Design to X. Milan: The Design Society; 2015, 163\u2013172."},{"key":"3_CR46","volume-title":"Introduction to algorithms","author":"TH Cormen","year":"2009","unstructured":"Cormen TH, Leiserson CE, Rivest RL, Stein C. Introduction to algorithms. 3rd ed. Cambridge: The MIT Press; 2009.","edition":"3"},{"key":"3_CR47","doi-asserted-by":"publisher","DOI":"10.1177\/0037549718809542","author":"R Rooeinfar","year":"2018","unstructured":"Rooeinfar R, Raissi S, Ghezavati VR. Stochastic flexible flow shop scheduling problem with limited buffers and fixed interval preventive maintenance: a hybrid approach of simulation and metaheuristic algorithms. Trans Soc Model Simul Int. 2018. https:\/\/doi.org\/10.1177\/0037549718809542.","journal-title":"Trans Soc Model Simul Int"},{"key":"3_CR48","volume-title":"Discrete event system simulation","author":"J Banks","year":"2010","unstructured":"Banks J, Carson JS, Nelson BL, Nicol DM. Discrete event system simulation. 5th ed. Upper Saddle River: Prentice Hall; 2010.","edition":"5"},{"issue":"3","key":"3_CR49","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1287\/opre.15.3.537","volume":"15","author":"PC Fishburn","year":"1967","unstructured":"Fishburn PC. Additive utilities with incomplete product sets: application to priorities and assignments. Oper Res. 1967;15(3):537\u201342.","journal-title":"Oper Res"}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-021-00003-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-021-00003-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-021-00003-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,8]],"date-time":"2024-09-08T20:52:24Z","timestamp":1725828744000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-021-00003-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,22]]},"references-count":49,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["3"],"URL":"https:\/\/doi.org\/10.1007\/s44163-021-00003-3","relation":{},"ISSN":["2731-0809"],"issn-type":[{"value":"2731-0809","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,22]]},"assertion":[{"value":"10 May 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 August 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 September 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that there are no conflicts of interest or competing interests associated with this article or with the research it describes.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"8"}}