{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T03:56:00Z","timestamp":1777348560671,"version":"3.51.4"},"reference-count":32,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2020,6,16]],"date-time":"2020-06-16T00:00:00Z","timestamp":1592265600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["2018R1A6A1A03025526"],"award-info":[{"award-number":["2018R1A6A1A03025526"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Intralogistics is a technology that optimizes, integrates, automates, and manages the logistics flow of goods within a logistics transportation and sortation center. As the demand for parcel transportation increases, many sortation systems have been developed. In general, the goal of sortation systems is to route (or sort) parcels correctly and quickly. We design an n-grid sortation system that can be flexibly deployed and used at intralogistics warehouse and develop a collaborative multi-agent reinforcement learning (RL) algorithm to control the behavior of emitters or sorters in the system. We present two types of RL agents, emission agents and routing agents, and they are trained to achieve the given sortation goals together. For the verification of the proposed system and algorithm, we implement them in a full-fledged cyber-physical system simulator and describe the RL agents\u2019 learning performance. From the learning results, we present that the well-trained collaborative RL agents can optimize their performance effectively. In particular, the routing agents finally learn to route the parcels through their optimal paths, while the emission agents finally learn to balance the inflow and outflow of parcels.<\/jats:p>","DOI":"10.3390\/s20123401","type":"journal-article","created":{"date-parts":[[2020,6,16]],"date-time":"2020-06-16T13:20:43Z","timestamp":1592313643000},"page":"3401","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Sortation Control Using Multi-Agent Deep Reinforcement Learning in N-Grid Sortation System"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6406-3092","authenticated-orcid":false,"given":"Ju-Bong","family":"Kim","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Korea University of Technology and Education, Cheonan 31253, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ho-Bin","family":"Choi","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Korea University of Technology and Education, Cheonan 31253, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gyu-Young","family":"Hwang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Korea University of Technology and Education, Cheonan 31253, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kwihoon","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Knowledge-Converged Super Brain Convergence Research, Electronics and Telecommunications Research Institute, Daejeon 34129, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yong-Geun","family":"Hong","sequence":"additional","affiliation":[{"name":"Department of Knowledge-Converged Super Brain Convergence Research, Electronics and Telecommunications Research Institute, Daejeon 34129, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5835-7972","authenticated-orcid":false,"given":"Youn-Hee","family":"Han","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Korea University of Technology and Education, Cheonan 31253, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,6,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Benzi, F., Bassi, E., Marabelli, F., Belloni, N., and Lombardi, M. (2019, January 18\u201320). IIoT-based Motion Control Efficiency in Automated Warehouses. Proceedings of the AEIT International Annual Conference (AEIT), Florence, Italy.","DOI":"10.23919\/AEIT.2019.8893370"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kirks, T., Jost, J., Uhlott, T., and Jakobs, M. (2018, January 4\u20137). Towards Complex Adaptive Control Systems for Human-Robot- Interaction in Intralogistics. Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA.","DOI":"10.1109\/ITSC.2018.8569949"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Harrison, R. (2019, January 10\u201313). Dynamically Integrating Manufacturing Automation with Logistics. Proceedings of the 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Zaragoza, Spain.","DOI":"10.1109\/ETFA.2019.8869405"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Seibold, Z., Stoll, T., and Furmans, K. (2013, January 15\u201318). Layout-optimized Sorting of Goods with Decentralized Controlled Conveying modules. Proceedings of the 2013 IEEE International Systems Conference (SysCon), Orlando, FL, USA.","DOI":"10.1109\/SysCon.2013.6549948"},{"key":"ref_5","first-page":"866","article-title":"A Sortation System Model","volume":"1","author":"Jayaraman","year":"1997","journal-title":"Winter Simul. Conf. Proc"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Beyer, T., Jazdi, N., Gohner, P., and Yousefifar, R. (2015, January 8\u201311). Knowledge-based planning and adaptation of industrial automation systems. Proceedings of the 2015 IEEE 20th Conference on Emerging Technologies Factory Automation (ETFA), Luxembourg.","DOI":"10.1109\/ETFA.2015.7301635"},{"key":"ref_7","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. [2nd ed.]."},{"key":"ref_8","unstructured":"Hasselt, H.v., Guez, A., and Silver, D. (2016, January 12\u201317). Deep Reinforcement Learning with Double Q-Learning. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s10846-018-0891-8","article-title":"A Deep Reinforcement Learning Strategy for UAV Autonomous Landing on a Moving Platform","volume":"93","author":"Sampedro","year":"2019","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"36682","DOI":"10.1109\/ACCESS.2019.2905621","article-title":"Imitation Reinforcement Learning-Based Remote Rotary Inverted Pendulum Control in OpenFlow Network","volume":"7","author":"Kim","year":"2019","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xiao, D., and Tan, A. (2008, January 9\u201312). Scaling Up Multi-agent Reinforcement Learning in Complex Domains. Proceedings of the 2008 IEEE\/WIC\/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Sydney, NSW, Australia.","DOI":"10.1109\/WIIAT.2008.259"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1109\/TSMCC.2007.913919","article-title":"A Comprehensive Survey of Multiagent Reinforcement Learning","volume":"38","author":"Busoniu","year":"2008","journal-title":"Trans. Sys. Man Cyber Part C"},{"key":"ref_14","unstructured":"Foerster, J.N., Assael, Y.M., de Freitas, N., and Whiteson, S. (2016, January 5\u201310). Learning to Communicate with Deep Multi-agent Reinforcement Learning. Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS\u201916, Barcelona, Spain."},{"key":"ref_15","first-page":"2681","article-title":"Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability","volume":"Volume 70","author":"Precup","year":"2017","journal-title":"Proceedings of the 34th International Conference on Machine Learning"},{"key":"ref_16","unstructured":"Palmer, G., Tuyls, K., Bloembergen, D., and Savani, R. (2018, January 10\u201315). Lenient Multi-Agent Deep Reinforcement Learning. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Richland, SC, USA."},{"key":"ref_17","unstructured":"(2020). Emergent Tool Use From Multi-Agent Autocurricula. Submitted to International Conference on Learning Representations. under review."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, S., Wan, J., Zhang, D., Li, D., and Zhang, C. (2016). Towards Smart Factory for Industry 4.0: A Self-organized Multi-agent System with Big Data Based Feedback and Coordination. Comput. Netw., 101.","DOI":"10.1016\/j.comnet.2015.12.017"},{"key":"ref_19","unstructured":"Rosendahl, R., Cala, A., Kirchheim, K., Luder, A., and D\u2019Agostino, N. (2018). Towards Smart Factory: Multi-Agent Integration on Industrial Standards for Service-oriented Communication and Semantic Data Exchange."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1287\/msom.4.4.258.5732","article-title":"Performance Analysis of Split-Case Sorting Systems","volume":"4","author":"Johnson","year":"2002","journal-title":"Manuf. Serv. Oper. Manag."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.2174\/1874110X01408011116","article-title":"Simulation Design of Express Sorting System - Example of SF\u2019s Sorting Center","volume":"8","author":"Pan","year":"2014","journal-title":"Open Cybern. Syst. J."},{"key":"ref_22","unstructured":"(2019, December 07). Gebhardt GridSorter - Decentralized Plug&Play Sorter & Sequenzer. Available online: https:\/\/www.gebhardt-foerdertechnik.de\/en\/products\/sorting-technology\/gridsorter\/."},{"key":"ref_23","unstructured":"(2019, December 07). Factoryio Features. Available online: https:\/\/factoryio.com\/features."},{"key":"ref_24","unstructured":"Lem, H.J., and Mahwah, N. (2019, December 07). Conveyor Sortation System. 914155. Available online: https:\/\/patentimages.storage.googleapis.com\/2b\/2f\/56\/6eaeeaeb32b18d\/US4249661.pdf."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1080\/00207548208947789","article-title":"An analytical model for recirculating conveyors with stochastic inputs and outputs","volume":"20","author":"Sonderman","year":"1982","journal-title":"Int. J. Prod. Res."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/0377-2217(88)90028-8","article-title":"Analytical solution of closed-loop conveyor systems with discrete and deterministic material flow","volume":"35","author":"Bastani","year":"1988","journal-title":"Eur. J. Oper. Res."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, J.C., Huang, C., Chen, T., and Lee, Y. (2019, January 12\u201315). Solving a Sortation Conveyor Layout Design Problem with Simulation-optimization Approach. Proceedings of the 2019 IEEE 6th International Conference on Industrial Engineering and Applications (ICIEA), Tokyo, Japan.","DOI":"10.1109\/IEA.2019.8715232"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Westbrink, F., Sivanandan, R., Sch\u00fctte, T., and Schwung, A. (2019, January 8\u201312). Design approach and simulation of a peristaltic sortation machine. Proceedings of the IEEE\/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Hong Kong, China.","DOI":"10.1109\/AIM.2019.8868435"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Arulkumaran, K., Deisenroth, M.P., Brundage, M., and Bharath, A.A. (2017). A Brief Survey of Deep Reinforcement Learning. arXiv.","DOI":"10.1109\/MSP.2017.2743240"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/BF00992698","article-title":"Technical Note: Q-Learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_31","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv."},{"key":"ref_32","unstructured":"Cowan, J.D., Tesauro, G., and Alspector, J. (1994). Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach. Advances in Neural Information Processing Systems 6, The MIT Press."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/12\/3401\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:39:36Z","timestamp":1760175576000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/12\/3401"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,16]]},"references-count":32,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2020,6]]}},"alternative-id":["s20123401"],"URL":"https:\/\/doi.org\/10.3390\/s20123401","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,16]]}}}