{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T07:14:33Z","timestamp":1771658073192,"version":"3.50.1"},"reference-count":86,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2022,8,30]],"date-time":"2022-08-30T00:00:00Z","timestamp":1661817600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)","award":["491268466"],"award-info":[{"award-number":["491268466"]}]},{"name":"Hamburg University of Technology (TUHH)","award":["491268466"],"award-info":[{"award-number":["491268466"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>Controlling a fleet of autonomous mobile robots (AMR) is a complex problem of optimization. Many approached have been conducted for solving this problem. They range from heuristics, which usually do not find an optimum, to mathematical models, which are limited due to their high computational effort. Machine Learning (ML) methods offer another potential trajectory for solving such complex problems. The focus of this brief survey is on Reinforcement Learning (RL) as a particular type of ML. Due to the reward-based optimization, RL offers a good basis for the control of fleets of AMR. In the context of this survey, different control approaches are investigated and the aspects of fleet control of AMR with respect to RL are evaluated. As a result, six fundamental key problems should be put on the current research agenda to enable a broader application in industry: (1) overcoming the \u201csim-to-real gap\u201d, (2) increasing the robustness of algorithms, (3) improving data efficiency, (4) integrating different fields of application, (5) enabling heterogeneous fleets with different types of AMR and (6) handling of deadlocks.<\/jats:p>","DOI":"10.3390\/robotics11050085","type":"journal-article","created":{"date-parts":[[2022,8,31]],"date-time":"2022-08-31T00:13:56Z","timestamp":1661904836000},"page":"85","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Controlling Fleets of Autonomous Mobile Robots with Reinforcement Learning: A Brief Survey"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1797-6168","authenticated-orcid":false,"given":"Mike","family":"Wesselh\u00f6ft","sequence":"first","affiliation":[{"name":"Institute for Technical Logistics, Hamburg University of Technology, 21079 Hamburg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9823-7679","authenticated-orcid":false,"given":"Johannes","family":"Hinckeldeyn","sequence":"additional","affiliation":[{"name":"Institute for Technical Logistics, Hamburg University of Technology, 21079 Hamburg, Germany"}]},{"given":"Jochen","family":"Kreutzfeldt","sequence":"additional","affiliation":[{"name":"Institute for Technical Logistics, Hamburg University of Technology, 21079 Hamburg, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,30]]},"reference":[{"key":"ref_1","unstructured":"International Federation of Robotics (2020). World Robotics Report 2020, International Federation of Robotics."},{"key":"ref_2","unstructured":"International Federation of Robotics (2021). 
Robot Sales Rise Again, International Federation of Robotics."},{"key":"ref_3","unstructured":"The Logistics IQ (2021). AGV-AMR Market Map 2021, The Logistics IQ."},{"key":"ref_4","unstructured":"Steeb, R., Cammarata, S., Hayes-Roth, F.A., Thorndyke, P.W., and Wesson, R.B. (1981). Distributed Intelligence for Air Fleet Control. Readings in Distributed Artificial Intelligence, Elsevier."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Naumov, V., Kubek, D., Wi\u0119cek, P., Skalna, I., Duda, J., Goncerz, R., and Derlecki, T. (2021). Optimizing Energy Consumption in Internal Transportation Using Dynamic Transportation Vehicles Assignment Model: Case Study in Printing Company. Energies, 14.","DOI":"10.3390\/en14154557"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Alexovi\u010d, S., Lacko, M., Ba\u010d\u00edk, J., and Perdukov\u00e1, D. (2021). Introduction into Autonomous Mobile Robot Research and Multi Cooperation, Springer.","DOI":"10.1007\/978-3-030-77445-5_30"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"ref_8","unstructured":"Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Paino, A., Plappert, M., Powell, G., and Ribas, R. (2019). Solving Rubik\u2019s Cube with a robot hand. arXiv."},{"key":"ref_9","unstructured":"Sutton, R.S., and Barto, A.G. (2011). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2042","DOI":"10.1109\/TNNLS.2017.2773458","article-title":"Optimal and Autonomous Control Using Reinforcement Learning: A Survey","volume":"29","author":"Kiumarsi","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1016\/j.apenergy.2018.11.002","article-title":"Reinforcement learning for demand response: A review of algorithms and modeling techniques","volume":"235","author":"Nagy","year":"2019","journal-title":"Appl. Energy"},{"key":"ref_12","first-page":"96","article-title":"Mobile Robot Navigation and Obstacle Avoidance Techniques: A Review","volume":"2","author":"Pandey","year":"2017","journal-title":"Int. Robot. Autom. J."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Panchpor, A.A., Shue, S., and Conrad, J.M. (2018, January 4\u20135). A survey of methods for mobile robot localization and mapping in dynamic indoor environments. Proceedings of the 2018 Conference on Signal Processing and Communication Engineering Systems (SPACES), Vaddeswaram, India.","DOI":"10.1109\/SPACES.2018.8316333"},{"key":"ref_14","unstructured":"Shabbir, J., and Anwer, T. (2018). A Survey of Deep Learning Techniques for Mobile Robot Applications. arXiv."},{"key":"ref_15","unstructured":"Farazi, N.P., Ahamed, T., Barua, L., and Zou, B. (2020). Deep Reinforcement Learning and Transportation Research: A Comprehensive Review. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Singh, P., Tiwari, R., and Bhattacharya, M. (2016, March 11\u201313). Navigation in Multi Robot system using cooperative learning: A survey. 
Proceedings of the 2016 International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT), New Delhi, India.","DOI":"10.1109\/ICCTICT.2016.7514569"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3826","DOI":"10.1109\/TCYB.2020.2977374","article-title":"Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications","volume":"50","author":"Nguyen","year":"2020","journal-title":"IEEE Trans. Cybern."},{"key":"ref_18","unstructured":"OroojlooyJadid, A., and Hajinezhad, D. (2019). A Review of Cooperative Multi-Agent Deep Reinforcement Learning. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1109\/TCDS.2018.2840971","article-title":"Decision Making in Multiagent Systems: A Survey","volume":"10","author":"Rizk","year":"2018","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"114660","DOI":"10.1016\/j.eswa.2021.114660","article-title":"Trajectory Planning for Multi-Robot Systems: Methods and Applications","volume":"173","author":"Madridano","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_21","first-page":"027836492098785","article-title":"How to Train Your Robot with Deep Reinforcement Learning: Lessons We\u2019ve Learned","volume":"7","author":"Ibarz","year":"2021","journal-title":"Int. J. Robot. Res."},{"key":"ref_22","unstructured":"Xiao, X., Liu, B., Warnell, G., and Stone, P. (2020). Motion Control for Mobile Robot Navigation Using Machine Learning: A Survey. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Jiang, H., Wang, H., Yau, W.Y., and Wan, K.W. (2020, November 9\u201313). A Brief Survey: Deep Reinforcement Learning in Mobile Robot Navigation. Proceedings of the 2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway.","DOI":"10.1109\/ICIEA48937.2020.9248288"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1109\/TITS.2020.3024655","article-title":"Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles","volume":"23","author":"Aradi","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"674","DOI":"10.26599\/TST.2021.9010012","article-title":"Deep reinforcement learning based mobile robot navigation: A review","volume":"26","author":"Zhu","year":"2021","journal-title":"Tsinghua Sci. Technol."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1111\/1467-8551.00375","article-title":"Towards a Methodology for Developing Evidence-Informed Management Knowledge by Means of Systematic Review","volume":"14","author":"Tranfield","year":"2003","journal-title":"Br. J. Manag."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/S0004-3702(98)00023-X","article-title":"Planning and acting in partially observable stochastic domains","volume":"101","author":"Kaelbling","year":"1998","journal-title":"Artif. Intell."},{"key":"ref_28","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv."},{"key":"ref_29","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. 
arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1007\/978-3-030-65955-4_16","article-title":"Deep Reinforcement Learning for Solving AGVs Routing Problem","volume":"Volume 12519","author":"Lu","year":"2020","journal-title":"Verification and Evaluation of Computer and Communication Systems"},{"key":"ref_31","unstructured":"Zhang, D. (2021, January 14\u201316). Action-limited, Multimodal Deep Q Learning for AGV Fleet Route Planning. Proceedings of the Proceedings of the 5th International Conference on Control Engineering and Artificial Intelligence, Sanya, China."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1109\/LCSYS.2020.3007663","article-title":"Integral Reinforcement Learning-Based Multi-Robot Minimum Time-Energy Path Planning Subject to Collision Avoidance and Unknown Environmental Disturbances","volume":"5","author":"He","year":"2021","journal-title":"IEEE Control. Syst. Lett."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"4163","DOI":"10.1109\/LRA.2021.3068955","article-title":"Learning to Herd Agents Amongst Obstacles: Training Robust Shepherding Behaviors Using Deep Reinforcement Learning","volume":"6","author":"Zhi","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Meerza, S.I.A., Islam, M., and Uzzal, M.M. (2019, January 3\u20135). Q-Learning Based Particle Swarm Optimization Algorithm for Optimal Path Planning of Swarm of Mobile Robots. Proceedings of the 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT 2019), Dhaka, Bangladesh.","DOI":"10.1109\/ICASERT.2019.8934450"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"6807","DOI":"10.1109\/TITS.2021.3062500","article-title":"Reinforcement Learning and Particle Swarm Optimization Supporting Real-Time Rescue Assignments for Multiple Autonomous Underwater Vehicles","volume":"23","author":"Wu","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, M., Zeng, B., and Wang, Q. (2021). Research on Motion Planning Based on Flocking Control and Reinforcement Learning for Multi-Robot Systems. Machines, 9.","DOI":"10.3390\/machines9040077"},{"key":"ref_37","unstructured":"Vogel-Heuser, B. (2018, January 20\u201324). Performance Evaluation of the Dyna-Q algorithm for Robot Navigation. Proceedings of the 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), Munich, Germany."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"042008","DOI":"10.1088\/1742-6596\/1624\/4\/042008","article-title":"Multi-Robot Path Planning Method Based on Prior Knowledge and Q-learning Algorithms","volume":"1624","author":"Li","year":"2020","journal-title":"J. Physics Conf. Ser."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Bae, H., Kim, G., Kim, J., Qian, D., and Lee, S. (2019). Multi-Robot Path Planning Method Using Reinforcement Learning. Appl. Sci., 9.","DOI":"10.3390\/app9153057"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"2378","DOI":"10.1109\/LRA.2019.2903261","article-title":"PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning","volume":"4","author":"Sartoretti","year":"2019","journal-title":"IEEE Robot. Autom. 
Lett."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2666","DOI":"10.1109\/LRA.2021.3062803","article-title":"PRIMAL2: Pathfinding Via Reinforcement and Imitation Multi-Agent Learning\u2014Lifelong","volume":"6","author":"Damani","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"6932","DOI":"10.1109\/LRA.2020.3026638","article-title":"Mobile Robot Path Planning in Dynamic Environments through Globally Guided Reinforcement Learning","volume":"5","author":"Wang","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Liu, Z., Chen, B., Zhou, H., Koushik, G., Hebert, M., and Zhao, D. (2020, January 25\u201329). MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9340876"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1455","DOI":"10.1109\/LRA.2021.3139145","article-title":"Learning Selective Communication for Multi-Agent Path Finding","volume":"7","author":"Ma","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ma, Z., Luo, Y., and Ma, H. (June, January 30). Distributed Heuristic Multi-Agent Path Finding with Communication. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi\u2019an, China.","DOI":"10.1109\/ICRA48506.2021.9560748"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"14413","DOI":"10.1109\/TVT.2020.3034800","article-title":"Voronoi-Based Multi-Robot Autonomous Exploration in Unknown Environments via Deep Reinforcement Learning","volume":"69","author":"Hu","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"929","DOI":"10.1007\/s10514-015-9503-7","article-title":"Cooperative multi-robot patrol with Bayesian learning","volume":"40","author":"Portugal","year":"2016","journal-title":"Auton. Robot."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"35287","DOI":"10.1109\/ACCESS.2022.3163393","article-title":"A Low-Cost Q-Learning-Based Approach to Handle Continuous Space Problems for Decentralized Multi-Agent Robot Navigation in Cluttered Environments","volume":"10","author":"Ajabshir","year":"2022","journal-title":"IEEE Access"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Chen, Y.F., Liu, M., Everett, M., and How, J.P. (June, January 29). Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989037"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1177\/0278364920916531","article-title":"Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios","volume":"39","author":"Fan","year":"2020","journal-title":"Int. J. Robot. Res."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Yao, S., Chen, G., Pan, L., Ma, J., Ji, J., and Chen, X. (2020, January 9\u201311). Multi-Robot Collision Avoidance with Map-based Deep Reinforcement Learning. 
Proceedings of the 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), Baltimore, MD, USA.","DOI":"10.1109\/ICTAI50040.2020.00088"},{"key":"ref_52","unstructured":"Fan, T., Long, P., Liu, W., and Pan, J. (2018). Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Long, P., Fan, T., Liao, X., Liu, W., Zhang, H., and Pan, J. (2018, May 21\u201325). Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8461113"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"8379","DOI":"10.1109\/LRA.2021.3102636","article-title":"Decentralized Multi-Robot Collision Avoidance in Complex Scenarios With Selective Communication","volume":"6","author":"Zhai","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"3221","DOI":"10.1109\/LRA.2020.2974695","article-title":"Multi-Agent Motion Planning for Dense and Dynamic Environments via Deep Reinforcement Learning","volume":"5","author":"Semnani","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Chen, W., Zhou, S., Pan, Z., Zheng, H., and Liu, Y. (2019). Mapless Collaborative Navigation for a Multi-Robot System Based on the Deep Reinforcement Learning. Appl. Sci., 9.","DOI":"10.3390\/app9204198"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"4616","DOI":"10.1109\/LRA.2021.3068662","article-title":"Where to go Next: Learning a Subgoal Recommendation Policy for Navigation in Dynamic Environments","volume":"6","author":"Brito","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Han, R., Chen, S., and Hao, Q. (2020, May 31\u2013August 31). Cooperative Multi-Robot Navigation in Dynamic Environment with Deep Reinforcement Learning. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197209"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Vera, J.M., and Abad, A.G. (2019, November 11\u201315). Deep Reinforcement Learning for Routing a Heterogeneous Fleet of Vehicles. Proceedings of the 2019 IEEE Latin American Conference on Computational Intelligence (LA-CCI), Guayaquil, Ecuador.","DOI":"10.1109\/LA-CCI47412.2019.9037042"},{"key":"ref_60","unstructured":"Google Inc. (2019). Google\u2019s Optimization Tools (OR-Tools), Google Inc."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"8086","DOI":"10.1109\/LRA.2021.3103054","article-title":"SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots","volume":"6","author":"Schperberg","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_62","unstructured":"Zhang, Y., Qian, Y., Yao, Y., Hu, H., and Xu, Y. (2020). Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding. Auton. Agents Multiagent Syst., 2077\u20132079."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Lin, K., Zhao, R., Xu, Z., and Zhou, J. (2018, August 19\u201323). Efficient Large-Scale Fleet Management via Multi-Agent Deep Reinforcement Learning. 
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK.","DOI":"10.1145\/3219819.3219993"},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1080\/15472450.2020.1852082","article-title":"Reinforcement learning-enabled genetic algorithm for school bus scheduling","volume":"26","author":"Li","year":"2022","journal-title":"J. Intell. Transp. Syst."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"13861","DOI":"10.1109\/TVT.2020.3029864","article-title":"Scalable Parallel Task Scheduling for Autonomous Driving Using Multi-Task Deep Reinforcement Learning","volume":"69","author":"Qi","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_66","unstructured":"Baru, C. (2019, December 9\u201312). Multi-task Deep Reinforcement Learning for Scalable Parallel Task Scheduling. Proceedings of the 2019 IEEE International Conference on Big Data, Los Angeles, CA, USA."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Xue, T., Zeng, P., and Yu, H. (2018, February 20\u201322). A reinforcement learning method for multi-AGV scheduling in manufacturing. Proceedings of the 2018 IEEE International Conference on Industrial Technology (ICIT), Lyon, France.","DOI":"10.1109\/ICIT.2018.8352413"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Zhang, C., Odonkor, P., Zheng, S., Khorasgani, H., Serita, S., Gupta, C., and Wang, H. (2020, December 10\u201313). Dynamic Dispatching for Large-Scale Heterogeneous Fleet via Multi-agent Deep Reinforcement Learning. Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA.","DOI":"10.1109\/BigData50022.2020.9378191"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Elfakharany, A., and Ismail, Z.H. (2021). End-to-End Deep Reinforcement Learning for Decentralized Task Allocation and Navigation for a Multi-Robot System. Appl. Sci., 11.","DOI":"10.3390\/app11072895"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Li, M.P., Sankaran, P., Kuhl, M.E., Ptucha, R., Ganguly, A., and Kwasinski, A. (2019, December 8\u201311). Task Selection by Autonomous Mobile Robots in a Warehouse Using Deep Reinforcement Learning. Proceedings of the 2019 Winter Simulation Conference (WSC), National Harbor, MD, USA.","DOI":"10.1109\/WSC40007.2019.9004792"},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1017\/pds.2021.17","article-title":"A multi-agent reinforcement learning framework for intelligent manufacturing with autonomous mobile robots","volume":"1","author":"Agrawal","year":"2021","journal-title":"Proc. Des. Soc."},{"key":"ref_72","first-page":"1328","article-title":"Certified Adversarial Robustness for Deep Reinforcement Learning","volume":"Volume 100","author":"L\u00fctjens","year":"2020","journal-title":"Proceedings of the Conference on Robot Learning"},{"key":"ref_73","unstructured":"Verband der Automobilindustrie (2020). Interface for the Communication between Automated Guided Vehicles (AGV) and a Master Control: VDA5050, VDA."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Weinstock, C.B., and Goodenough, J.B. (2006). On System Scalability, Defense Technical Information Center.","DOI":"10.21236\/ADA457003"},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1145\/234313.234424","article-title":"Interoperability","volume":"28","author":"Wegner","year":"1996","journal-title":"ACM Comput. 
Surv."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"107252","DOI":"10.1016\/j.cie.2021.107252","article-title":"A novel reinforcement learning-based hyper-heuristic for heterogeneous vehicle routing problem","volume":"156","author":"Qin","year":"2021","journal-title":"Comput. Ind. Eng."},{"key":"ref_77","first-page":"1162","article-title":"Active Domain Randomization","volume":"100","author":"Mehta","year":"2020","journal-title":"Conf. Robot. Learn."},{"key":"ref_78","unstructured":"Vuong, Q., Vikram, S., Su, H., Gao, S., and Christensen, H.I. (2019). How to Pick the Domain Randomization Parameters for Sim-to-Real Transfer of Reinforcement Learning Policies?. arXiv."},{"key":"ref_79","unstructured":"He, Z., Rakin, A.S., and Fan, D. (2019, January 8\u201314). Certified Adversarial Robustness with Additive Noise. Proceedings of the 32th Conference Advances in neural information processing systems, Vancouver, BC, Canada."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Zhao, W., Queralta, J.P., and Westerlund, T. (2020, January 1\u20134). Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: A Survey. Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia.","DOI":"10.1109\/SSCI47803.2020.9308468"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"1360","DOI":"10.1109\/TRO.2012.2210294","article-title":"Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation","volume":"28","author":"Stulp","year":"2012","journal-title":"IEEE Trans. Robot."},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Sledge, I.J., Bryner, D.W., and Principe, J.C. (2022). Annotating Motion Primitives for Simplifying Action Search in Reinforcement Learning. IEEE Trans. Emerg. Top. Comput. Intell., 1\u201320.","DOI":"10.1109\/TETCI.2021.3132365"},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"2393","DOI":"10.1109\/TII.2019.2936167","article-title":"End-to-End Navigation Strategy With Deep Reinforcement Learning for Mobile Robots","volume":"16","author":"Shi","year":"2020","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_84","doi-asserted-by":"crossref","first-page":"2007","DOI":"10.1109\/LRA.2019.2899918","article-title":"Learning Navigation Behaviors End-to-End With AutoRL","volume":"4","author":"Chiang","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Wu, J., Wang, R., Li, R., Zhang, H., and Hu, X. (2018, January 7\u201310). Multi-critic DDPG Method and Double Experience Replay. Proceedings of the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan.","DOI":"10.1109\/SMC.2018.00039"},{"key":"ref_86","unstructured":"Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2016). Prioritized Experience Replay. 
arXiv."}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/11\/5\/85\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:20:23Z","timestamp":1760142023000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/11\/5\/85"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,30]]},"references-count":86,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2022,10]]}},"alternative-id":["robotics11050085"],"URL":"https:\/\/doi.org\/10.3390\/robotics11050085","relation":{},"ISSN":["2218-6581"],"issn-type":[{"value":"2218-6581","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,30]]}}}