{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T15:04:43Z","timestamp":1773414283510,"version":"3.50.1"},"reference-count":55,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2024,6,5]],"date-time":"2024-06-05T00:00:00Z","timestamp":1717545600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004585","name":"project IGA BUT","doi-asserted-by":"publisher","award":["FSI-S-23-839"],"award-info":[{"award-number":["FSI-S-23-839"]}],"id":[{"id":"10.13039\/501100004585","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>The use of robot manipulators in engineering applications and scientific research has significantly increased in recent years. This can be attributed to the rise of technologies such as autonomous robotics and physics-based simulation, along with the utilization of artificial intelligence techniques. The use of these technologies may be limited due to a focus on a specific type of robotic manipulator and a particular solved task, which can hinder modularity and reproducibility in future expansions. This paper presents a method for planning motion across a wide range of robotic structures using deep reinforcement learning (DRL) algorithms to solve the problem of reaching a static or random target within a pre-defined configuration space. The paper addresses the challenge of motion planning in environments under a variety of conditions, including environments with and without the presence of collision objects. 
It highlights the versatility and potential for future expansion through the integration of OpenAI Gym and the PyBullet physics-based simulator.<\/jats:p>","DOI":"10.3390\/computation12060116","type":"journal-article","created":{"date-parts":[[2024,6,5]],"date-time":"2024-06-05T10:05:50Z","timestamp":1717581950000},"page":"116","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Deep-Reinforcement-Learning-Based Motion Planning for a Wide Range of Robotic Structures"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2715-7820","authenticated-orcid":false,"given":"Roman","family":"Par\u00e1k","sequence":"first","affiliation":[{"name":"Institute of Automation and Computer Science, Brno University of Technology, 61600 Brno, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4372-2105","authenticated-orcid":false,"given":"Jakub","family":"K\u016fdela","sequence":"additional","affiliation":[{"name":"Institute of Automation and Computer Science, Brno University of Technology, 61600 Brno, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3142-0900","authenticated-orcid":false,"given":"Radomil","family":"Matou\u0161ek","sequence":"additional","affiliation":[{"name":"Institute of Automation and Computer Science, Brno University of Technology, 61600 Brno, Czech Republic"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7943-8659","authenticated-orcid":false,"given":"Martin","family":"Ju\u0159\u00ed\u010dek","sequence":"additional","affiliation":[{"name":"Institute of Automation and Computer Science, Brno University of Technology, 61600 Brno, Czech Republic"}]}],"member":"1968","published-online":{"date-parts":[[2024,6,5]]},"reference":[{"key":"ref_1","unstructured":"Uygun, Y. (2024, January 05). The Fourth Industrial Revolution-Industry 4.0. 
Available online: https:\/\/ssrn.com\/abstract=3909340."},{"key":"ref_2","first-page":"761","article-title":"How to define industry 4.0: Main pillars of industry 4.0","volume":"761","author":"Erboz","year":"2017","journal-title":"Manag. Trends Dev. Enterp. Glob. Era"},{"key":"ref_3","first-page":"315","article-title":"Prospects for development movement in the industry concept 4.0","volume":"2","author":"Palka","year":"2019","journal-title":"Multidiscip. Asp. Prod. Eng."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Siciliano, B., and Khatib, O. (2016). Springer Handbook of Robotics, Springer.","DOI":"10.1007\/978-3-319-32552-1"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Siciliano, B., Sciavicco, L., Villani, L., and Oriolo, G. (2008). Robotics: Modelling, Planning and Control, Springer Publishing Company, Incorporated. [1st ed.].","DOI":"10.1007\/978-1-84628-642-1"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1007\/s00170-021-07985-5","article-title":"Benchmarking and optimization of robot motion planning with motion planning pipeline","volume":"118","author":"Liu","year":"2022","journal-title":"Int. J. Adv. Manuf. Technol."},{"key":"ref_7","unstructured":"Xanthidis, M.P., Esposito, J.M., Rekleitis, I., and O\u2019Kane, J.M. (2018). Analysis of motion planning by sampling in subspaces of progressively increasing dimension. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"95046","DOI":"10.1109\/ACCESS.2019.2928846","article-title":"Bidirectional potential guided RRT* for motion planning","volume":"7","author":"Wang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Tanha, S.D.N., Dehkordi, S.F., and Korayem, A.H. (2018, January 23\u201325). Control a mobile robot in Social environments by considering human as a moving obstacle. 
Proceedings of the 2018 6th RSI International Conference on Robotics and Mechatronics (IcRoM), Tehran, Iran.","DOI":"10.1109\/ICRoM.2018.8657641"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Ju\u0159\u00ed\u010dek, M., Par\u00e1k, R., and K\u016fdela, J. (2023). Evolutionary Computation Techniques for Path Planning Problems in Industrial Robotics: A State-of-the-Art Review. Computation, 11.","DOI":"10.3390\/computation11120245"},{"key":"ref_11","unstructured":"Kudela, J., Ju\u0159\u00ed\u010dek, M., and Par\u00e1k, R. (2023). Applications of Evolutionary Computation, Proceedings of the International Conference on the Applications of Evolutionary Computation (Part of EvoStar), Brno, Czech Republic, 12\u201314 April 2023, Springer."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1038\/s42256-022-00579-0","article-title":"A critical problem in benchmarking and analysis of evolutionary computation methods","volume":"4","author":"Kudela","year":"2022","journal-title":"Nat. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Stripinis, L., Kudela, J., and Paulavicius, R. (2024). Benchmarking Derivative-Free Global Optimization Algorithms Under Limited Dimensions and Large Evaluation Budgets. IEEE Trans. Evol. Comput., early access.","DOI":"10.1109\/TEVC.2024.3379756"},{"key":"ref_14","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_15","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1561\/2200000071","article-title":"An introduction to deep reinforcement learning","volume":"11","author":"Henderson","year":"2018","journal-title":"Found. Trends\u00ae Mach. Learn."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, R., Nageotte, F., Zanne, P., de Mathelin, M., and Dresp-Langley, B. (2021). 
Deep reinforcement learning for the control of robotic manipulation: A focussed mini-review. Robotics, 10.","DOI":"10.3390\/robotics10010022"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Han, D., Mulyana, B., Stankovic, V., and Cheng, S. (2023). A survey on deep reinforcement learning algorithms for robotic manipulation. Sensors, 23.","DOI":"10.3390\/s23073762"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"102517","DOI":"10.1016\/j.rcim.2022.102517","article-title":"A review on reinforcement learning for contact-rich robotic manipulation tasks","volume":"81","author":"Chrysostomou","year":"2023","journal-title":"Robot.-Comput.-Integr. Manuf."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.13164\/mendel.2021.1.001","article-title":"Comparison of Multiple Reinforcement Learning and Deep Reinforcement Learning Methods for the Task Aimed at Achieving the Goal","volume":"27","year":"2021","journal-title":"Mendel J. Ser."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1016\/j.promfg.2020.01.030","article-title":"Towards a robot simulation framework for e-waste disassembly using reinforcement learning","volume":"38","author":"Kristensen","year":"2019","journal-title":"Procedia Manuf."},{"key":"ref_22","unstructured":"Plappert, M., Andrychowicz, M., Ray, A., McGrew, B., Baker, B., Powell, G., Schneider, J., Tobin, J., Chociej, M., and Welinder, P. (2018). Multi-goal reinforcement learning: Challenging robotics environments and request for research. arXiv."},{"key":"ref_23","unstructured":"Gallou\u00e9dec, Q., Cazin, N., Dellandr\u00e9a, E., and Chen, L. (2021). panda-gym: Open-source goal-conditioned environments for robotic learning. arXiv."},{"key":"ref_24","unstructured":"Rzayev, A., and Aghaei, V.T. (2022). Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks. 
arXiv."},{"key":"ref_25","unstructured":"Mahmood, A.R., Korenkevych, D., Komer, B.J., and Bergstra, J. (2018, January 1\u20135). Setting up a reinforcement learning task with a real-world robot. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain."},{"key":"ref_26","unstructured":"Franceschetti, A., Tosello, E., Castaman, N., and Ghidoni, S. (2021). Intelligent Autonomous Systems 16, Proceedings of the International Conference on Intelligent Autonomous Systems, Singapore, 22\u201325 June 2021, Springer."},{"key":"ref_27","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_28","unstructured":"Fujimoto, S., Hoof, H., and Meger, D. (2018, January 10\u201315). Addressing function approximation error in actor\u2013critic methods. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_29","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft Actor\u2013Critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_30","unstructured":"Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Pieter Abbeel, O., and Zaremba, W. (2017, January 4\u20139). Hindsight experience replay. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"6289","DOI":"10.1109\/LRA.2021.3092685","article-title":"Learning kinematic feasibility for mobile manipulation through deep reinforcement learning","volume":"6","author":"Honerkamp","year":"2021","journal-title":"IEEE Robot. Autom. 
Lett."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Malik, A., Lischuk, Y., Henderson, T., and Prazenica, R. (2022). A deep reinforcement-learning approach for inverse kinematics solution of a high degree of freedom robotic manipulator. Robotics, 11.","DOI":"10.3390\/robotics11020044"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"5253","DOI":"10.1109\/TII.2021.3125447","article-title":"A general framework of motion planning for redundant robot manipulator based on deep reinforcement learning","volume":"18","author":"Li","year":"2021","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Thumm, J., and Althoff, M. (2022, January 23\u201327). Provably safe deep reinforcement learning for robotic manipulation in human environments. Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA.","DOI":"10.1109\/ICRA46639.2022.9811698"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1007\/s10514-022-10034-z","article-title":"Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning","volume":"46","author":"Shahid","year":"2022","journal-title":"Auton. Robot."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1007\/s10994-021-06116-1","article-title":"Reinforcement learning for robotic manipulation using simulated locomotion demonstrations","volume":"111","author":"Kilinc","year":"2022","journal-title":"Mach. Learn."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2759","DOI":"10.1109\/TIE.2022.3172754","article-title":"Solving robotic manipulation with sparse reward reinforcement learning via graph-based diversity and proximity","volume":"70","author":"Bing","year":"2022","journal-title":"IEEE Trans. Ind. 
Electron."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"4741","DOI":"10.1109\/LRA.2022.3146903","article-title":"Closed-loop dynamic control of a soft manipulator using deep reinforcement learning","volume":"7","author":"Centurelli","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_39","first-page":"10","article-title":"I4C\u2014Robotic cell according to the Industry 4.0 concept","volume":"27","author":"Lacko","year":"2021","journal-title":"Automa"},{"key":"ref_40","unstructured":"ABB Ltd. (2022). ABB IRB 120 Product Manual, ABB Ltd."},{"key":"ref_41","unstructured":"Seiko Epson Corporation (2024). Industrial Robot: SCARA ROBOT LS-B Series MANUAL, Seiko Epson Corporation."},{"key":"ref_42","unstructured":"ABB Ltd. (2022). ABB IRB 14000 Product Manual, ABB Ltd."},{"key":"ref_43","unstructured":"Universal Robots A\/S (2024). User Manual UR3e, Universal Robots A\/S."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1023\/A:1022632907294","article-title":"Q-learning","volume":"8","author":"Dayan","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1145\/203330.203343","article-title":"Temporal difference learning and TD-Gammon","volume":"38","author":"Tesauro","year":"1995","journal-title":"Commun. ACM"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"838","DOI":"10.1137\/0330046","article-title":"Acceleration of stochastic approximation by averaging","volume":"30","author":"Polyak","year":"1992","journal-title":"SIAM J. Control. Optim."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Ericson, C. (2004). 
Real-Time Collision Detection, CRC Press.","DOI":"10.1201\/b14581"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Van Den Bergen, G. (2003). Collision Detection in Interactive 3D Environments, CRC Press.","DOI":"10.1201\/9781482297997"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"984","DOI":"10.1109\/TRO.2011.2148230","article-title":"Solvability-unconcerned inverse kinematics by the Levenberg\u2013Marquardt method","volume":"27","author":"Sugihara","year":"2011","journal-title":"IEEE Trans. Robot."},{"key":"ref_51","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv."},{"key":"ref_52","unstructured":"Coumans, E., and Bai, Y. (2024, January 05). PyBullet, a Python Module for Physics Simulation for Games, Robotics and Machine Learning. Available online: https:\/\/docs.google.com\/document\/d\/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA."},{"key":"ref_53","unstructured":"Li, S., Wu, Y., Cui, X., Dong, H., Fang, F., and Russell, S. (2019, January 27\u2013February 1). Robust multi-agent reinforcement learning via minimax deep deterministic policy gradient. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1016\/j.neucom.2020.05.097","article-title":"A TD3-based multi-agent deep reinforcement learning method in mixed cooperation-competition environment","volume":"411","author":"Zhang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_55","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A method for stochastic optimization. 
Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/12\/6\/116\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:54:04Z","timestamp":1760108044000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/12\/6\/116"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,5]]},"references-count":55,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["computation12060116"],"URL":"https:\/\/doi.org\/10.3390\/computation12060116","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,5]]}}}