{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,27]],"date-time":"2026-04-27T10:50:15Z","timestamp":1777287015048,"version":"3.51.4"},"reference-count":38,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2021,5,19]],"date-time":"2021-05-19T00:00:00Z","timestamp":1621382400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>This paper set out to investigate the usefulness of solving collision avoidance problems with the help of deep reinforcement learning in an unknown environment, especially in compact spaces, such as a narrow corridor. This research aims to determine whether a deep reinforcement learning-based collision avoidance method is superior to the traditional methods, such as potential field-based methods and dynamic window approach. Besides, the proposed obstacle avoidance method was developed as one of the capabilities to enable each robot in a novel robotic system, namely the Self-reconfigurable and Transformable Omni-Directional Robotic Modules (STORM), to navigate intelligently and safely in an unknown environment. A well-conceived hardware and software architecture with features that enable further expansion and parallel development designed for the ongoing STORM projects is also presented in this work. A virtual STORM module with skid-steer kinematics was simulated in Gazebo to reduce the gap between the simulations and the real-world implementations. Moreover, comparisons among multiple training runs of the neural networks with different parameters related to balance the exploitation and exploration during the training process, as well as tests and experiments conducted in both simulation and real-world, are presented in detail. Directions for future research are also provided in the paper.<\/jats:p>","DOI":"10.3390\/robotics10020073","type":"journal-article","created":{"date-parts":[[2021,5,19]],"date-time":"2021-05-19T21:49:21Z","timestamp":1621460961000},"page":"73","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":38,"title":["A Collision Avoidance Method Based on Deep Reinforcement Learning"],"prefix":"10.3390","volume":"10","author":[{"given":"Shumin","family":"Feng","sequence":"first","affiliation":[{"name":"Robotics and Mechatronics Lab, Department of Mechanical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA"}]},{"given":"Bijo","family":"Sebastian","sequence":"additional","affiliation":[{"name":"Torc Robotics, Blacksburg, VA 24060, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9452-482X","authenticated-orcid":false,"given":"Pinhas","family":"Ben-Tzvi","sequence":"additional","affiliation":[{"name":"Robotics and Mechatronics Lab, Department of Mechanical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA"}]}],"member":"1968","published-online":{"date-parts":[[2021,5,19]]},"reference":[{"key":"ref_1","unstructured":"Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., and Graepel, T. (2017). Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1561\/2200000071","article-title":"An Introduction to Deep Reinforcement Learning","volume":"11","author":"Henderson","year":"2018","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Moreira, I., Rivas, J., Cruz, F., Dazeley, R., Ayala, A., and Fernandes, B. (2020). Deep reinforcement learning with interactive feedback in a human-robot environment. Appl. Sci., 10.","DOI":"10.3390\/app10165574"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gu, S., Holly, E., Lillicrap, T., and Levine, S. (June, January 29). Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989385"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1007\/978-3-319-46535-7_21","article-title":"Integration of inertial sensor data into control of the mobile platform","volume":"511","author":"Nemec","year":"2017","journal-title":"Adv. Intell. Syst. Comput."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1177\/1729881417744570","article-title":"Experimental investigations of a highly maneuverable mobile omniwheel robot","volume":"14","author":"Kilin","year":"2017","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3357","DOI":"10.1109\/ICRA.2017.7989381","article-title":"Target-driven visual navigation in indoor scenes using deep reinforcement learning","volume":"Volume 1","author":"Zhu","year":"2017","journal-title":"Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA)"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Tai, L., Paolo, G., and Liu, M. (2017, January 24\u201328). Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8202134"},{"key":"ref_9","first-page":"2505","article-title":"VFH*: Local obstacle avoidance with look-ahead verification","volume":"Volume 3","author":"Ulrich","year":"2000","journal-title":"Proceedings of the Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065)"},{"key":"ref_10","unstructured":"(2017, May 12). Gazebo. Available online: http:\/\/gazebosim.org\/."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kumar, P., Saab, W., and Ben-Tzvi, P. (2017, January 6\u20139). Design of a Multi-Directional Hybrid-Locomotion Modular Robot With Feedforward Stability Control. Proceedings of the Volume 5B: 41st Mechanisms and Robotics Conference, Cleveland, OH, USA.","DOI":"10.1115\/DETC2017-67436"},{"key":"ref_12","first-page":"1","article-title":"A hybrid tracked-wheeled multi-directional mobile robot","volume":"11","author":"Saab","year":"2019","journal-title":"J. Mech. Robot."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Moubarak, P.M., Alvarez, E.J., and Ben-Tzvi, P. (2013, January 17\u201321). Reconfiguring a modular robot into a humanoid formation: A multi-body dynamic perspective on motion scheduling for modules and their assemblies. Proceedings of the 2013 IEEE International Conference on Automation Science and Engineering (CASE), Madison, WI, USA.","DOI":"10.1109\/CoASE.2013.6653891"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sebastian, B., and Ben-Tzvi, P. (2018). Physics Based Path Planning for Autonomous Tracked Vehicle in Challenging Terrain. J. Intell. Robot. Syst. Theory Appl., 1\u201316.","DOI":"10.1007\/s10846-018-0851-3"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"021003","DOI":"10.1115\/1.4042347","article-title":"Active Disturbance Rejection Control for Handling Slip in Tracked Vehicle Locomotion","volume":"11","author":"Sebastian","year":"2018","journal-title":"J. Mech. Robot."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sohal, S.S., Saab, W., and Ben-Tzvi, P. (2018, January 26\u201329). Improved Alignment Estimation for Autonomous Docking of Mobile Robots. Proceedings of the Volume 5A: 42nd Mechanisms and Robotics Conference, Quebec City, QC, Canada.","DOI":"10.1115\/DETC2018-85626"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1109\/TSSC.1968.300136","article-title":"A Formal Basis for the Heuristic Determination of Minimum Cost Paths","volume":"4","author":"Hart","year":"1968","journal-title":"IEEE Trans. Syst. Sci. Cybern."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1145\/359156.359164","article-title":"An Algorithm for Planning Collision-Free Paths Among Polyhedral Obstacles","volume":"22","author":"Wesley","year":"1979","journal-title":"Commun. ACM"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1177\/027836498600500106","article-title":"Real-Time Obstacle Avoidance for Manipulators and Mobile Robots","volume":"5","author":"Khatib","year":"1986","journal-title":"Int. J. Rob. Res."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1109\/ROBOT.1999.770002","article-title":"High-speed navigation using the global dynamic window approach","volume":"Volume 1","author":"Brock","year":"1999","journal-title":"Proceedings of the Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C)"},{"key":"ref_21","first-page":"1398","article-title":"Potential field methods and their inherent limitations for mobile robot navigation","volume":"Volume 11","author":"Koren","year":"2016","journal-title":"Proceedings of the Proceedings. 1991 IEEE International Conference on Robotics and Automation"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1109\/70.88137","article-title":"The vector field histogram-fast obstacle avoidance for mobile robots","volume":"7","author":"Borenstein","year":"1991","journal-title":"IEEE Trans. Robot. Autom."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"37","DOI":"10.5772\/54427","article-title":"Fuzzy logic navigation and obstacle avoidance by a mobile robot in an unknown dynamic environment","volume":"10","author":"Faisal","year":"2013","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.robot.2015.04.007","article-title":"Navigation of multiple mobile robots in a highly clutter terrains using adaptive neuro-fuzzy inference system","volume":"72","author":"Pothal","year":"2015","journal-title":"Rob. Auton. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1007\/BF01386390","article-title":"A note on two problems in connexion with graphs","volume":"1","author":"Dijkstra","year":"1959","journal-title":"Numer. Math."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1145\/3828.3830","article-title":"Generalized Best-First Search Strategies and the Optimality of A","volume":"32","author":"Dechter","year":"1985","journal-title":"J. ACM"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"566","DOI":"10.1109\/70.508439","article-title":"Probabilistic roadmaps for path planning in high-dimensional configuration spaces","volume":"12","author":"Kavraki","year":"1996","journal-title":"IEEE Trans. Robot. Autom."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1109\/ACCESS.2014.2302442","article-title":"Sampling-based robot motion planning: A review","volume":"2","author":"Elbanhawi","year":"2014","journal-title":"IEEE Access"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1572","DOI":"10.1109\/ROBOT.1998.677362","article-title":"VFH+: Reliable obstacle avoidance for fast mobile robots","volume":"2","author":"Ulrich","year":"1998","journal-title":"Proc. IEEE Int. Conf. Robot. Autom."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Haarnoja, T., Pong, V., Zhou, A., Dalal, M., Abbeel, P., and Levine, S. (2018). Composable deep reinforcement learning for robotic manipulation. arXiv.","DOI":"10.1109\/ICRA.2018.8460756"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2124","DOI":"10.1109\/TVT.2018.2890773","article-title":"Autonomous Navigation of UAVs in Large-Scale Complex Environments: A Deep Reinforcement Learning Approach","volume":"68","author":"Wang","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_32","first-page":"1","article-title":"Mobile robot obstacle avoidance base on deep reinforcement learning","volume":"5A-2019","author":"Feng","year":"2019","journal-title":"Proc. ASME Des. Eng. Tech. Conf."},{"key":"ref_33","unstructured":"Sutton, R.S., and Barto, A.G. (1988). Chapter 1 Introduction. Reinf. Learn. An Introd."},{"key":"ref_34","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv."},{"key":"ref_35","first-page":"2094","article-title":"Deep Reinforcement Learning with Double Q-learning","volume":"30","author":"Guez","year":"2016","journal-title":"Assoc. Adv. Artif. Intell."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1115\/1.4034014","article-title":"A Genderless Coupling Mechanism with 6-DOF Misalignment Capability for Modular Self-Reconfigurable Robots","volume":"8","author":"Saab","year":"2016","journal-title":"J. Mech. Robot."},{"key":"ref_37","unstructured":"(2018, February 23). POZYX Positioning System. Available online: https:\/\/www.pozyx.io\/."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Mandow, A., Martinez, J.L., Morales, J., Blanco, J.L., Garcia-Cerezo, A., and Gonzalez, J. (2007). Experimental kinematics for wheeled skid-steer mobile robots. IEEE Int. Conf. Intell. Robot. Syst., 1222\u20131227.","DOI":"10.1109\/IROS.2007.4399139"}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/10\/2\/73\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:03:53Z","timestamp":1760162633000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/10\/2\/73"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,19]]},"references-count":38,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,6]]}},"alternative-id":["robotics10020073"],"URL":"https:\/\/doi.org\/10.3390\/robotics10020073","relation":{},"ISSN":["2218-6581"],"issn-type":[{"value":"2218-6581","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,19]]}}}