{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T13:52:54Z","timestamp":1774965174343,"version":"3.50.1"},"reference-count":29,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T00:00:00Z","timestamp":1766016000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Robot. AI"],"abstract":"<jats:p>Navigating in unknown environments without prior maps poses a significant challenge for mobile robots due to sparse rewards, dynamic obstacles, and limited prior knowledge. This paper presents an Improved Deep Reinforcement Learning (DRL) framework based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm for adaptive mapless navigation. In addition to architectural enhancements, the proposed method offers theoretical benefits by incorporating a latent-state encoder and predictor module to transform high-dimensional sensor inputs into compact embeddings. This compact representation reduces the effective dimensionality of the state space, enabling smoother value-function approximation and mitigating overestimation errors common in actor\u2013critic methods. It uses intrinsic rewards derived from prediction error in the latent space to promote exploration of novel states. The intrinsic reward encourages the agent to prioritize uncertain yet informative regions, improving exploration efficiency under sparse extrinsic reward signals and accelerating convergence. Furthermore, training stability is achieved through regularization of the latent space via maximum mean discrepancy (MMD) loss. By enforcing consistent latent dynamics, the MMD constraint reduces variance in target value estimation and results in more stable policy updates. 
Experimental results in simulated ROS2\/Gazebo environments demonstrate that the proposed framework outperforms standard TD3 and other improved TD3 variants. Our model achieves a 93.1% success rate and a low 6.8% collision rate, reflecting efficient and safe goal-directed navigation. These findings confirm that combining intrinsic motivation, structured representation learning, and regularization-based stabilization produces more robust and generalizable policies for mapless mobile robot navigation.<\/jats:p>","DOI":"10.3389\/frobt.2025.1625968","type":"journal-article","created":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T06:45:31Z","timestamp":1766040331000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Adaptive mapless mobile robot navigation using deep reinforcement learning based improved TD3 algorithm"],"prefix":"10.3389","volume":"12","author":[{"given":"Shoaib Mohd","family":"Nasti","sequence":"first","affiliation":[]},{"given":"Zahoor Ahmad","family":"Najar","sequence":"additional","affiliation":[]},{"given":"Mohammad Ahsan","family":"Chishti","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,12,18]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1109\/100.580977","article-title":"The dynamic window approach to collision avoidance","volume":"4","author":"Fox","year":"1997","journal-title":"IEEE Robotics and Automation"},{"key":"B2","first-page":"1587","article-title":"Addressing function approximation error in actor-critic methods","volume-title":"Proceedings of the 35th international conference on machine learning (ICML)","author":"Fujimoto","year":"2018"},{"key":"B3","first-page":"723","article-title":"A kernel two-sample test","volume":"13","author":"Gretton","year":"2012","journal-title":"J. Mach. Learn. 
Res."},{"key":"B4","article-title":"Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor","volume-title":"Proceedings of the international conference on machine learning (ICML)","author":"Haarnoja","year":"2018"},{"key":"B5","doi-asserted-by":"publisher","first-page":"2525","DOI":"10.3390\/s24082525","article-title":"Inspection robot navigation based on improved td3 algorithm","volume":"24","author":"Huang","year":"2024","journal-title":"Sensors"},{"key":"B6","doi-asserted-by":"publisher","first-page":"8651","DOI":"10.3390\/s23208651","article-title":"End-to-end autonomous navigation based on deep reinforcement learning with a survival penalty function","volume":"23","author":"Jeng","year":"2023","journal-title":"Sensors"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.1007\/s41870-025-02500-5","article-title":"Autonomous navigation of ros2 based turtlebot3 in static and dynamic environments using intelligent approach","author":"Kashyap","year":"2025","journal-title":"Int. J. Inf. Technol."},{"key":"B8","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1177\/027836498600500106","article-title":"Real-time obstacle avoidance for manipulators and mobile robots","volume":"5","author":"Khatib","year":"1986","journal-title":"Int. J. Robotics Res."},{"key":"B9","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement learning in robotics: a survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robotics Res."},{"key":"B10","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1007\/s10846-023-01888-1","article-title":"An efficient deep reinforcement learning algorithm for mapless navigation with gap-guided switching strategy","volume":"108","author":"Li","year":"2023","journal-title":"J. 
Intelligent and Robotic Syst."},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1509.02971","article-title":"Continuous control with deep reinforcement learning","author":"Lillicrap","year":"2015","journal-title":"arXiv Preprint arXiv:1509.02971"},{"key":"B12","article-title":"Continuous control with deep reinforcement learning","author":"Lillicrap","year":"2016","journal-title":"4th international conference on learning representations (ICLR)arXiv:1509.02971"},{"key":"B13","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1109\/cisce62493.2024.10653233","article-title":"Td3 based collision free motion planning for robot navigation","author":"Liu","year":"2024","journal-title":"arXiv Preprint arXiv:2405.15460"},{"key":"B14","article-title":"Learning to navigate in complex environments","volume-title":"International conference on learning representations","author":"Mirowski","year":"2017"},{"key":"B15","article-title":"Asynchronous methods for deep reinforcement learning","volume-title":"Proceedings of the international conference on machine learning (ICML)","author":"Mnih","year":"2016"},{"key":"B16","doi-asserted-by":"publisher","first-page":"102874","DOI":"10.1016\/j.rineng.2024.102874","article-title":"Optimized td3 algorithm for robust autonomous navigation in crowded and dynamic human-interaction environments","volume":"24","author":"Neamah","year":"2024","journal-title":"Results Eng."},{"key":"B17","doi-asserted-by":"crossref","DOI":"10.1109\/CVPRW.2017.70","article-title":"Curiosity-driven exploration by self-supervised prediction","volume-title":"Proceedings of the international conference on machine learning (ICML)","author":"Pathak","year":"2017"},{"key":"B18","doi-asserted-by":"publisher","first-page":"22852","DOI":"10.1038\/s41598-024-72857-3","article-title":"Intelligent mobile robot navigation in unknown and complex environment using reinforcement learning 
technique","volume":"14","author":"Raj","year":"2024","journal-title":"Sci. Rep."},{"key":"B19","doi-asserted-by":"publisher","first-page":"2119","DOI":"10.22214\/ijraset.2025.73330","article-title":"Deep reinforcement learning with PPO for autonomous Mobile robot navigation using ROS 2 framework","volume":"13","author":"Rana","year":"2025","journal-title":"Int. J. Res. Appl. Sci. and Eng. Technol."},{"key":"B20","doi-asserted-by":"publisher","first-page":"3462","DOI":"10.1038\/s41598-022-07264-7","article-title":"A target-driven visual navigation method based on intrinsic motivation exploration and space topological cognition","volume":"12","author":"Ruan","year":"2022","journal-title":"Sci. Rep."},{"key":"B21","doi-asserted-by":"publisher","first-page":"06347","DOI":"10.48550\/arXiv.1707.06347","article-title":"Proximal policy optimization algorithms","author":"Schulman","year":"2017","journal-title":"CoRR"},{"key":"B22","doi-asserted-by":"crossref","DOI":"10.1109\/IROS.2017.8202134","article-title":"Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation","author":"Tai","year":"2017"},{"key":"B23","doi-asserted-by":"publisher","first-page":"03539v1","DOI":"10.48550\/arXiv.2408.03539","article-title":"Deep reinforcement learning for robotics: a survey of real-world successes","author":"Tang","year":"2024","journal-title":"arXiv Preprint arXiv:2408"},{"key":"B24","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1109\/icstcc59206.2023.10308456","article-title":"Deep reinforcement learning for mapless navigation of autonomous mobile robot","author":"Yadav","year":"2023","journal-title":"Int. J. Comput. Sci. Trends Comput. Commun. 
(IJCSTCC)"},{"key":"B25","doi-asserted-by":"publisher","first-page":"17298806241292893","DOI":"10.1177\/17298806241292893","article-title":"Mobile robot navigation based on intrinsic reward mechanism with td3 algorithm","volume":"21","author":"Yang","year":"2024","journal-title":"Int. J. Adv. Robotic Syst."},{"key":"B26","doi-asserted-by":"publisher","first-page":"9802","DOI":"10.3390\/s23249802","article-title":"Path planning of a mobile robot for a dynamic indoor environment based on an sac-lstm algorithm","volume":"23","author":"Zhang","year":"2023","journal-title":"Sensors"},{"key":"B27","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1109\/tmm.2021.3070138","article-title":"Deep-irtarget: an automatic target detector in infrared imagery using dual-domain feature extraction and allocation","volume":"24","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Multimedia"},{"key":"B28","doi-asserted-by":"publisher","first-page":"6735","DOI":"10.1109\/tcsvt.2023.3289142","article-title":"Differential feature awareness network within antagonistic learning for infrared-visible object detection","volume":"34","author":"Zhang","year":"2024","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"B29","doi-asserted-by":"publisher","first-page":"3730","DOI":"10.1109\/tnnls.2023.3347633","article-title":"Cognition-driven structural prior for instance-dependent label transition matrix estimation","volume":"36","author":"Zhang","year":"2025","journal-title":"IEEE Trans. Neural Netw. Learn. 
Syst."}],"container-title":["Frontiers in Robotics and AI"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2025.1625968\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T06:45:36Z","timestamp":1766040336000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2025.1625968\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,18]]},"references-count":29,"alternative-id":["10.3389\/frobt.2025.1625968"],"URL":"https:\/\/doi.org\/10.3389\/frobt.2025.1625968","relation":{},"ISSN":["2296-9144"],"issn-type":[{"value":"2296-9144","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,18]]},"article-number":"1625968"}}