{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T15:05:57Z","timestamp":1777129557962,"version":"3.51.4"},"reference-count":28,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T00:00:00Z","timestamp":1758844800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.mdpi.com"],"crossmark-restriction":true},"short-container-title":["Future Internet"],"abstract":"<jats:p>This article presents a deep reinforcement learning (DRL) approach for adaptive robotic grasping in dynamic environments. We developed UR5GraspingEnv, a PyBullet-based simulation environment integrated with OpenAI Gym, to train a UR5 robotic arm with a Robotiq 2F-85 gripper. Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) were implemented to learn robust grasping policies for randomly positioned objects. A tailored reward function, combining distance penalties with grasp and pose rewards, optimizes grasping and post-grasping tasks, enhanced by domain randomization. SAC achieves an 87% grasp success rate and 75% post-grasp success, outperforming PPO (82% and 68%, respectively), with stable convergence over 100,000 timesteps. The system addresses post-grasping manipulation and sim-to-real transfer challenges, advancing industrial and assistive applications. Results demonstrate the feasibility of learning stable and goal-driven policies for single-arm robotic manipulation using minimal supervision. Both PPO and SAC yield competitive performance, with SAC exhibiting superior adaptability in cluttered or edge cases. 
These findings suggest that DRL, when carefully designed and monitored, can support scalable learning in manipulation tasks.<\/jats:p>","DOI":"10.3390\/fi17100437","type":"journal-article","created":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T10:39:37Z","timestamp":1758883177000},"page":"437","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Deep Reinforcement Learning for Adaptive Robotic Grasping and Post-Grasp Manipulation in Simulated Dynamic Environments"],"prefix":"10.3390","volume":"17","author":[{"given":"Henrique C.","family":"Ferreira","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering, Institute of Engineering\u2014Polytechnic of Porto (ISEP\/IPP), 4249-015 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7410-8872","authenticated-orcid":false,"given":"Ramiro S.","family":"Barbosa","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, Institute of Engineering\u2014Polytechnic of Porto (ISEP\/IPP), 4249-015 Porto, Portugal"},{"name":"GECAD\u2014Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, ISEP\/IPP, 4249-015 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,26]]},"reference":[{"key":"ref_1","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1109\/TRO.2013.2289018","article-title":"Data-Driven Grasp Synthesis\u2014A Survey","volume":"30","author":"Bohg","year":"2014","journal-title":"IEEE Trans. Robot."},{"key":"ref_3","unstructured":"Ferrari, C., and Canny, J. (1992, January 12\u201314). Planning Optimal Grasps. 
Proceedings of the IEEE International Conference on Robotics and Automation, Nice, France."},{"key":"ref_4","first-page":"95","article-title":"Semantic Grasping in Cluttered Environments","volume":"36","author":"Dang","year":"2014","journal-title":"Auton. Robot."},{"key":"ref_5","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-Level Control through Deep Reinforcement Learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1177\/0278364917710318","article-title":"Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection","volume":"37","author":"Levine","year":"2017","journal-title":"Int. J. Robot. Res."},{"key":"ref_8","unstructured":"Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Powell, G., Ray, A., Schneider, J., and Sidor, S. (2019). Solving Rubik\u2019s Cube with a Robot Hand. arXiv."},{"key":"ref_9","unstructured":"Coumans, E., and Bai, Y. (2025, May 03). PyBullet Physics Simulation. 2016\u20132025. Available online: https:\/\/pybullet.org."},{"key":"ref_10","unstructured":"James, S., and Davison, A. (November, January 30). Sim-to-Real Robot Grasping via Domain Randomization. Proceedings of the CoRL 2019, Osaka, Japan."},{"key":"ref_11","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv."},{"key":"ref_12","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., and Abbeel, P. 
(2017, January 24\u201328). Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World. Proceedings of the IROS 2017, Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8202133"},{"key":"ref_14","unstructured":"Kalashnikov, D., Irpan, A., Pastor, P., Ibarz, J., Herzog, A., Jang, E., Quillen, D., Holly, E., Kalakrishnan, M., and Vanhoucke, V. (2018). QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Mahler, J., Liang, J., Niyaz, S., Laskey, M., Doan, R., Liu, X., Aparicio, J., and Goldberg, K. (2017, January 12\u201316). Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics. Proceedings of the RSS 2017, Cambridge, MA, USA.","DOI":"10.15607\/RSS.2017.XIII.058"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Rajeswaran, A., Kumar, V., Gupta, A., Schulman, J., Todorov, E., and Levine, S. (2017). Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations. arXiv.","DOI":"10.15607\/RSS.2018.XIV.049"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0278364919887447","article-title":"Learning Dexterous In-Hand Manipulation","volume":"39","author":"Andrychowicz","year":"2020","journal-title":"Int. J. Robot. Res."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement Learning in Robotics: A Survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_19","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2016, January 2\u20134). Continuous Control with Deep Reinforcement Learning. Proceedings of the ICLR 2016, San Juan, Puerto Rico."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pinto, L., and Gupta, A. 
(2016, January 16\u201321). Supersizing Self-Supervision: Learning to Grasp from Fifty Thousand Tries and Seven Hundred Robot Hours. Proceedings of the ICRA 2016, Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487517"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Han, D., Zhao, L., Xu, K., Chen, Y., and Li, W. (2023). A Survey on Deep Reinforcement Learning Algorithms for Robotic Manipulation. Sensors, 23.","DOI":"10.3390\/s23073762"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1007\/s10462-025-11262-2","article-title":"An overview of learning-based dexterous grasping: Recent advances and future directions","volume":"58","author":"Song","year":"2025","journal-title":"Artif. Intell. Rev."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Tang, C., Abbatematteo, B., Hu, J., Chandra, R., Mart\u00edn-Mart\u00edn, R., and Stone, P. (2024). Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes. arXiv.","DOI":"10.1146\/annurev-control-030323-022510"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Odeyemi, J., Ogbeyemi, A., Wong, K., and Zhang, W. (2024). On Automated Object Grasping for Intelligent Prosthetic Hands Using Machine Learning. Bioengineering, 11.","DOI":"10.3390\/bioengineering11020108"},{"key":"ref_25","first-page":"1","article-title":"Stable-Baselines3: Reliable Reinforcement Learning Implementations","volume":"22","author":"Raffin","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_26","unstructured":"Team, P. (2025, June 25). PyTorch Documentation. Available online: https:\/\/pytorch.org\/docs\/stable\/index.html."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Sun, Z., Yang, G.S., Zhang, B., and Zhang, W. (2011, January 21\u201323). On the concept of the resilient machine. 
Proceedings of the 2011 6th IEEE Conference on Industrial Electronics and Applications (ICIEA), Beijing, China.","DOI":"10.1109\/ICIEA.2011.5975608"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Andrychowicz, M., Baker, B., Chociej, M., Jozefowicz, R., McGrew, B., Pachocki, J., Petron, A., Plappert, M., Powell, G., and Ray, A. (2018). Learning Dexterous In-Hand Manipulation. arXiv.","DOI":"10.1177\/0278364919887447"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/10\/437\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T10:43:57Z","timestamp":1758883437000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/10\/437"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,26]]},"references-count":28,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["fi17100437"],"URL":"https:\/\/doi.org\/10.3390\/fi17100437","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,26]]}}}