{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T16:02:49Z","timestamp":1774454569384,"version":"3.50.1"},"reference-count":23,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2024,10,12]],"date-time":"2024-10-12T00:00:00Z","timestamp":1728691200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["52175013"],"award-info":[{"award-number":["52175013"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering"],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:p>When simultaneously addressing the challenges of dynamic target tracking and obstacle avoidance for robots, conventional control and control only based on reinforcement learning cannot deal with the complex scenarios effectively. The purpose of this study is to design a robot control algorithm that combines deep reinforcement learning (Soft Actor-Critic, SAC) with PID to achieve real-time tracking of a moving object and effectively avoid single or multiple obstacles. The control of the robot is divided into two key components: initially, the first joint of the 6-degree-of-freedom robot is controlled by PID algorithm, which makes the working plane (the plane coincident with the axis of the first joint and parallel to the linkage) quickly approach the target until it overlaps. 
Subsequently, the task of reinforcement learning is simplified to control the planar robot to track the target projection in working plane while avoiding the obstacle projection, ultimately achieve target tracking and obstacle avoidance in 3D space. The simulation and experiment results show that the proposed method has good efficiency and convergence speed. The SAC-PID strategy effectively controls the Universal-Robots UR5 to complete dynamic target tracking while accomplishing obstacle avoidance in both virtual and real-world environments.<\/jats:p>","DOI":"10.1177\/09596518241274000","type":"journal-article","created":{"date-parts":[[2024,10,12]],"date-time":"2024-10-12T12:05:13Z","timestamp":1728734713000},"page":"550-561","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["Robot target tracking control considering obstacle avoidance based on combination of deep reinforcement learning and PID"],"prefix":"10.1177","volume":"239","author":[{"given":"Yong","family":"Liu","sequence":"first","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, China"}]},{"given":"Xiao","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, China"}]},{"given":"Xiang","family":"Li","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1244-8019","authenticated-orcid":false,"given":"Boxi","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, China"}]},{"given":"Sen","family":"Qian","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, 
China"}]},{"given":"Yihao","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Hefei University of Technology, Hefei, Anhui Province, China"}]}],"member":"179","published-online":{"date-parts":[[2024,10,12]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.3390\/s18020571"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.3390\/app10030935"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2017.2773669"},{"key":"e_1_3_2_5_2","volume-title":"Reinforcement learning: an Introduction","author":"Sutton RS","year":"2018","unstructured":"Sutton RS, Barto AG (eds). Reinforcement learning: an Introduction. 2nd ed. Cambridge, MA: The MIT Press, 2018.","edition":"2"},{"key":"e_1_3_2_6_2","first-page":"435","article-title":"Analysis of space manipulator route planning based on Sarsa (\u03bb) reinforcement learning","volume":"40","author":"Xu W","year":"2019","unstructured":"Xu W, Lu S. Analysis of space manipulator route planning based on Sarsa (\u03bb) reinforcement learning. J Astronaut 2019; 40: 435\u2013443.","journal-title":"J Astronaut"},{"key":"e_1_3_2_7_2","first-page":"1050","volume-title":"2017 IEEE international conference on computer vision workshops","author":"Chen J","unstructured":"Chen J, Bai T, Huang X, et al. Double-task deep Q-learning with multiple views. In: 2017 IEEE international conference on computer vision workshops, 22\u201329 October 2017, pp.1050\u20131058. New York: IEEE."},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-022-07900-2"},{"key":"e_1_3_2_9_2","first-page":"2063","volume-title":"European control conference (ECC)","author":"Sangiovanni B","unstructured":"Sangiovanni B, Rendiniello A, Incremona GP, et al. Deep reinforcement learning for collision avoidance of robotic manipulators. In: European control conference (ECC), Limassol, Cyprus, 12\u201315 June 2018, pp.2063\u20132068. 
New York: IEEE."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-27645-3_18"},{"key":"e_1_3_2_11_2","unstructured":"Degris T White M Sutton RS. Off-policy actor-critic https:\/\/arxiv.org\/abs\/1205.4839 (2012 accessed 20 June 2013)."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.23919\/ICCAS52745.2021.9649802"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2021.01.077"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11370-021-00387-2"},{"key":"e_1_3_2_15_2","first-page":"15","volume-title":"35th International conference on machine learning (ICML)","author":"Haarnoja T","unstructured":"Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: 35th International conference on machine learning (ICML), Stockholm, Sweden, 10\u201315 July 2018, p.15. San Diego: Jmlr-Journal Machine Learning Research."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1177\/17298814211007305"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1177\/0278364920987859"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2021.3125447"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-021-00366-1"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.3390\/aerospace9030163"},{"key":"e_1_3_2_21_2","unstructured":"Lillicrap TP Hunt JJ Pritzel A et al. Continuous control with deep reinforcement learning https:\/\/arxiv.org\/abs\/1509.02971 (2015 accessed 5 July 2019)."},{"key":"e_1_3_2_22_2","first-page":"41","article-title":"Research on robot dynamic target tracking and obstacle avoidance control based on DDPG-PID","volume":"54","author":"Liu Y","year":"2022","unstructured":"Liu Y, Li X, Jiang P, et al. Research on robot dynamic target tracking and obstacle avoidance control based on DDPG-PID. 
J Nanjing Univ Aeronaut Astronaut 2022; 54: 41\u201350.","journal-title":"J Nanjing Univ Aeronaut Astronaut"},{"key":"e_1_3_2_23_2","volume-title":"35th International conference on machine learning (ICML)","author":"Fujimoto S","unstructured":"Fujimoto S, van Hoof H, Meger D. Addressing function approximation error in actor-critic methods. In: 35th International conference on machine learning (ICML), Stockholm, Sweden, 10\u201315 July 2018. San Diego: Journal Machine Learning Research."},{"key":"e_1_3_2_24_2","unstructured":"Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms, https:\/\/arxiv.org\/abs\/1707.06347 (2017, accessed 28 August 2017)."}],"container-title":["Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/09596518241274000","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/09596518241274000","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/09596518241274000","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,20]],"date-time":"2025-05-20T13:45:24Z","timestamp":1747748724000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/09596518241274000"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,12]]},"references-count":23,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["10.1177\/09596518241274000"],"URL":"https:\/\/doi.org\/10.1177\/09596518241274000","relation":{},"ISSN":["0959-6518","2041-3041"],"issn-type":[{"value":"0959-6518","type":"print"},{"value":"2041-3041","type":"electronic"}
],"subject":[],"published":{"date-parts":[[2024,10,12]]}}}