{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T10:27:37Z","timestamp":1773829657121,"version":"3.50.1"},"reference-count":27,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Neurorobot."],"abstract":"<jats:sec><jats:title>Introduction<\/jats:title><jats:p>Reinforcement learning has been widely used in robot motion planning. However, for multi-step complex tasks of dual-arm robots, the trajectory planning method based on reinforcement learning still has some problems, such as ample exploration space, long training time, and uncontrollable training process. Based on the dual-agent depth deterministic strategy gradient (DADDPG) algorithm, this study proposes a motion planning framework constrained by the human joint angle, simultaneously realizing the humanization of learning content and learning style. It quickly plans the coordinated trajectory of dual-arm for complex multi-step tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>The proposed framework mainly includes two parts: one is the modeling of human joint angle constraints. The joint angle is calculated from the human arm motion data measured by the inertial measurement unit (IMU) by establishing a human-robot dual-arm kinematic mapping model. Then, the joint angle range constraints are extracted from multiple groups of demonstration data and expressed as inequalities. Second, the segmented reward function is designed. The human joint angle constraint guides the exploratory learning process of the reinforcement learning method in the form of step reward. Therefore, the exploration space is reduced, the training speed is accelerated, and the learning process is controllable to a certain extent.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results and discussion<\/jats:title><jats:p>The effectiveness of the framework was verified in the gym simulation environment of the Baxter robot's reach-grasp-align task. The results show that in this framework, human experience knowledge has a significant impact on the guidance of learning, and this method can more quickly plan the coordinated trajectory of dual-arm for multi-step tasks.<\/jats:p><\/jats:sec>","DOI":"10.3389\/fnbot.2024.1362359","type":"journal-article","created":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T04:44:17Z","timestamp":1708577057000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Motion planning framework based on dual-agent DDPG method for dual-arm robots guided by human joint angle constraints"],"prefix":"10.3389","volume":"18","author":[{"given":"Keyao","family":"Liang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fusheng","family":"Zha","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Guo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shengkai","family":"Liu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pengfei","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lining","family":"Sun","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2024,2,22]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"7863","DOI":"10.1109\/TNNLS.2021.3088947","article-title":"Complex robotic manipulation via graph-based hindsight goal generation","volume":"33","author":"Bing","year":"","journal-title":"IEEE Trans. Neural Netw. Learn. Syst"},{"key":"B2","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1109\/MRA.2022.3204237","article-title":"Simulation to real: learning energy-efficient slithering gaits for a snake-like robot","volume":"29","author":"Bing","year":"","journal-title":"IEEE Robot. Automat. Maga"},{"key":"B3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TNNLS.2023.3270298","article-title":"Meta-reinforcement learning in nonstationary and nonparametric environments","volume":"2023","author":"Bing","year":"","journal-title":"IEEE Trans. Neural Netw. Learn. Syst"},{"key":"B4","doi-asserted-by":"publisher","first-page":"3476","DOI":"10.1109\/TPAMI.2022.3185549","article-title":"Meta-reinforcement learning in non-stationary and dynamic environments","volume":"45","author":"Bing","year":"","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell"},{"key":"B5","doi-asserted-by":"publisher","first-page":"eadg7165","DOI":"10.1126\/scirobotics.adg7165","article-title":"Lateral flexion of a compliant spine improves motor performance in a bioinspired mouse robot","volume":"8","author":"Bing","year":"","journal-title":"Sci. Robot"},{"key":"B6","doi-asserted-by":"publisher","first-page":"2759","DOI":"10.1109\/TIE.2022.3172754","article-title":"Solving robotic manipulation with sparse reward reinforcement learning via graph-based diversity and proximity","volume":"70","author":"Bing","year":"","journal-title":"IEEE Trans. Indust. Electr"},{"key":"B7","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1145\/3243064.3243067","article-title":"Combining deep reinforcement learning with prior knowledge and reasoning","volume":"18","author":"Bougie","year":"2018","journal-title":"ACM SIGAPP Appl. Comput. Rev"},{"key":"B8","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1109\/TIV.2022.3153352","article-title":"Path planning based on deep reinforcement learning for autonomous underwater vehicles under ocean current disturbance","volume":"8","author":"Chu","year":"2022","journal-title":"IEEE Trans. Intell. Vehicl"},{"key":"B9","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1115\/1.4011045","article-title":"A kinematic notation for lower-pair mechanisms based on matrices","volume":"22","year":"1955","journal-title":"J. Appl. Mech."},{"key":"B10","doi-asserted-by":"publisher","first-page":"4917","DOI":"10.1109\/LRA.2022.3152974","article-title":"Passive bimanual skills learning from demonstration with motion graph attention networks","volume":"7","author":"Dong","year":"2022","journal-title":"IEEE Robot. Automat. Lett"},{"key":"B11","doi-asserted-by":"crossref","first-page":"1060","DOI":"10.1109\/HUMANOIDS.2015.7363500","article-title":"\u201cEfficient self-collision avoidance based on focus of interest for humanoid robots,\u201d","volume-title":"2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids)","author":"Fang","year":"2015"},{"key":"B12","doi-asserted-by":"crossref","first-page":"3411","DOI":"10.1109\/ICRA.2017.7989388","article-title":"\u201cEfficient kinematic planning for mobile manipulators with non-holonomic constraints using optimal control,\u201d","volume-title":"2017 IEEE International Conference on Robotics and Automation (ICRA)","author":"Giftthaler","year":"2017"},{"key":"B13","doi-asserted-by":"publisher","first-page":"102","DOI":"10.3390\/robotics9040102","article-title":"Human-like arm motion generation: a review","volume":"9","author":"Gulletta","year":"2020","journal-title":"Robotics"},{"key":"B14","doi-asserted-by":"publisher","first-page":"04021087","DOI":"10.1061\/(ASCE)AS.1943-5525.0001335","article-title":"Coordinated control based on reinforcement learning for dual-arm continuum manipulators in space capture missions","volume":"34","author":"Jiang","year":"2021","journal-title":"J. Aerosp. Eng"},{"key":"B15","doi-asserted-by":"crossref","first-page":"3486","DOI":"10.1109\/IROS.2006.282591","article-title":"\u201cHuman-like arm motion generation for humanoid robots using motion capture database,\u201d","volume-title":"2006 IEEE\/RSJ International Conference on Intelligent Robots and Systems","author":"Kim","year":"2006"},{"key":"B16","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1007\/978-981-99-6492-5_39","article-title":"\u201cResearch on target trajectory planning method of humanoid manipulators based on reinforcement learning,\u201d","volume-title":"Intelligent Robotics and Applications","author":"Liang","year":"2023"},{"key":"B17","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/j.neunet.2020.04.007","article-title":"Phase portraits as movement primitives for fast humanoid robot control","volume":"129","author":"Maeda","year":"2020","journal-title":"Neural Netw"},{"key":"B18","doi-asserted-by":"publisher","first-page":"103779","DOI":"10.1016\/j.robot.2021.103779","article-title":"Learning context-adaptive task constraints for robotic manipulation","volume":"141","year":"2021","journal-title":"Rob. Auton. Syst"},{"key":"B19","doi-asserted-by":"publisher","first-page":"103515","DOI":"10.1016\/j.engappai.2020.103515","article-title":"Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards","volume":"90","year":"2020","journal-title":"Eng. Appl. Artif. Intell"},{"key":"B20","doi-asserted-by":"publisher","first-page":"2265","DOI":"10.1109\/TIE.2014.2353017","article-title":"Human-like motion generation and control for humanoid's dual arm object manipulation","volume":"62","year":"2014","journal-title":"IEEE Trans. Ind. Electron"},{"key":"B21","doi-asserted-by":"crossref","first-page":"5655","DOI":"10.1109\/ICRA.2015.7139991","article-title":"\u201cUsing synergies in dual-arm manipulation tasks,\u201d","volume-title":"2015 IEEE International Conference on Robotics and Automation (ICRA)","author":"Su\u00e1rez","year":"2015"},{"key":"B22","doi-asserted-by":"publisher","first-page":"564","DOI":"10.3390\/mi13040564","article-title":"Dual-arm robot trajectory planning based on deep reinforcement learning under complex environment","volume":"13","author":"Tang","year":"2022","journal-title":"Micromachines (Basel)"},{"key":"B23","doi-asserted-by":"publisher","first-page":"617","DOI":"10.5555\/2031678.2031705","article-title":"\u201cIntegrating reinforcement learning with human demonstrations of varying ability,\u201d","author":"Taylor","year":"2011","journal-title":"The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 2"},{"key":"B24","doi-asserted-by":"publisher","first-page":"6357","DOI":"10.1109\/TITS.2021.3055899","article-title":"Learning to drive like human beings: a method based on deep reinforcement learning","volume":"23","author":"Tian","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst"},{"key":"B25","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1109\/MRA.2012.2192171","article-title":"Simultaneous grasp and motion planning: humanoid robot armar-iii","volume":"19","author":"Vahrenkamp","year":"2012","journal-title":"IEEE Robot. Autom. Mag"},{"key":"B26","doi-asserted-by":"publisher","first-page":"8455","DOI":"10.1109\/LRA.2022.3183786","article-title":"Assembly-oriented task sequence planning for a dual-arm robot","volume":"7","author":"Wang","year":"2022","journal-title":"IEEE Robot. and Automat. Lett"},{"key":"B27","doi-asserted-by":"publisher","first-page":"1056","DOI":"10.1109\/TCYB.2019.2949596","article-title":"Task-oriented deep reinforcement learning for robotic skill acquisition and control","volume":"51","year":"2019","journal-title":"IEEE Trans. Cybern"}],"container-title":["Frontiers in Neurorobotics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2024.1362359\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T04:44:21Z","timestamp":1708577061000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2024.1362359\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,22]]},"references-count":27,"alternative-id":["10.3389\/fnbot.2024.1362359"],"URL":"https:\/\/doi.org\/10.3389\/fnbot.2024.1362359","relation":{},"ISSN":["1662-5218"],"issn-type":[{"value":"1662-5218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,22]]},"article-number":"1362359"}}