{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T06:10:06Z","timestamp":1756793406088,"version":"3.44.0"},"reference-count":37,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T00:00:00Z","timestamp":1756771200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Neurorobot."],"abstract":"<jats:p>Robotic racket sports provide exceptional benchmarks for evaluating dynamic motion control capabilities in robots. Due to the highly non-linear dynamics of the shuttlecock, the stringent demands on robots' dynamic responses, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robot systems. To address these issues, this study proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, achieving faster convergence and twice the landing accuracy. Analysis of the reward function within a specific parameter space hyperplane intuitively reveals the convergence difficulties arising from the inherent sparsity of rewards in racket sports and demonstrates the framework's effectiveness in mitigating local and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained human-robot rallies. 
Cross-platform validation using the UR5 robot demonstrates the framework's generalizability while highlighting the requirement for high dynamic performance of robotic arms in racket sports.<\/jats:p>","DOI":"10.3389\/fnbot.2025.1649870","type":"journal-article","created":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T05:32:44Z","timestamp":1756791164000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Imitation-relaxation reinforcement learning for sparse badminton strikes via dynamic trajectory generation"],"prefix":"10.3389","volume":"19","author":[{"given":"Yanyan","family":"Yuan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yucheng","family":"Tao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shaowen","family":"Cheng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanhong","family":"Liang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yongbin","family":"Jin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongtao","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2025,9,2]]},"reference":[{"key":"B1","first-page":"212","article-title":"\u201ci-Sim2Real: Reinforcement learning of robotic policies in tight human-robot interaction loops,\u201d","volume-title":"Conference on Robot Learning","author":"Abeyruwan","year":"2023"},{"key":"B2","first-page":"1","article-title":"Model-free trajectory-based policy optimization with monotonic improvement","volume":"19","author":"Akrour","year":"2018","journal-title":"J. Mach. Learn. 
Res"},{"key":"B3","doi-asserted-by":"publisher","first-page":"3850","DOI":"10.1109\/TRO.2022.3176207","article-title":"Learning to play table tennis from scratch using muscular robots","volume":"38","author":"B\u00fcchler","year":"2022","journal-title":"IEEE Trans. Robot"},{"key":"B4","first-page":"1262","article-title":"\u201cLearning from suboptimal demonstration via self-supervised reward regression,\u201d","volume-title":"Conference on Robot Learning","author":"Chen","year":"2021"},{"key":"B5","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1051\/epn\/2016301","article-title":"Physics of ball sports","volume":"47","author":"Cohen","year":"2016","journal-title":"Europhys News"},{"key":"B6","doi-asserted-by":"publisher","first-page":"20130497","DOI":"10.1098\/rspa.2013.0497","article-title":"The aerodynamic wall","volume":"470","author":"Cohen","year":"2014","journal-title":"Proc. R. Soc. A: Math. Phys. Eng. Sci"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2023.XIX.006","article-title":"Robotic table tennis: a case study into a high speed learning system","author":"D'Ambrosio","year":"2023","journal-title":"arXiv"},{"key":"B8","article-title":"\u201cAchieving human-level competitive robot table tennis,\u201d","volume-title":"Proceedings of the 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities at the International Conference on Learning Representations (ICLR)","author":"D'Ambrosio","year":"2024"},{"key":"B9","doi-asserted-by":"crossref","first-page":"10780","DOI":"10.1109\/IROS47612.2022.9982205","article-title":"\u201cLearning high speed precision table tennis on a physical robot,\u201d","volume-title":"2022 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Ding","year":"2022"},{"key":"B10","unstructured":"FZMotion Capture System\n          \n          
2025"},{"key":"B11","doi-asserted-by":"crossref","first-page":"5556","DOI":"10.1109\/IROS45743.2020.9341191","article-title":"\u201cRobotic table tennis with model-free reinforcement learning,\u201d","volume-title":"2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Gao","year":"2020"},{"key":"B12","first-page":"1","article-title":"\u201cA model-free approach to stroke learning for robotic table tennis,\u201d","volume-title":"2022 International Joint Conference on Neural Networks (IJCNN)","author":"Gao","year":"2022"},{"key":"B13","doi-asserted-by":"publisher","first-page":"13309","DOI":"10.1007\/s10489-022-04131-w","article-title":"Optimal stroke learning with policy gradient approach for robotic table tennis","volume":"53","author":"Gao","year":"2023","journal-title":"Appl. Intellig"},{"key":"B14","doi-asserted-by":"crossref","first-page":"3612","DOI":"10.1109\/IROS45743.2020.9341796","article-title":"\u201cFast tennis swing motion by ball trajectory prediction and joint trajectory modification in standalone humanoid robot real-time system,\u201d","volume-title":"2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS)","author":"Hattori","year":"2020"},{"key":"B15","doi-asserted-by":"crossref","first-page":"782","DOI":"10.23919\/ACC55779.2023.10156522","article-title":"\u201cDecision making of ball-batting robots based on deep reinforcement learning,\u201d","volume-title":"2023 American Control Conference (ACC)","author":"Hsiao","year":"2023"},{"key":"B16","doi-asserted-by":"crossref","first-page":"650","DOI":"10.1109\/HUMANOIDS.2016.7803343","article-title":"\u201cJointly learning trajectory generation and hitting point prediction in robot table tennis,\u201d","volume-title":"2016 IEEE-RAS 16th International Conference on Humanoid Robots 
(Humanoids)","author":"Huang","year":"2016"},{"key":"B17","doi-asserted-by":"publisher","first-page":"eaau5872","DOI":"10.1126\/scirobotics.aau5872","article-title":"Learning agile and dynamic motor skills for legged robots","volume":"4","author":"Hwangbo","year":"2019","journal-title":"Sci. Robot"},{"key":"B18","doi-asserted-by":"publisher","first-page":"1198","DOI":"10.1038\/s42256-022-00576-3","article-title":"High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning","volume":"4","author":"Jin","year":"2022","journal-title":"Nat. Mach. Intellig"},{"key":"B19","doi-asserted-by":"publisher","first-page":"1727","DOI":"10.1109\/LRA.2018.2803207","article-title":"High-speed and lightweight humanoid robot arm for a skillful badminton robot","volume":"3","author":"Mori","year":"2018","journal-title":"IEEE Robot. Automat. Letters"},{"key":"B20","doi-asserted-by":"publisher","first-page":"3601","DOI":"10.1109\/LRA.2019.2928778","article-title":"High-speed humanoid robot arm for badminton using pneumatic-electric hybrid actuators","volume":"4","author":"Mori","year":"2019","journal-title":"IEEE Robot. Automat. 
Letters"},{"key":"B21","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1109\/ICHR.2010.5686298","article-title":"\u201cLearning table tennis with a mixture of motor primitives,\u201d","volume-title":"2010 10th IEEE-RAS International Conference on Humanoid Robots","author":"Muelling","year":"2010"},{"key":"B22","doi-asserted-by":"crossref","first-page":"5113","DOI":"10.1109\/IROS.2011.6094506","article-title":"\u201cQuadrocopter ball juggling,\u201d","volume-title":"2011 IEEE\/RSJ international conference on Intelligent Robots and Systems","author":"M\u00fcller","year":"2011"},{"key":"B23","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/978-3-642-15193-4_26","article-title":"\u201cSimulating human table tennis with a biomimetic robot setup,\u201d","volume-title":"From Animals to Animats 11: 11th International Conference on Simulation of Adaptive Behavior, SAB 2010","author":"M\u00fclling","year":"2010"},{"key":"B24","doi-asserted-by":"crossref","first-page":"6292","DOI":"10.1109\/ICRA.2018.8463162","article-title":"\u201cOvercoming exploration in reinforcement learning with demonstrations,\u201d","volume-title":"2018 IEEE international conference on robotics and automation (ICRA)","author":"Nair","year":"2018"},{"key":"B25","article-title":"\u201cThe contribution of upper limb joints in the development of racket velocity in the badminton smash,\u201d","volume-title":"23 International Symposium on Biomechanics in Sports","author":"Rambely","year":"2005"},{"key":"B26","volume-title":"Guinness World Records 2015","author":"Records","year":"2014"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1707.06347","article-title":"Proximal policy optimization algorithms","author":"Schulman","year":"2017","journal-title":"arXiv"},{"key":"B28","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1007\/978-3-030-12939-2_3","article-title":"\u201cA table tennis robot system using an industrial kuka robot 
arm,\u201d","volume-title":"Pattern Recognition: 40th German Conference, GCPR 2018","author":"Tebbe","year":"2019"},{"key":"B29","doi-asserted-by":"crossref","first-page":"4171","DOI":"10.1109\/ICRA48506.2021.9560764","article-title":"\u201cSample-efficient reinforcement learning in robotic table tennis,\u201d","volume-title":"2021 IEEE International Conference on Robotics and Automation (ICRA)","author":"Tebbe","year":"2021"},{"key":"B30","unstructured":"Universal Robots\n          \n          2025"},{"key":"B31","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1109\/CMI.2016.7413746","article-title":"\u201cBadminton shuttlecock detection and prediction of trajectory using multiple 2 dimensional scanners,\u201d","volume-title":"2016 IEEE First International Conference on Control, Measurement and Instrumentation (CMI)","author":"Waghmare","year":"2016"},{"key":"B32","author":"Yang","year":"2022"},{"key":"B33","doi-asserted-by":"publisher","first-page":"5307","DOI":"10.1109\/LRA.2023.3293355","article-title":"Bat planner: aggressive flying ball player","volume":"8","author":"Yu","year":"2023","journal-title":"IEEE Robot. Automat. Letters"},{"key":"B34","doi-asserted-by":"publisher","first-page":"3542","DOI":"10.1109\/LRA.2025.3541910","article-title":"Optimal design of high-dynamic robotic arm based on angular momentum maximum","volume":"10","author":"Yuan","year":"2025","journal-title":"IEEE Robot. Automat. Letters"},{"key":"B35","doi-asserted-by":"publisher","first-page":"2245","DOI":"10.1109\/LRA.2023.3249401","article-title":"Athletic mobile manipulator system for robotic wheelchair tennis","volume":"8","author":"Zaidi","year":"2023","journal-title":"IEEE Robot. Automat. Letters"},{"key":"B36","doi-asserted-by":"publisher","first-page":"2208","DOI":"10.1109\/TIM.2014.2386951","article-title":"Optimal state estimation of spinning ping-pong ball using continuous motion model","volume":"64","author":"Zhao","year":"2015","journal-title":"IEEE Trans. 
Instrum. Meas"},{"key":"B37","doi-asserted-by":"publisher","first-page":"1682","DOI":"10.1017\/S0263574721001053","article-title":"A novel method of shuttlecock trajectory tracking and prediction for a badminton robot","volume":"40","author":"Zhi","year":"2022","journal-title":"Robotica"}],"container-title":["Frontiers in Neurorobotics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2025.1649870\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,2]],"date-time":"2025-09-02T05:32:48Z","timestamp":1756791168000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fnbot.2025.1649870\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,2]]},"references-count":37,"alternative-id":["10.3389\/fnbot.2025.1649870"],"URL":"https:\/\/doi.org\/10.3389\/fnbot.2025.1649870","relation":{},"ISSN":["1662-5218"],"issn-type":[{"value":"1662-5218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,2]]},"article-number":"1649870"}}