{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T16:02:19Z","timestamp":1772208139606,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T00:00:00Z","timestamp":1761091200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62171141"],"award-info":[{"award-number":["62171141"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62461160260"],"award-info":[{"award-number":["62461160260"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Science and Technology Development Fund of Macau SAR","award":["0092\/2024\/AFJ"],"award-info":[{"award-number":["0092\/2024\/AFJ"]}]},{"name":"Science and Technology Development Fund of Macau SAR","award":["0075\/2023\/AMJ"],"award-info":[{"award-number":["0075\/2023\/AMJ"]}]},{"name":"Science and Technology Development Fund of Macau SAR","award":["0003\/2023\/RIB1"],"award-info":[{"award-number":["0003\/2023\/RIB1"]}]},{"name":"Science and Technology Development Fund of Macau SAR","award":["001\/2024\/SKL"],"award-info":[{"award-number":["001\/2024\/SKL"]}]},{"name":"Guangdong Science and Technology Department","award":["2024A1515011803"],"award-info":[{"award-number":["2024A1515011803"]}]},{"name":"Guangdong Science and Technology Department","award":["2023A0505030003"],"award-info":[{"award-number":["2023A0505030003"]}]},{"name":"Guangdong Science and Technology Department","award":["2020B1515130001"],"award-info":[{"award-number":["2020B1515130001"]}]},{"DOI":"10.13039\/501100004733","name":"University of Macau","doi-asserted-by":"crossref","award":["MYRG-GRG2023-00237-FST-UMDF"],"award-info":[{"award-number":["MYRG-GRG2023-00237-FST-UMDF"]}],"id":[{"id":"10.13039\/501100004733","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004733","name":"University of Macau","doi-asserted-by":"crossref","award":["MYRG-GRG2024-00299-FST"],"award-info":[{"award-number":["MYRG-GRG2024-00299-FST"]}],"id":[{"id":"10.13039\/501100004733","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Legged robots have great potential in complex environments, but achieving robust and natural locomotion remains difficult due to challenges in generating smooth gaits and resisting disturbances. This article presents a novel reinforcement learning framework that integrates a skeleton-aware graph neural network (GNN), a single-stage teacher\u2013student architecture, a system-response model, and a Wasserstein Adversarial Motion Priors (wAMP) module. The skeleton-aware GNN enriches observations by encoding key node information and link properties, providing structured body information and better spatial awareness on irregular terrains. Unlike conventional two-stage approaches, this method jointly trains teacher and student policies to accelerate learning and improve sim-to-real transfer using hybrid advantage estimation (HAE). The system-response model further enhances robustness by predicting future observations from historical states via contrastive learning, enabling the policy to anticipate terrain variations and external disturbances. Finally, wAMP provides a more stable adversarial imitation method for fitting expert datasets of both flat ground and stair locomotion. Experiments on quadruped robots demonstrate that the proposed approach achieves more natural gaits and stronger robustness than existing baselines.<\/jats:p>","DOI":"10.3390\/sym17111787","type":"journal-article","created":{"date-parts":[[2025,10,23]],"date-time":"2025-10-23T01:14:02Z","timestamp":1761182042000},"page":"1787","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Skeleton Information-Driven Reinforcement Learning Framework for Robust and Natural Motion of Quadruped Robots"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-7852-2444","authenticated-orcid":false,"given":"Huiyang","family":"Cao","sequence":"first","affiliation":[{"name":"School of Computer, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1684-1638","authenticated-orcid":false,"given":"Hongfa","family":"Lei","sequence":"additional","affiliation":[{"name":"School of Automation, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8170-2463","authenticated-orcid":false,"given":"Yangjun","family":"Liu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Internet of Things for Smart City, Centre for Artificial Intelligence and Robotics, Department of Electromechanical Engineering, University of Macau, Macau 999078, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-3519-0008","authenticated-orcid":false,"given":"Zheng","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Automation, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-6425-9937","authenticated-orcid":false,"given":"Shuai","family":"Shi","sequence":"additional","affiliation":[{"name":"School of Computer, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7004-772X","authenticated-orcid":false,"given":"Bingquan","family":"Li","sequence":"additional","affiliation":[{"name":"School of Automation, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6516-0927","authenticated-orcid":false,"given":"Weichao","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Automation, Guangdong University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9151-7758","authenticated-orcid":false,"given":"Zhi-Xin","family":"Yang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Internet of Things for Smart City, Centre for Artificial Intelligence and Robotics, Department of Electromechanical Engineering, University of Macau, Macau 999078, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,22]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4170","DOI":"10.1109\/LRA.2019.2931284","article-title":"Dynamic Locomotion on Slippery Ground","volume":"4","author":"Jenelten","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Bledt, G., Wensing, P.M., Ingersoll, S., and Kim, S. (2018, January 21\u201325). Contact Model Fusion for Event-Based Locomotion in Unstructured Terrains. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8460904"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Bloesch, M., Gehring, C., Fankhauser, P., Hutter, M., Hoepflinger, M.A., and Siegwart, R. (2013, January 3\u20137). State estimation for legged robots on unstable and slippery terrain. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6697236"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gehring, C., Bellicoso, C.D., Coros, S., Bloesch, M., Fankhauser, P., Hutter, M., and Siegwart, R. (October, January 28). Dynamic trotting on slopes for quadrupedal robots. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7354099"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Hartley, R., Mangelson, J., Gan, L., Jadidi, M.G., Walls, J.M., Eustice, R.M., and Grizzle, J.W. (2018). Legged Robot State-Estimation Through Combined Forward Kinematic and Preintegrated Contact Factors. arXiv.","DOI":"10.1109\/ICRA.2018.8460748"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"6166","DOI":"10.1109\/TIE.2024.3488360","article-title":"Slip Detection and Recovery for Quadruped Robots via Orthogonal Decomposition","volume":"72","author":"Yan","year":"2024","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"eaau5872","DOI":"10.1126\/scirobotics.aau5872","article-title":"Learning agile and dynamic motor skills for legged robots","volume":"4","author":"Hwangbo","year":"2019","journal-title":"Sci. Robot."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Haarnoja, T., Ha, S., Zhou, A., Tan, J., Tucker, G., and Levine, S. (2019). Learning to Walk via Deep Reinforcement Learning. arXiv.","DOI":"10.15607\/RSS.2019.XV.011"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kumar, A., Fu, Z., Pathak, D., and Malik, J. (2021). RMA: Rapid Motor Adaptation for Legged Robots. arXiv.","DOI":"10.15607\/RSS.2021.XVII.011"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Margolis, G.B., Yang, G., Paigwar, K., Chen, T., and Agrawal, P. (2022). Rapid Locomotion via Reinforcement Learning. arXiv.","DOI":"10.15607\/RSS.2022.XVIII.022"},{"key":"ref_11","unstructured":"Agarwal, A., Kumar, A., Malik, J., and Pathak, D. (2022). Legged Locomotion in Challenging Terrains using Egocentric Vision. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1553","DOI":"10.1109\/TRO.2025.3539193","article-title":"Fusion-Perception-to-Action Transformer: Enhancing Robotic Manipulation With 3-D Visual Fusion Attention and Proprioception","volume":"41","author":"Liu","year":"2025","journal-title":"IEEE Trans. Robot."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"105174","DOI":"10.1016\/j.robot.2025.105174","article-title":"Robotic manipulation framework based on semantic keypoints for packing shoes of different sizes, shapes, and softness","volume":"194","author":"Dong","year":"2025","journal-title":"Robot. Auton. Syst."},{"key":"ref_14","unstructured":"Sheng, J., Liu, Y., Xu, S., Yang, Z., and Liu, M. (2025). GPA-RAM: Grasp-Pretraining Augmented Robotic Attention Mamba for Spatial Task Learning. arXiv."},{"key":"ref_15","first-page":"1","article-title":"DeepMimic: Example-guided deep reinforcement learning of physics-based character skills","volume":"37","author":"Peng","year":"2018","journal-title":"Acm Trans. Graph."},{"key":"ref_16","unstructured":"Peng, X.B., Coumans, E., Zhang, T., Lee, T.W., Tan, J., and Levine, S. (2020). Learning Agile Robotic Locomotion Skills by Imitating Animals. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"eabc5986","DOI":"10.1126\/scirobotics.abc5986","article-title":"Learning quadrupedal locomotion over challenging terrain","volume":"5","author":"Lee","year":"2020","journal-title":"Sci. Robot."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3450626.3459670","article-title":"AMP: Adversarial motion priors for stylized physics-based character control","volume":"40","author":"Peng","year":"2021","journal-title":"Acm Trans. Graph."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Vollenweider, E., Bjelonic, M., Klemm, V., Rudin, N., Lee, J., and Hutter, M. (2022). Advanced Skills through Multiple Adversarial Motion Priors in Reinforcement Learning. arXiv.","DOI":"10.1109\/ICRA48891.2023.10160751"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Tang, A., Hiraoka, T., Hiraoka, N., Shi, F., Kawaharazuka, K., Kojima, K., Okada, K., and Inaba, M. (2024, January 13\u201317). HumanMimic: Learning Natural Locomotion and Transitions for Humanoid Robot via Wasserstein Adversarial Imitation. Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan.","DOI":"10.1109\/ICRA57147.2024.10610449"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lin, Y.S., Lin, L.S., and Chen, C.C. (2022). An Integrated Framework Based on GAN and RBI for Learning with Insufficient Datasets. Symmetry, 14.","DOI":"10.3390\/sym14020339"},{"key":"ref_22","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017). Wasserstein GAN. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"6768","DOI":"10.1109\/LRA.2025.3572427","article-title":"ALARM: Safe Reinforcement Learning With Reliable Mimicry for Robust Legged Locomotion","volume":"10","author":"Zhou","year":"2025","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_24","unstructured":"Kim, J.T., Park, J., Choi, S., and Ha, S. (2021). Learning Robot Structure and Motion Embeddings using Graph Neural Networks. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Phan Bui, K., Nguyen Truong, G., and Nguyen Ngoc, D. (2022). GCTD3: Modeling of Bipedal Locomotion by Combination of TD3 Algorithms and Graph Convolutional Network. Appl. Sci., 12.","DOI":"10.3390\/app12062948"},{"key":"ref_26","unstructured":"Gallien, T. (2025). Beyond Fixed Morphologies: Learning Graph Policies with Trust Region Compensation in Variable Action Spaces. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, R., Li, F., Liu, S., Li, W., Chen, S., Feng, B., and Jin, D. (2024). Adaptive Multi-Channel Deep Graph Neural Networks. Symmetry, 16.","DOI":"10.3390\/sym16040406"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wan, F., and Li, P. (2024). A Novel Money Laundering Prediction Model Based on a Dynamic Graph Convolutional Neural Network and Long Short-Term Memory. Symmetry, 16.","DOI":"10.3390\/sym16030378"},{"key":"ref_29","unstructured":"Long, J., Wang, Z., Li, Q., Gao, J., Cao, L., and Pang, J. (2024). Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Escontrela, A., Peng, X.B., Yu, W., Zhang, T., Iscen, A., Goldberg, K., and Abbeel, P. (2022). Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions. arXiv.","DOI":"10.1109\/IROS47612.2022.9981973"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"4630","DOI":"10.1109\/LRA.2022.3151396","article-title":"Concurrent Training of a Control Policy and a State Estimator for Dynamic and Robust Legged Locomotion","volume":"7","author":"Ji","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Nahrendra, I.M.A., Yu, B., and Myung, H. (2023). DreamWaQ: Learning Robust Quadrupedal Locomotion With Implicit Terrain Imagination via Deep Reinforcement Learning. arXiv.","DOI":"10.1109\/ICRA48891.2023.10161144"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, H., Luo, H., Zhang, W., and Chen, H. (2024). CTS: Concurrent Teacher\u2013Student Reinforcement Learning for Legged Locomotion. arXiv.","DOI":"10.1109\/LRA.2024.3457379"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, L., Li, R., Wang, X., Gao, W., and Chen, Y. (2025). A Motion Control Strategy for a Blind Hexapod Robot Based on Reinforcement Learning and Central Pattern Generator. Symmetry, 17.","DOI":"10.3390\/sym17071058"},{"key":"ref_35","unstructured":"Rudin, N., Hoeller, D., Reist, P., and Hutter, M. (2022). Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning. arXiv."},{"key":"ref_36","unstructured":"Warrington, A., Lavington, J.W., \u015acibior, A., Schmidt, M., and Wood, F. (2021). Robust Asymmetric Learning in POMDPs. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1021\/i200032a041","article-title":"Internal model control: PID controller design","volume":"25","author":"Rivera","year":"1986","journal-title":"Ind. Eng. Chem. Process Des. Dev."},{"key":"ref_38","unstructured":"Cuturi, M. (2013). Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"4975","DOI":"10.1109\/LRA.2023.3290509","article-title":"Learning Robust and Agile Legged Locomotion Using Adversarial Motion Priors","volume":"8","author":"Wu","year":"2023","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1109\/LRA.2018.2798285","article-title":"Gait and Trajectory Optimization for Legged Systems Through Phase-Based End-Effector Parameterization","volume":"3","author":"Winkler","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_41","unstructured":"(2025, September 09). OCS2: An Open Source Library for Optimal Control of Switched Systems. Available online: https:\/\/github.com\/leggedrobotics\/ocs2."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wu, G., Wang, P., Qiu, B., and Han, Y. (2025). SDA-RRT*Connect: A Path Planning and Trajectory Optimization Method for Robotic Manipulators in Industrial Scenes with Frame Obstacles. Symmetry, 17.","DOI":"10.3390\/sym17010001"},{"key":"ref_43","unstructured":"Margolis, G.B., and Agrawal, P. (2022). Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Huang, H., Cui, W., Zhang, T., Li, S., Han, J., Qin, B., Zhang, T., Zheng, L., Tang, Z., and Hu, C. (2025). Think on your feet: Seamless Transition between Human-like Locomotion in Response to Changing Commands. arXiv.","DOI":"10.1109\/ICRA55743.2025.11127948"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"eabk2822","DOI":"10.1126\/scirobotics.abk2822","article-title":"Learning robust perceptive locomotion for quadrupedal robots in the wild","volume":"7","author":"Miki","year":"2022","journal-title":"Sci. Robot."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Luo, S., Li, S., Yu, R., Wang, Z., Wu, J., and Zhu, Q. (2024). PIE: Parkour with Implicit-Explicit Learning Framework for Legged Robots. arXiv.","DOI":"10.1109\/LRA.2024.3459797"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1787\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T04:26:39Z","timestamp":1761279999000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1787"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,22]]},"references-count":46,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["sym17111787"],"URL":"https:\/\/doi.org\/10.3390\/sym17111787","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,22]]}}}