{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T10:03:28Z","timestamp":1777716208470,"version":"3.51.4"},"reference-count":396,"publisher":"SAGE Publications","issue":"8","license":[{"start":{"date-parts":[[2025,1,22]],"date-time":"2025-01-22T00:00:00Z","timestamp":1737504000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"},{"start":{"date-parts":[[2025,1,22]],"date-time":"2025-01-22T00:00:00Z","timestamp":1737504000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"name":"Korea Evaluation Institute of Industrial Technology (KEIT) funded by the Korea Governmen","award":["Grant No.20018216"],"award-info":[{"award-number":["Grant No.20018216"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Legged locomotion holds the premise of universal mobility, a critical capability for many real-world robotic applications. Both model-based and learning-based approaches have advanced the field of legged locomotion in the past three decades. In recent years, however, a number of factors have dramatically accelerated progress in learning-based methods, including the rise of deep learning, rapid progress in simulating robotic systems, and the availability of high-performance and affordable hardware. This article aims to give a brief history of the field, to summarize recent efforts in learning locomotion skills for quadrupeds, and to provide researchers new to the area with an understanding of the key issues involved. With the recent proliferation of humanoid robots, we further outline the rapid rise of analogous methods for bipedal locomotion. 
We conclude with a discussion of open problems as well as related societal impact.<\/jats:p>","DOI":"10.1177\/02783649241312698","type":"journal-article","created":{"date-parts":[[2025,1,23]],"date-time":"2025-01-23T02:14:47Z","timestamp":1737598487000},"page":"1396-1427","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":35,"title":["Learning-based legged locomotion: State of the art and future perspectives"],"prefix":"10.1177","volume":"44","author":[{"given":"Sehoon","family":"Ha","sequence":"first","affiliation":[]},{"given":"Joonho","family":"Lee","sequence":"additional","affiliation":[{"name":"Neuromeka Co., Ltd, Seoul, Korea"}]},{"given":"Michiel","family":"van de Panne","sequence":"additional","affiliation":[]},{"given":"Zhaoming","family":"Xie","sequence":"additional","affiliation":[{"name":"The AI Institute, Cambridge, USA"}]},{"given":"Wenhao","family":"Yu","sequence":"additional","affiliation":[{"name":"Google DeepMind, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9889-6543","authenticated-orcid":false,"given":"Majid","family":"Khadiv","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Technical University of Munich (TUM), Munich, Germany"}]}],"member":"179","published-online":{"date-parts":[[2025,1,22]]},"reference":[{"key":"e_1_3_3_2_1","volume-title":"Mechanical Design for Robot Locomotion","author":"Abate AM","year":"2018","unstructured":"Abate AM (2018) Mechanical Design for Robot Locomotion. Graduate Thesis."},{"key":"e_1_3_3_3_1","unstructured":"Abdolmaleki A Springenberg JT Tassa Y et al. (2018) Maximum a posteriori policy optimisation. arXiv preprint arXiv:1806.06920."},{"key":"e_1_3_3_4_1","volume-title":"RML@ ICLR","author":"Abel D","year":"2019","unstructured":"Abel D (2019) simple_rl: reproducible reinforcement learning in python. In: RML@ ICLR. 
New Orleans, LA."},{"issue":"3","key":"e_1_3_3_5_1","first-page":"2531","article-title":"Simultaneous contact, gait, and motion planning for robust multilegged locomotion via mixed-integer convex optimization","volume":"3","author":"Aceituno-Cabezas B","year":"2017","unstructured":"Aceituno-Cabezas B, Mastalli C, Dai H, et al. (2017) Simultaneous contact, gait, and motion planning for robust multilegged locomotion via mixed-integer convex optimization. IEEE Robotics and Automation Letters 3(3): 2531\u20132538.","journal-title":"IEEE Robotics and Automation Letters"},{"key":"e_1_3_3_6_1","first-page":"22","volume-title":"International Conference on Machine Learning","author":"Achiam J","year":"2017","unstructured":"Achiam J, Held D, Tamar A, et al. (2017) Constrained policy optimization. In: International Conference on Machine Learning. PMLR, 22\u201331."},{"key":"e_1_3_3_7_1","unstructured":"Achiam J Adler S Agarwal S et al. (2023) Gpt-4 technical report. arXiv preprint arXiv:2303.08774."},{"key":"e_1_3_3_8_1","first-page":"403","volume-title":"Conference on Robot Learning","author":"Agarwal A","year":"2023","unstructured":"Agarwal A, Kumar A, Malik J, et al. (2023) Legged locomotion in challenging terrains using egocentric vision. In: Conference on Robot Learning. PMLR, 403\u2013415."},{"key":"e_1_3_3_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids43949.2019.9035023"},{"key":"e_1_3_3_10_1","unstructured":"Akkaya I Andrychowicz M Chociej M et al. (2019) Solving rubik\u2019s cube with a robot hand. arXiv preprint arXiv:1910.07113."},{"key":"e_1_3_3_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9635840"},{"key":"e_1_3_3_12_1","doi-asserted-by":"crossref","unstructured":"Arm P Mittal M Kolvenbach H et al. (2024) Pedipulate: enabling manipulation skills using a quadruped robot\u2019s leg. 
arXiv preprint arXiv:2402.10837.","DOI":"10.1109\/ICRA57147.2024.10611307"},{"key":"e_1_3_3_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-74666-1_17"},{"key":"e_1_3_3_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-009-9133-z"},{"key":"e_1_3_3_15_1","doi-asserted-by":"crossref","unstructured":"Bao L Humphreys J Peng T et al. (2024) Deep reinforcement learning for bipedal locomotion: a brief survey. arXiv preprint arXiv:2401.16889.","DOI":"10.1007\/s10462-025-11451-z"},{"key":"e_1_3_3_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2013.6630926"},{"key":"e_1_3_3_17_1","unstructured":"Barth-Maron G Hoffman MW Budden D et al. (2018) Distributed distributional deterministic policy gradients. arXiv preprint arXiv:1804.08617."},{"key":"e_1_3_3_18_1","volume-title":"Spot Specifications","author":"BDI","year":"2015","unstructured":"BDI (2015) Spot Specifications. URL. https:\/\/support.bostondynamics.com\/s\/article\/Robot-specifications."},{"key":"e_1_3_3_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561396"},{"key":"e_1_3_3_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3218167"},{"key":"e_1_3_3_21_1","unstructured":"Bellegarda G Nguyen C Nguyen Q (2020) Robust quadruped jumping via deep reinforcement learning. 
arXiv preprint arXiv:2011.07089."},{"key":"e_1_3_3_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9982132"},{"key":"e_1_3_3_23_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.153.3731.34"},{"key":"e_1_3_3_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CASE49439.2021.9551430"},{"key":"e_1_3_3_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2019.2899750"},{"key":"e_1_3_3_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9636371"},{"key":"e_1_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2018.8593885"},{"key":"e_1_3_3_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9290(89)90224-8"},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.3011379"},{"key":"e_1_3_3_30_1","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2022.854212"},{"key":"e_1_3_3_31_1","first-page":"7","article-title":"Generalization in reinforcement learning: safely approximating the value function","volume":"7","author":"Boyan J","year":"1994","unstructured":"Boyan J, Moore A (1994) Generalization in reinforcement learning: safely approximating the value function. Advances in Neural Information Processing Systems 7: 7.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981648"},{"key":"e_1_3_3_33_1","unstructured":"Brohan A Brown N Carbajal J et al. (2023) Rt-2: vision-language-action models transfer web knowledge to robotic control. 
arXiv preprint arXiv:2307.15818."},{"key":"e_1_3_3_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-008-9099-2"},{"key":"e_1_3_3_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2006.281802"},{"key":"e_1_3_3_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2009.5354681"},{"key":"e_1_3_3_37_1","first-page":"1","article-title":"Sample-efficient reinforcement learning with stochastic ensemble value expansion","volume":"31","author":"Buckman J","year":"2018","unstructured":"Buckman J, Hafner D, Tucker G, et al. (2018) Sample-efficient reinforcement learning with stochastic ensemble value expansion. Advances in Neural Information Processing Systems 31: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1998.677408"},{"key":"e_1_3_3_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1999.770456"},{"key":"e_1_3_3_40_1","doi-asserted-by":"publisher","DOI":"10.3390\/s24154981"},{"key":"e_1_3_3_41_1","first-page":"35","volume-title":"International Conference on Robot Intelligence Technology and Applications","author":"Byun JW","year":"2021","unstructured":"Byun JW, Youm D, Jeon S, et al. (2021) Learning footstep planning for the quadrupedal locomotion with model predictive control. In: International Conference on Robot Intelligence Technology and Applications. Springer, 35\u201343."},{"key":"e_1_3_3_42_1","unstructured":"Caluwaerts K Iscen A Kew JC et al. (2023) Barkour: benchmarking animal-level agility with quadruped robots. arXiv preprint arXiv:2305.14654."},{"key":"e_1_3_3_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2974653"},{"key":"e_1_3_3_44_1","unstructured":"Caron S (2022) Awesome robot description. URL. 
https:\/\/github.com\/robot-descriptions\/awesome-robot-descriptions?tab=readme-ov-file."},{"key":"e_1_3_3_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/SII.2019.8700380"},{"key":"e_1_3_3_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10341263"},{"key":"e_1_3_3_47_1","doi-asserted-by":"crossref","unstructured":"Chamorro S Klemm V Valls I et al. (2024) Reinforcement learning for blind stair climbing with legged and wheeled-legged robots. arXiv preprint arXiv:2402.06143.","DOI":"10.1109\/ICRA57147.2024.10610069"},{"key":"e_1_3_3_48_1","doi-asserted-by":"crossref","unstructured":"Chane-Sane E Leziart P-A Flayols T et al. (2024) Cat: constraints as terminations for legged locomotion reinforcement learning. arXiv preprint arXiv:2403.18765.","DOI":"10.1109\/IROS58592.2024.10802334"},{"key":"e_1_3_3_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793789"},{"key":"e_1_3_3_50_1","first-page":"66","volume-title":"Conference on Robot Learning","author":"Chen D","year":"2020","unstructured":"Chen D, Zhou B, Koltun V, et al. (2020) Learning by cheating. In: Conference on Robot Learning. PMLR, 66\u201375."},{"key":"e_1_3_3_51_1","unstructured":"Chen X Wang C Zhou Z et al. (2021) Randomized ensembled double q-learning: learning fast without a model. arXiv preprint arXiv:2101.05982."},{"key":"e_1_3_3_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids57100.2023.10375154"},{"key":"e_1_3_3_53_1","doi-asserted-by":"crossref","unstructured":"Cheng J Vlastelica M Kolev P et al. (2023a) Learning diverse skills for local navigation under multi-constraint optimality. arXiv preprint arXiv:2310.02440.","DOI":"10.1109\/ICRA57147.2024.10611629"},{"key":"e_1_3_3_54_1","doi-asserted-by":"crossref","unstructured":"Cheng X Kumar A Pathak D. (2023b) Legs as manipulator: pushing quadrupedal agility beyond locomotion. 
arXiv preprint arXiv:2303.11330.","DOI":"10.1109\/ICRA48891.2023.10161470"},{"key":"e_1_3_3_55_1","doi-asserted-by":"crossref","unstructured":"Cheng X Shi K Agarwal A et al. (2023c) Extreme parkour with legged robots. arXiv preprint arXiv:2309.14341.","DOI":"10.1109\/ICRA57147.2024.10610200"},{"key":"e_1_3_3_56_1","doi-asserted-by":"crossref","unstructured":"Cheng X Ji Y Chen J et al. (2024) Expressive whole-body control for humanoid robots. arXiv preprint arXiv:2402.16796.","DOI":"10.15607\/RSS.2024.XX.107"},{"key":"e_1_3_3_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS47582.2021.9555782"},{"key":"e_1_3_3_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794207"},{"key":"e_1_3_3_59_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.ade2256"},{"key":"e_1_3_3_60_1","first-page":"63","article-title":"Deep reinforcement learning in a handful of trials using probabilistic dynamics models","volume":"31","author":"Chua K","year":"2018","unstructured":"Chua K, Calandra R, McAllister R, et al. (2018) Deep reinforcement learning in a handful of trials using probabilistic dynamics models. Advances in Neural Information Processing Systems 31: 63.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3068769"},{"key":"e_1_3_3_62_1","volume-title":"Motion Imitation","author":"Coumans E","year":"2020","unstructured":"Coumans E (2020) Motion Imitation. URL. https:\/\/github.com\/erwincoumans\/motion_imitation."},{"key":"e_1_3_3_63_1","article-title":"Pybullet, a python module for physics simulation for games","author":"Coumans E","year":"2016","unstructured":"Coumans E, Bai Y (2016) Pybullet, a python module for physics simulation for games. 
Robotics and machine learning.","journal-title":"Robotics and machine learning"},{"key":"e_1_3_3_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3100269"},{"key":"e_1_3_3_65_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02551274"},{"key":"e_1_3_3_66_1","volume-title":"Xiaomi Launches Cyberdog \u2013 an Open Source Quadruped Robot Companion","author":"CyberDog","year":"2021","unstructured":"CyberDog (2021) Xiaomi Launches Cyberdog \u2013 an Open Source Quadruped Robot Companion. URL. https:\/\/www.mi.com\/global\/discover\/article?id=2069."},{"key":"e_1_3_3_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3061381"},{"key":"e_1_3_3_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9560742"},{"key":"e_1_3_3_69_1","first-page":"16","volume-title":"2nd Workshop on Closing the Reality Gap in Sim2Real Transfer for Robotics","author":"Dao J","year":"2020","unstructured":"Dao J, Duan H, Green K, et al. (2020) Learning to walk without dynamics randomization. In: 2nd Workshop on Closing the Reality Gap in Sim2Real Transfer for Robotics, p. 16."},{"key":"e_1_3_3_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA57147.2024.10610977"},{"key":"e_1_3_3_71_1","unstructured":"DARPA Robotics Challenge D (2015) The 2015 darpa robotics challenge finals. URL. https:\/\/www.youtube.com\/watch?v=8P9geWwi9e0."},{"key":"e_1_3_3_72_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2023.3236952"},{"key":"e_1_3_3_73_1","unstructured":"Deeprobotics (2021) Quadruped series from deep robotics. URL. https:\/\/www.deeprobotics.cn\/en\/index\/product1.html."},{"key":"e_1_3_3_74_1","doi-asserted-by":"publisher","DOI":"10.3389\/fnbot.2019.00006"},{"key":"e_1_3_3_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2014.7041373"},{"key":"e_1_3_3_76_1","volume-title":"IEEE-RAS International Conference on Humanoid Robots (Humanoids)","author":"Dh\u00e9din V","unstructured":"Dh\u00e9din V, Chinnakkonda Ravi AK, Jordana A, et al. 
Diffusion-based learning of contact plans for agile locomotion. In: IEEE-RAS International Conference on Humanoid Robots (Humanoids), arXiv\u20132403, 2024. URL. https:\/\/arxiv.org\/abs\/2403.03639."},{"key":"e_1_3_3_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2018.8594448"},{"key":"e_1_3_3_78_1","unstructured":"Disney (2023) A new approach to Disney\u2019s robotic character pipeline. URL. https:\/\/www.youtube.com\/watch?v=-cfIm06tcfA."},{"key":"e_1_3_3_79_1","volume-title":"Losing Humanity: The Case against Killer Robots","author":"Docherty BL","year":"2012","unstructured":"Docherty BL (2012) Losing Humanity: The Case against Killer Robots. Human Rights Watch."},{"key":"e_1_3_3_80_1","doi-asserted-by":"publisher","DOI":"10.20965\/jrm.2006.p0318"},{"key":"e_1_3_3_81_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCS.1987.1105273"},{"key":"e_1_3_3_82_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3056064"},{"key":"e_1_3_3_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561705"},{"key":"e_1_3_3_84_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981884"},{"key":"e_1_3_3_85_1","doi-asserted-by":"crossref","unstructured":"Duan H Pandit B Gadde MS et al. (2023) Learning vision-based bipedal locomotion for challenging terrain. arXiv preprint arXiv:2309.14594.","DOI":"10.1109\/ICRA57147.2024.10611621"},{"key":"e_1_3_3_86_1","unstructured":"Dynamics L (2024) Limx dynamics\u2019 biped robot p1 conquers the wild based on reinforcement learning. URL. https:\/\/www.youtube.com\/watch?v=UpNid_rWDnI."},{"key":"e_1_3_3_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2014.7041473"},{"key":"e_1_3_3_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981973"},{"key":"e_1_3_3_89_1","first-page":"1407","volume-title":"International Conference on Machine Learning","author":"Espeholt L","year":"2018","unstructured":"Espeholt L, Soyer H, Munos R, et al. 
(2018) Impala: scalable distributed deep-rl with importance weighted actor-learner architectures. In: International Conference on Machine Learning. PMLR, 1407\u20131416."},{"key":"e_1_3_3_90_1","unstructured":"Eysenbach B Gupta A Ibarz J et al. (2018) Diversity is all you need: learning skills without a reward function. arXiv preprint arXiv:1802.06070."},{"key":"e_1_3_3_91_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2013.6696352"},{"key":"e_1_3_3_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2017.8246930"},{"key":"e_1_3_3_93_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-74315-8"},{"key":"e_1_3_3_94_1","unstructured":"Feinberg V Wan A Stoica I et al. (2018) Model-based value estimation for efficient model-free reinforcement learning. arXiv preprint arXiv:1803.00101."},{"key":"e_1_3_3_95_1","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21559"},{"key":"e_1_3_3_96_1","unstructured":"Firoozi R Tucker J Tian S et al. (2023) Foundation models in robotics: applications challenges and the future. arXiv preprint arXiv:2312.07843."},{"key":"e_1_3_3_97_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-016-9573-1"},{"key":"e_1_3_3_98_1","unstructured":"Freeman CD Frey E Raichuk A et al. (2021) Brax\u2013a differentiable physics engine for large scale rigid body simulation. arXiv preprint arXiv:2106.13281."},{"key":"e_1_3_3_99_1","first-page":"138","volume-title":"Conference on Robot Learning","author":"Fu Z","year":"2023","unstructured":"Fu Z, Cheng X, Pathak D (2023) Deep whole-body control: learning a unified policy for manipulation and locomotion. In: Conference on Robot Learning. PMLR, 138\u2013149."},{"key":"e_1_3_3_100_1","unstructured":"Fu Z Zhao Q Wu Q et al. (2024) Humanoid shadowing and imitation from humans. 
arXiv preprint arXiv:2406.10454."},{"key":"e_1_3_3_101_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160562"},{"key":"e_1_3_3_102_1","first-page":"1587","volume-title":"International Conference on Machine Learning","author":"Fujimoto S","year":"2018","unstructured":"Fujimoto S, Hoof H, Meger D (2018) Addressing function approximation error in actor-critic methods. In: International Conference on Machine Learning. PMLR, 1587\u20131596."},{"key":"e_1_3_3_103_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2979656"},{"key":"e_1_3_3_104_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561639"},{"key":"e_1_3_3_105_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2022.3172469"},{"key":"e_1_3_3_106_1","unstructured":"Gazar A Khadiv M Del Prete A et al. (2023) Multi-contact stochastic predictive control for legged robots with contact locations uncertainty. arXiv preprint arXiv:2309.04469."},{"key":"e_1_3_3_107_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417766"},{"key":"e_1_3_3_108_1","volume-title":"5th International Conference on Climbing and Walking Robots","author":"Geyer H","year":"2002","unstructured":"Geyer H, Blickhan R, Seyfarth A (2002) Natural dynamics of spring-like running: emergence of selfstability. In: 5th International Conference on Climbing and Walking Robots. England: Professional Engineering Publishing Ltd Suffolk, Vol. 92."},{"issue":"2","key":"e_1_3_3_109_1","first-page":"67","article-title":"The theory of affordances","volume":"1","author":"Gibson JJ","year":"1977","unstructured":"Gibson JJ (1977) The theory of affordances. Hilldale, USA 1(2): 67\u201382.","journal-title":"Hilldale, USA"},{"key":"e_1_3_3_110_1","first-page":"817","volume-title":"Conference on Robot Learning","author":"Golemo F","year":"2018","unstructured":"Golemo F, Taiga AA, Courville A, et al. (2018) Sim-to-real transfer with neural-augmented robot simulation. In: Conference on Robot Learning. 
PMLR, 817\u2013828."},{"key":"e_1_3_3_111_1","volume-title":"Deep Learning","author":"Goodfellow I","year":"2016","unstructured":"Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT press."},{"key":"e_1_3_3_112_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-7194-9"},{"key":"e_1_3_3_113_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2023.3275384"},{"key":"e_1_3_3_114_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3066833"},{"key":"e_1_3_3_115_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2976639"},{"key":"e_1_3_3_116_1","unstructured":"Ha D Schmidhuber J. (2018) World models. arXiv preprint arXiv:1803.10122."},{"key":"e_1_3_3_117_1","doi-asserted-by":"publisher","DOI":"10.1109\/URAI.2018.8442201"},{"key":"e_1_3_3_118_1","unstructured":"Ha S Xu P Tan Z et al. (2020) Learning to walk in the real world with minimal human effort. arXiv preprint arXiv:2002.08550."},{"key":"e_1_3_3_119_1","first-page":"1851","volume-title":"International Conference on Machine Learning","author":"Haarnoja T","year":"2018","unstructured":"Haarnoja T, Hartikainen K, Abbeel P, et al. (2018a) Latent space policies for hierarchical reinforcement learning. In: International Conference on Machine Learning. PMLR, 1851\u20131860."},{"key":"e_1_3_3_120_1","unstructured":"Haarnoja T Zhou A Hartikainen K et al. (2018b) Soft actor-critic algorithms and applications. arXiv preprint arXiv:1812.05905."},{"key":"e_1_3_3_121_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2019.XV.011"},{"key":"e_1_3_3_122_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.adi8022"},{"key":"e_1_3_3_123_1","unstructured":"Hafner D Lillicrap T Ba J et al. (2019a) Dream to control: learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603."},{"key":"e_1_3_3_124_1","first-page":"2555","volume-title":"International Conference on Machine Learning","author":"Hafner D","year":"2019","unstructured":"Hafner D, Lillicrap T, Fischer I, et al. 
(2019b) Learning latent dynamics for planning from pixels. In: International Conference on Machine Learning. PMLR, 2555\u20132565."},{"key":"e_1_3_3_125_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3068951"},{"key":"e_1_3_3_126_1","doi-asserted-by":"crossref","unstructured":"Han L Zhu Q Sheng J et al. (2023) Lifelike agility and play on quadrupedal robots using reinforcement learning and generative pre-trained models. arXiv preprint arXiv:2308.15143.","DOI":"10.21203\/rs.3.rs-3309878\/v1"},{"key":"e_1_3_3_127_1","doi-asserted-by":"publisher","DOI":"10.1162\/106365603321828970"},{"key":"e_1_3_3_128_1","doi-asserted-by":"crossref","unstructured":"He T Zhang C Xiao W et al. (2024) Agile but safe: learning collision-free high-speed legged locomotion. arXiv preprint arXiv:2401.17583.","DOI":"10.15607\/RSS.2024.XX.059"},{"key":"e_1_3_3_129_1","unstructured":"Heess N Tb D Sriram S et al. (2017) Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286."},{"key":"e_1_3_3_130_1","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2017.8246895"},{"key":"e_1_3_3_131_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"e_1_3_3_132_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2010.5509181"},{"key":"e_1_3_3_133_1","unstructured":"Hiraoka T Imagawa T Hashimoto T et al. (2021) Dropout q-functions for doubly efficient reinforcement learning. arXiv preprint arXiv:2110.02034."},{"key":"e_1_3_3_134_1","first-page":"1","article-title":"Generative adversarial imitation learning","volume":"29","author":"Ho J","year":"2016","unstructured":"Ho J, Ermon S (2016) Generative adversarial imitation learning. 
Advances in Neural Information Processing Systems 29: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_135_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3068639"},{"key":"e_1_3_3_136_1","doi-asserted-by":"crossref","unstructured":"Hoeller D Rudin N Sako D et al. (2023) Anymal parkour: learning agile navigation for quadrupedal robots. arXiv preprint arXiv:2306.14874.","DOI":"10.1126\/scirobotics.adi7566"},{"key":"e_1_3_3_137_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2000.846489"},{"key":"e_1_3_3_138_1","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(91)90009-T"},{"key":"e_1_3_3_139_1","volume-title":"Dynamic Programming and Markov Processes","author":"Howard RA","year":"1960","unstructured":"Howard RA (1960) Dynamic Programming and Markov Processes. Technology Press of Massachusetts Institute of Technology."},{"key":"e_1_3_3_140_1","unstructured":"Howell TA Cleac\u2019h SL Kolter JZ et al. (2022) Dojo: a differentiable simulator for robotics. arXiv preprint arXiv:2203.00806."},{"key":"e_1_3_3_141_1","first-page":"1","article-title":"The safety filter: a unified view of safety-critical control in autonomous systems","volume":"7","author":"Hsu K-C","year":"2023","unstructured":"Hsu K-C, Hu H, Fisac JF (2023) The safety filter: a unified view of safety-critical control in autonomous systems. Annual Review of Control, Robotics, and Autonomous Systems 7: 1.","journal-title":"Annual Review of Control, Robotics, and Autonomous Systems"},{"key":"e_1_3_3_142_1","unstructured":"Hu Y Anderson L Li T-M et al. (2019) Difftaichi: differentiable programming for physical simulation. arXiv preprint arXiv:1910.00935."},{"key":"e_1_3_3_143_1","unstructured":"Hu Y Xie Q Jain V et al. Toward general-purpose robots via foundation models: a survey and meta-analysis. 
arXiv preprint arXiv:2312.08782."},{"key":"e_1_3_3_144_1","article-title":"Reward-adaptive reinforcement learning: dynamic policy gradient optimization for bipedal locomotion","author":"Huang C","year":"2022","unstructured":"Huang C, Wang G, Zhou Z, et al. (2022a) Reward-adaptive reinforcement learning: dynamic policy gradient optimization for bipedal locomotion. IEEE Transactions on Pattern Analysis and Machine Intelligence.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"274","key":"e_1_3_3_145_1","first-page":"1","article-title":"Cleanrl: high-quality single-file implementations of deep reinforcement learning algorithms","volume":"23","author":"Huang S","year":"2022","unstructured":"Huang S, Dossa RFJ, Ye C, et al. (2022b) Cleanrl: high-quality single-file implementations of deep reinforcement learning algorithms. Journal of Machine Learning Research 23(274): 1\u201318, URL. https:\/\/jmlr.org\/papers\/v23\/21-1342.html.","journal-title":"Journal of Machine Learning 
Research"},{"key":"e_1_3_3_146_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10341936"},{"key":"e_1_3_3_147_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364916648388"},{"key":"e_1_3_3_148_1","doi-asserted-by":"publisher","DOI":"10.1142\/9789814415958_0062"},{"key":"e_1_3_3_149_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913519834"},{"key":"e_1_3_3_150_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2016.7758092"},{"key":"e_1_3_3_151_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2792536"},{"key":"e_1_3_3_152_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.aau5872"},{"key":"e_1_3_3_153_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364920987859"},{"key":"e_1_3_3_154_1","doi-asserted-by":"publisher","DOI":"10.1007\/s004220000211"},{"key":"e_1_3_3_155_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2008.03.014"},{"key":"e_1_3_3_156_1","first-page":"916","volume-title":"Conference on Robot Learning","author":"Iscen A","year":"2018","unstructured":"Iscen A, Caluwaerts K, Tan J, et al. (2018) Policies modulating trajectory generators. In: Conference on Robot Learning. PMLR, 916\u2013926."},{"key":"e_1_3_3_157_1","first-page":"1","article-title":"When to trust your model: model-based policy optimization","volume":"32","author":"Janner M","year":"2019","unstructured":"Janner M, Fu J, Zhang M, et al. (2019) When to trust your model: model-based policy optimization. Advances in Neural Information Processing Systems 32: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_158_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.adh5401"},{"key":"e_1_3_3_159_1","unstructured":"Jeon S Jung M Choi S et al. (2023a) Learning whole-body manipulation for quadrupedal robot. 
arXiv preprint arXiv:2308.16820."},{"key":"e_1_3_3_160_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160885"},{"key":"e_1_3_3_161_1","unstructured":"Jeong R Kay J Romano F et al. (2019) Modelling generalized forces with reinforcement learning for sim-to-real transfer. arXiv preprint arXiv:1910.09471."},{"key":"e_1_3_3_162_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3151396"},{"key":"e_1_3_3_163_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981984"},{"key":"e_1_3_3_164_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160325"},{"key":"e_1_3_3_165_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561731"},{"key":"e_1_3_3_166_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794127"},{"key":"e_1_3_3_167_1","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21674"},{"key":"e_1_3_3_168_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCME.2013.6548324"},{"key":"e_1_3_3_169_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2001.973365"},{"key":"e_1_3_3_170_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2003.1241826"},{"key":"e_1_3_3_171_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910388677"},{"key":"e_1_3_3_172_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9635838"},{"key":"e_1_3_3_173_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2023.3307008"},{"key":"e_1_3_3_174_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793865"},{"key":"e_1_3_3_175_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794436"},{"key":"e_1_3_3_176_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11044-018-09653-1"},{"key":"e_1_3_3_177_1","volume-title":"Learning for Dynamics and Control Conference","author":"Khadiv M","year":"2023","unstructured":"Khadiv M, Meduri A, Zhu H, et al. (2023) Learning locomotion skills from mpc in sensor space. In: Learning for Dynamics and Control Conference. 
PMLR."},{"key":"e_1_3_3_178_1","first-page":"406","volume-title":"2020 IEEE International Conference on Robotics and Automation (ICRA)","author":"Kim GS","year":"2020","unstructured":"Kim GS, Kim S (2020) Extracting legged locomotion heuristics with regularized predictive control. In: 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 406\u2013412."},{"key":"e_1_3_3_179_1","doi-asserted-by":"publisher","DOI":"10.1109\/UR52253.2021.9494694"},{"key":"e_1_3_3_180_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550471.3564762"},{"key":"e_1_3_3_181_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2023.3304561"},{"key":"e_1_3_3_182_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10341987"},{"key":"e_1_3_3_183_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10341389"},{"key":"e_1_3_3_184_1","doi-asserted-by":"crossref","unstructured":"Kim Y Oh H Lee J et al. (2023d) Not only rewards but also constraints: applications on legged robot locomotion. arXiv preprint arXiv:2308.12517.","DOI":"10.1109\/TRO.2024.3400935"},{"key":"e_1_3_3_185_1","unstructured":"Kira Z (2022) Awesome-llm-robotics. URL. https:\/\/github.com\/GT-RIPL\/Awesome-LLM-Robotics."},{"key":"e_1_3_3_186_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10341709"},{"key":"e_1_3_3_187_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2015.7353843"},{"key":"e_1_3_3_188_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2004.1307456"},{"key":"e_1_3_3_189_1","first-page":"1","article-title":"Hierarchical apprenticeship learning with application to quadruped locomotion","volume":"20","author":"Kolter J","year":"2007","unstructured":"Kolter J, Abbeel P, Ng A (2007) Hierarchical apprenticeship learning with application to quadruped locomotion. 
Advances in Neural Information Processing Systems 20: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_190_1","unstructured":"Kovalev V Shkromada A Ouerdane H et al. () Combining model-predictive control and predictive reinforcement learning for stable quadrupedal robot locomotion. arXiv preprint arXiv:2307.07752."},{"key":"e_1_3_3_191_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3143227"},{"key":"e_1_3_3_192_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-015-9479-3"},{"key":"e_1_3_3_193_1","doi-asserted-by":"crossref","unstructured":"Kumar A Fu Z Pathak D et al. (2021) Rma: rapid motor adaptation for legged robots. arXiv preprint arXiv:2107.04034.","DOI":"10.15607\/RSS.2021.XVII.011"},{"key":"e_1_3_3_194_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2023.3286171"},{"key":"e_1_3_3_195_1","doi-asserted-by":"publisher","DOI":"10.1145\/3386569.3392432"},{"key":"e_1_3_3_196_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3062323"},{"key":"e_1_3_3_197_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_3_3_198_1","unstructured":"Lee G Hou B Mandalika A et al. (2018a) Bayesian policy optimization for model uncertainty. arXiv preprint arXiv:1810.01014."},{"key":"e_1_3_3_199_1","doi-asserted-by":"publisher","DOI":"10.21105\/joss.00500"},{"key":"e_1_3_3_200_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.abc5986"},{"key":"e_1_3_3_201_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981952"},{"key":"e_1_3_3_202_1","doi-asserted-by":"crossref","unstructured":"Lee J Schroth L Klemm V et al. (2023) Evaluation of constrained reinforcement learning algorithms for legged locomotion. 
arXiv preprint arXiv:2309.15430.","DOI":"10.1109\/IROS58592.2024.10801341"},{"key":"e_1_3_3_203_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.adi9641"},{"key":"e_1_3_3_204_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9196727"},{"key":"e_1_3_3_205_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913478990"},{"key":"e_1_3_3_206_1","unstructured":"letter (2022) General purpose robots should not be weaponized: an open letter to the robotics industry and our communities. URL. https:\/\/shorturl.at\/f9LhG."},{"key":"e_1_3_3_207_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9196642"},{"key":"e_1_3_3_208_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3061322"},{"key":"e_1_3_3_209_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3062342"},{"key":"e_1_3_3_210_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160421"},{"key":"e_1_3_3_211_1","first-page":"342","volume-title":"Conference on Robot Learning","author":"Li C","year":"2023","unstructured":"Li C, Vlastelica M, Blaes S, et al. (2023b) Learning agile skills via adversarial imitation of rough partial demonstrations. In: Conference on Robot Learning. PMLR, 342\u2013352."},{"key":"e_1_3_3_212_1","unstructured":"Li T Jung H Gombolay M et al. (2023c) Crossloco: human motion driven control of legged robots via guided unsupervised reinforcement learning. arXiv preprint arXiv:2309.17046."},{"key":"e_1_3_3_213_1","doi-asserted-by":"publisher","DOI":"10.3390\/robotics12030090"},{"key":"e_1_3_3_214_1","unstructured":"Li C Stanger-Jones E Heim S et al. (2024a) Fld: fourier latent dynamics for structured motion representation and learning. arXiv preprint arXiv:2402.13820."},{"key":"e_1_3_3_215_1","doi-asserted-by":"crossref","unstructured":"Li Z Peng XB Abbeel P et al. (2024b) Reinforcement learning for versatile dynamic and robust bipedal locomotion control. 
arXiv preprint arXiv:2401.16889.","DOI":"10.1177\/02783649241285161"},{"key":"e_1_3_3_216_1","unstructured":"Liang J Xia F Yu W et al. (2024) Learning to learn faster from human feedback with language model predictive control. arXiv preprint arXiv:2402.11450."},{"key":"e_1_3_3_217_1","unstructured":"Lidec QL Montaut L Schmid C et al. (2022) Augmenting differentiable physics with randomized smoothing. arXiv preprint arXiv:2206.11884."},{"key":"e_1_3_3_218_1","unstructured":"Lidec QL Jallet W Montaut L et al. (2023) Contact models in robotics: a comparative analysis. arXiv preprint arXiv:2304.06372."},{"key":"e_1_3_3_219_1","unstructured":"Lillicrap TP Hunt JJ Pritzel A et al. (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971."},{"key":"e_1_3_3_220_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9811790"},{"key":"e_1_3_3_221_1","first-page":"13644","volume-title":"International Conference on Machine Learning","author":"Liu Z","year":"2022","unstructured":"Liu Z, Cen Z, Isenbaev V, et al. (2022b) Constrained variational policy optimization for safe reinforcement learning. In: International Conference on Machine Learning. PMLR, 13644\u201313668."},{"key":"e_1_3_3_222_1","unstructured":"Liu M Chen Z Cheng X et al. (2024) Visual whole-body control for legged loco-manipulation. arXiv preprint arXiv:2403.16967."},{"key":"e_1_3_3_223_1","unstructured":"Luck KS Campbell J Jansen MA et al. (2017) From the lab to the desert: fast prototyping and learning of robot locomotion. arXiv preprint arXiv:1706.01977."},{"key":"e_1_3_3_224_1","doi-asserted-by":"crossref","unstructured":"Lykov A Litvinov M Konenkov M et al. (2024) Cognitivedog: large multimodal model based system to translate vision and language into action of quadruped robot. 
arXiv preprint arXiv:2401.09388.","DOI":"10.1145\/3610978.3641080"},{"key":"e_1_3_3_225_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3143567"},{"key":"e_1_3_3_226_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2024.XX.094"},{"key":"e_1_3_3_227_1","unstructured":"Makoviychuk V Wawrzyniak L Guo Y et al. (2021) Isaac gym: high performance gpu-based physics simulation for robot learning. arXiv preprint arXiv:2108.10470."},{"key":"e_1_3_3_228_1","first-page":"1","article-title":"Simple random search of static linear policies is competitive for reinforcement learning","volume":"31","author":"Mania H","year":"2018","unstructured":"Mania H, Guy A, Recht B (2018) Simple random search of static linear policies is competitive for reinforcement learning. Advances in Neural Information Processing Systems 31: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_229_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3057055"},{"key":"e_1_3_3_230_1","first-page":"22","volume-title":"Conference on Robot Learning","author":"Margolis GB","year":"2023","unstructured":"Margolis GB, Agrawal P (2023) Walk these ways: tuning robot control for generalization with multiplicity of behavior. In: Conference on Robot Learning. PMLR, 22\u201331."},{"key":"e_1_3_3_231_1","unstructured":"Margolis GB Chen T Paigwar K et al. (2021) Learning to jump from pixels. arXiv preprint arXiv:2110.15344."},{"key":"e_1_3_3_232_1","unstructured":"Margolis GB Fu X Ji Y et al. (2023) Learning to see physical properties with active sensing motor policies. arXiv preprint arXiv:2311.01405."},{"key":"e_1_3_3_233_1","doi-asserted-by":"publisher","DOI":"10.1177\/02783649231224053"},{"key":"e_1_3_3_234_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1996.506951"},{"key":"e_1_3_3_235_1","article-title":"Agile maneuvers in legged robots: a predictive control approach","author":"Mastalli C","year":"2023","unstructured":"Mastalli C, Merkt W, Xin G, et al. 
(2023) Agile maneuvers in legged robots: a predictive control approach. IEEE Transactions on Robotics.","journal-title":"IEEE Transactions on Robotics"},{"key":"e_1_3_3_236_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3203845"},{"key":"e_1_3_3_237_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2022.3228390"},{"key":"e_1_3_3_238_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.abk2822"},{"key":"e_1_3_3_239_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981507"},{"key":"e_1_3_3_240_1","doi-asserted-by":"crossref","unstructured":"Miki T Lee J Wellhausen L et al. (2024) Learning to walk in confined spaces using 3d representation. arXiv preprint arXiv:2403.00187.","DOI":"10.1109\/ICRA57147.2024.10610271"},{"key":"e_1_3_3_241_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2010.5509646"},{"key":"e_1_3_3_242_1","unstructured":"MJX (2023) Mujoco3. URL. https:\/\/mujoco.readthedocs.io\/en\/stable\/mjx.html."},{"key":"e_1_3_3_243_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_3_244_1","first-page":"1928","volume-title":"International Conference on Machine Learning","author":"Mnih V","year":"2016","unstructured":"Mnih V, Badia AP, Mirza M, et al. (2016) Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning. PMLR, 1928\u20131937."},{"key":"e_1_3_3_245_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185539"},{"key":"e_1_3_3_246_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2005.1570469"},{"key":"e_1_3_3_247_1","volume-title":"Ethics of Artificial Intelligence and Robotics","author":"M\u00fcller VC","year":"2020","unstructured":"M\u00fcller VC (2020) Ethics of Artificial Intelligence and Robotics. 
Stanford Encyclopedia of Philosophy."},{"key":"e_1_3_3_248_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3052391"},{"key":"e_1_3_3_249_1","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21578"},{"key":"e_1_3_3_250_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910387457"},{"key":"e_1_3_3_251_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8463189"},{"key":"e_1_3_3_252_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10161144"},{"key":"e_1_3_3_253_1","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-control-072220-093055"},{"key":"e_1_3_3_254_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-7194-9_15-1"},{"key":"e_1_3_3_255_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910390538"},{"key":"e_1_3_3_256_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2016.7487274"},{"key":"e_1_3_3_257_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2800124"},{"key":"e_1_3_3_258_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202315"},{"key":"e_1_3_3_259_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids57100.2023.10375218"},{"key":"e_1_3_3_260_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3176105"},{"key":"e_1_3_3_261_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2023.3300230"},{"key":"e_1_3_3_262_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2000.844095"},{"key":"e_1_3_3_263_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2015.XI.047"},{"key":"e_1_3_3_264_1","doi-asserted-by":"publisher","DOI":"10.1145\/3099564.3099567"},{"key":"e_1_3_3_265_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073602"},{"issue":"4","key":"e_1_3_3_266_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3197517.3201311","article-title":"Deepmimic: example-guided deep reinforcement learning of physics-based character skills","volume":"37","author":"Peng XB","year":"2018","unstructured":"Peng XB, Abbeel P, Levine S, et al. 
(2018a) Deepmimic: example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics 37(4): 1\u201314.","journal-title":"ACM Transactions on Graphics"},{"key":"e_1_3_3_267_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460528"},{"key":"e_1_3_3_268_1","unstructured":"Peng XB Coumans E Zhang T et al. (2020) Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784."},{"key":"e_1_3_3_269_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530110"},{"key":"e_1_3_3_270_1","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2018.2822058"},{"key":"e_1_3_3_271_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910387681"},{"key":"e_1_3_3_272_1","first-page":"1","article-title":"Alvinn: an autonomous land vehicle in a neural network","volume":"1","author":"Pomerleau DA","year":"1988","unstructured":"Pomerleau DA (1988) Alvinn: an autonomous land vehicle in a neural network. Advances in Neural Information Processing Systems 1: 1.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_273_1","volume-title":"IROS Workshop (Ed.), Compliant Manipulation: Challenges in Learning and Control","author":"Ponton B","year":"2014","unstructured":"Ponton B, Farshidian F, Buchli J (2014) Learning compliant locomotion on a quadruped robot. In IROS Workshop (Ed.), Compliant Manipulation: Challenges in Learning and Control. 
Citeseer."},{"key":"e_1_3_3_274_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2020.3048125"},{"key":"e_1_3_3_275_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913506757"},{"key":"e_1_3_3_276_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1997.620037"},{"key":"e_1_3_3_277_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids58906.2024.10769799"},{"key":"e_1_3_3_278_1","doi-asserted-by":"publisher","DOI":"10.1126\/scirobotics.adi9579"},{"key":"e_1_3_3_279_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1984.6313238"},{"key":"e_1_3_3_280_1","doi-asserted-by":"publisher","DOI":"10.1109\/MEX.1986.4307016"},{"key":"e_1_3_3_281_1","doi-asserted-by":"publisher","DOI":"10.3182\/20080706-5-KR-1001.01833"},{"key":"e_1_3_3_282_1","volume-title":"Handbook of Model Predictive Control","author":"Rakovi\u0107 SV","year":"2018","unstructured":"Rakovi\u0107 SV, Levine WS (2018) Handbook of Model Predictive Control. Springer."},{"key":"e_1_3_3_283_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561444"},{"key":"e_1_3_3_284_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364912469821"},{"key":"e_1_3_3_285_1","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2017.2787267"},{"key":"e_1_3_3_286_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12206-012-0219-8"},{"key":"e_1_3_3_287_1","first-page":"627","volume-title":"Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics","author":"Ross S","year":"2011","unstructured":"Ross S, Gordon G, Bagnell D (2011) A reduction of imitation learning and structured prediction to no-regret online learning. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 627\u2013635."},{"key":"e_1_3_3_288_1","unstructured":"Rudin N (2021a) Isaac gym environments for legged robots. URL. 
https:\/\/github.com\/leggedrobotics\/legged_gym."},{"key":"e_1_3_3_289_1","volume-title":"Rsl rl","author":"Rudin N","year":"2021","unstructured":"Rudin N (2021b) Rsl rl. URL. https:\/\/github.com\/leggedrobotics\/rsl_rl.git."},{"key":"e_1_3_3_290_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981198"},{"key":"e_1_3_3_291_1","first-page":"91","volume-title":"Conference on Robot Learning","author":"Rudin N","year":"2022","unstructured":"Rudin N, Hoeller D, Reist P, et al. (2022b) Learning to walk in minutes using massively parallel deep reinforcement learning. In: Conference on Robot Learning. PMLR, 91\u2013100."},{"key":"e_1_3_3_292_1","volume-title":"On-line Q-Learning Using Connectionist Systems","author":"Rummery GA","year":"1994","unstructured":"Rummery GA, Niranjan M (1994) On-line Q-Learning Using Connectionist Systems. Cambridge, UK: University of Cambridge, Department of Engineering, Vol. 37."},{"key":"e_1_3_3_293_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-022-00505-4"},{"key":"e_1_3_3_294_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74024-7_9"},{"key":"e_1_3_3_295_1","doi-asserted-by":"crossref","unstructured":"Sarmadi A Krishnamurthy P Khorrami F (2023) High-dimensional controller tuning through latent representations. arXiv preprint arXiv:2309.12487.","DOI":"10.1109\/ICRA57147.2024.10610607"},{"key":"e_1_3_3_296_1","unstructured":"Schaal S (2009) The sl simulation and real-time control software package. Citeseer. Technical report."},{"key":"e_1_3_3_297_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-7194-9_143-1"},{"key":"e_1_3_3_298_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2021.3106832"},{"key":"e_1_3_3_299_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2010.5650765"},{"key":"e_1_3_3_300_1","first-page":"1889","volume-title":"International Conference on Machine Learning","author":"Schulman J","year":"2015","unstructured":"Schulman J, Levine S, Abbeel P, et al. 
(2015) Trust region policy optimization. In: International Conference on Machine Learning. PMLR, 1889\u20131897."},{"key":"e_1_3_3_301_1","unstructured":"Schulman J Wolski F Dhariwal P et al. (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347."},{"key":"e_1_3_3_302_1","first-page":"2594","volume-title":"Proceedings of the 7th Conference on Robot Learning","author":"Schwarke C","year":"2023","unstructured":"Schwarke C, Klemm V, Van der Boon M, et al. (2023) Curiosity-driven learning of joint locomotion and manipulation tasks. In Proceedings of the 7th Conference on Robot Learning. PMLR, Vol. 229, 2594\u20132610."},{"key":"e_1_3_3_303_1","unstructured":"Schwarke C Klemm V Tordesillas J et al. (2024) Learning quadrupedal locomotion via differentiable simulation. arXiv preprint arXiv:2404.02887."},{"key":"e_1_3_3_304_1","volume-title":"Doctor of Philosophy (Ph. D.)","author":"Semini C","year":"2010","unstructured":"Semini C (2010) Hyq-design and development of a hydraulically actuated quadruped robot. In: Doctor of Philosophy (Ph. D.). University of Genoa, Italy."},{"key":"e_1_3_3_305_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMECH.2016.2616284"},{"key":"e_1_3_3_306_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids57100.2023.10375203"},{"key":"e_1_3_3_307_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMECH.2014.2339013"},{"key":"e_1_3_3_308_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160706"},{"key":"e_1_3_3_309_1","doi-asserted-by":"crossref","unstructured":"Sharma A Ahn M Levine S et al. (2020) Emergent real-world robotic skills via unsupervised off-policy reinforcement learning. 
arXiv preprint arXiv:2004.12974.","DOI":"10.15607\/RSS.2020.XVI.053"},{"key":"e_1_3_3_310_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3045792"},{"key":"e_1_3_3_311_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9811755"},{"key":"e_1_3_3_312_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910388315"},{"key":"e_1_3_3_313_1","unstructured":"Sidor z. (2021) Stable baselines. URL. https:\/\/github.com\/hill-a\/stable-baselines."},{"key":"e_1_3_3_314_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9561814"},{"key":"e_1_3_3_315_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3068908"},{"key":"e_1_3_3_316_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-7194-9_22-1"},{"key":"e_1_3_3_317_1","unstructured":"Smith R (2005) Open dynamics engine. URL. https:\/\/www.ode.org\/."},{"key":"e_1_3_3_318_1","doi-asserted-by":"crossref","unstructured":"Smith L Kostrikov I Levine S. (2022) A walk in the park: learning to walk in 20 minutes with model-free reinforcement learning. arXiv preprint arXiv:2208.07860.","DOI":"10.15607\/RSS.2023.XIX.056"},{"key":"e_1_3_3_319_1","doi-asserted-by":"crossref","unstructured":"Smith L Kew JC Li T et al. (2023) Learning and adapting agile locomotion skills by transferring experience. arXiv preprint arXiv:2304.09834.","DOI":"10.15607\/RSS.2023.XIX.051"},{"key":"e_1_3_3_320_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3145947"},{"key":"e_1_3_3_321_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913489205"},{"key":"e_1_3_3_322_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3146931"},{"key":"e_1_3_3_323_1","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton RS","year":"2018","unstructured":"Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction. MIT press."},{"key":"e_1_3_3_324_1","doi-asserted-by":"crossref","unstructured":"Tan J Zhang T Coumans E et al. 
(2018) Sim-to-real: learning agile locomotion for quadruped robots. arXiv preprint arXiv:1804.10332.","DOI":"10.15607\/RSS.2018.XIV.010"},{"key":"e_1_3_3_325_1","unstructured":"Tang Y Yu W Tan J et al. (2023) Language to quadrupedal locomotion. arXiv preprint arXiv:2306.07580."},{"key":"e_1_3_3_326_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS55552.2023.10342440"},{"key":"e_1_3_3_327_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/9123.003.0026"},{"key":"e_1_3_3_328_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386025"},{"key":"e_1_3_3_329_1","unstructured":"Team G Anil R Borgeaud S et al. (2023) Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805."},{"key":"e_1_3_3_330_1","first-page":"1939","volume-title":"Proceedings of the Fourteenth Yale Workshop on Adaptive and Learning Systems","author":"Tedrake R","year":"2005","unstructured":"Tedrake R, Zhang TW, Seung HS, et al. (2005) Learning to walk in 20 minutes. In: Proceedings of the Fourteenth Yale Workshop on Adaptive and Learning Systems. Beijing, Vol. 95585, p. 1939."},{"key":"e_1_3_3_331_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3070252"},{"key":"e_1_3_3_332_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2010.5509336"},{"key":"e_1_3_3_333_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202133"},{"key":"e_1_3_3_334_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386109"},{"key":"e_1_3_3_335_1","unstructured":"Touvron H Lavril T Izacard G et al. (2023) Llama: open and efficient foundation language models. arXiv preprint arXiv:2302.13971."},{"key":"e_1_3_3_336_1","first-page":"859","volume-title":"Conference on Robot Learning","author":"Truong J","year":"2023","unstructured":"Truong J, Rudolph M, Yokoyama NH, et al. (2023) Rethinking sim2real: lower fidelity simulation leads to higher sim2real transfer in navigation. In: Conference on Robot Learning. 
PMLR, 859\u2013870."},{"key":"e_1_3_3_337_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2979660"},{"key":"e_1_3_3_338_1","unstructured":"Unitree (2021a) Unitree legged robots. URL. https:\/\/www.unitree.com."},{"key":"e_1_3_3_339_1","unstructured":"Unitree (2021b) Unitree b2 go beyond the limits. URL. https:\/\/www.unitree.com\/b2\/."},{"key":"e_1_3_3_340_1","unstructured":"Unitree (2021c) Unitree h1 the world\u2019s first full-size motor drive humanoid robot flips on ground. URL. https:\/\/www.youtube.com\/watch?v=V1LyWsiTgms."},{"key":"e_1_3_3_341_1","first-page":"11","article-title":"Neural discrete representation learning","volume":"30","author":"Van Den Oord A","year":"2017","unstructured":"Van Den Oord A, Vinyals O, Kavukcuoglu K, et al. (2017) Neural discrete representation learning. Advances in Neural Information Processing Systems 30: 11.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_342_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"e_1_3_3_343_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9562022"},{"key":"e_1_3_3_344_1","first-page":"931","volume-title":"Learning for Dynamics and Control Conference","author":"Viereck J","year":"2022","unstructured":"Viereck J, Meduri A, Righetti L (2022) Valuenetqp: learned one-step optimal control for legged locomotion. In: Learning for Dynamics and Control Conference. PMLR, 931\u2013942."},{"key":"e_1_3_3_345_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9197312"},{"key":"e_1_3_3_346_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160751"},{"key":"e_1_3_3_347_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCS.2023.3291885"},{"key":"e_1_3_3_348_1","first-page":"1995","volume-title":"International Conference on Machine Learning","author":"Wang Z","year":"2016","unstructured":"Wang Z, Schaul T, Hessel M, et al. (2016) Dueling network architectures for deep reinforcement learning. 
In: International Conference on Machine Learning. PMLR, 1995\u20132003."},{"key":"e_1_3_3_349_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981234"},{"key":"e_1_3_3_350_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992698"},{"key":"e_1_3_3_351_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2016.2640183"},{"key":"e_1_3_3_352_1","article-title":"Optimization-based control for dynamic legged robots","author":"Wensing PM","year":"2023","unstructured":"Wensing PM, Posa M, Hu Y, et al. (2023) Optimization-based control for dynamic legged robots. IEEE Transactions on Robotics.","journal-title":"IEEE Transactions on Robotics"},{"key":"e_1_3_3_353_1","doi-asserted-by":"crossref","unstructured":"Werling K Omens D Lee J et al. (2021) Fast and feature-complete differentiable physics for articulated rigid bodies with contact. arXiv preprint arXiv:2103.16021.","DOI":"10.15607\/RSS.2021.XVII.034"},{"key":"e_1_3_3_354_1","first-page":"2444","volume-title":"Conference on Robot Learning","author":"Widmer D","year":"2023","unstructured":"Widmer D, Kang D, Sukhija B, et al. (2023) Tuning legged locomotion controllers via safe bayesian optimization. In: Conference on Robot Learning. PMLR, 2444\u20132464."},{"key":"e_1_3_3_355_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICHR.2006.321375"},{"key":"e_1_3_3_356_1","unstructured":"Wijmans E Kadian A Morcos A et al. (2019) Dd-ppo: learning near-perfect pointgoal navigators from 2.5 billion frames. arXiv preprint arXiv:1911.00357."},{"key":"e_1_3_3_357_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2798285"},{"key":"e_1_3_3_358_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530067"},{"key":"e_1_3_3_359_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2023.3290509"},{"key":"e_1_3_3_360_1","first-page":"2226","volume-title":"Conference on Robot Learning","author":"Wu P","year":"2023","unstructured":"Wu P, Escontrela A, Hafner D, et al. 
(2023b) Daydreamer: world models for physical robot learning. In: Conference on Robot Learning. PMLR, 2226\u20132240."},{"key":"e_1_3_3_361_1","doi-asserted-by":"crossref","unstructured":"Xiao X Liu J Wang Z et al. (2023) Robot learning in the era of foundation models: a survey. arXiv preprint arXiv:2311.14379.","DOI":"10.2139\/ssrn.4706193"},{"key":"e_1_3_3_362_1","first-page":"317","volume-title":"Conference on Robot Learning","author":"Xie Z","year":"2020","unstructured":"Xie Z, Clary P, Dao J, et al. (2020a) Learning locomotion skills for cassie: iterative design and sim-to-real. In: Conference on Robot Learning. PMLR, 317\u2013329."},{"key":"e_1_3_3_363_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14115"},{"key":"e_1_3_3_364_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48506.2021.9560837"},{"key":"e_1_3_3_365_1","first-page":"523","volume-title":"International Workshop on the Algorithmic Foundations of Robotics","author":"Xie Z","year":"2022","unstructured":"Xie Z, Da X, Babich B, et al. (2022) Glide: generalizable quadrupedal locomotion in diverse environments with a centroidal model. In: International Workshop on the Algorithmic Foundations of Robotics. Springer, 523\u2013539."},{"key":"e_1_3_3_366_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9811640"},{"key":"e_1_3_3_367_1","unstructured":"Xu M Huang P Yu W et al. (2023) Creative robot tool use with large language models. arXiv preprint arXiv:2310.13065."},{"key":"e_1_3_3_368_1","doi-asserted-by":"crossref","unstructured":"Xu Z Raj AH Xiao X et al. (2024) Dexterous legged locomotion in confined 3d spaces with reinforcement learning. 
arXiv preprint arXiv:2403.03848.","DOI":"10.1109\/ICRA57147.2024.10610668"},{"key":"e_1_3_3_369_1","doi-asserted-by":"publisher","DOI":"10.1037\/apl0001045"},{"key":"e_1_3_3_370_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2006.1641984"},{"key":"e_1_3_3_371_1","first-page":"1","volume-title":"Conference on Robot Learning","author":"Yang Y","year":"2020","unstructured":"Yang Y, Caluwaerts K, Iscen A, et al. (2020) Data efficient reinforcement learning for legged robots. In: Conference on Robot Learning. PMLR, 1\u201310."},{"key":"e_1_3_3_372_1","unstructured":"Yang R Zhang M Hansen N et al. (2021) Learning vision-guided quadrupedal locomotion end-to-end with cross-modal transformers. arXiv preprint arXiv:2107.03996."},{"key":"e_1_3_3_373_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9812154"},{"key":"e_1_3_3_374_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9982038"},{"key":"e_1_3_3_375_1","first-page":"773","volume-title":"Conference on Robot Learning","author":"Yang Y","year":"2022","unstructured":"Yang Y, Zhang T, Coumans E, et al. (2022c) Fast and efficient locomotion via learned gait transitions. In: Conference on Robot Learning. PMLR, 773\u2013783."},{"key":"e_1_3_3_376_1","unstructured":"Yang R Chen Z Ma J et al. (2023a) Generalized animal imitator: agile locomotion with versatile motion prior. arXiv preprint arXiv:2310.01408."},{"key":"e_1_3_3_377_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00144"},{"key":"e_1_3_3_378_1","first-page":"2205","volume-title":"Conference on Robot Learning","author":"Yang Y","year":"2023","unstructured":"Yang Y, Meng X, Yu W, et al. (2023c) Learning semantics-aware locomotion skills from human demonstration. In: Conference on Robot Learning. 
PMLR, 2205\u20132214."},{"key":"e_1_3_3_379_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids43949.2019.9035003"},{"key":"e_1_3_3_380_1","article-title":"Robust walking based on mpc with viability guarantees","author":"Yeganegi MH","year":"2021","unstructured":"Yeganegi MH, Khadiv M, Del Prete A, et al. (2021) Robust walking based on mpc with viability guarantees. IEEE Transactions on Robotics.","journal-title":"IEEE Transactions on Robotics"},{"key":"e_1_3_3_381_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2023.3320827"},{"key":"e_1_3_3_382_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2017.XIII.048"},{"key":"e_1_3_3_383_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS40897.2019.8968053"},{"key":"e_1_3_3_384_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2020.2974685"},{"key":"e_1_3_3_385_1","volume-title":"5th Annual Conference on Robot Learning","author":"Yu W","year":"2021","unstructured":"Yu W, Jain D, Escontrela A, et al. (2021) Visual-locomotion: learning to walk on complex terrains with vision. In: 5th Annual Conference on Robot Learning."},{"key":"e_1_3_3_386_1","first-page":"374","volume-title":"Conference on Robot Learning","author":"Yu W","year":"2023","unstructured":"Yu W, Gileadi N, Fu C, et al. (2023a) Language to rewards for robotic skill synthesis. In: Conference on Robot Learning. PMLR, 374\u2013404."},{"key":"e_1_3_3_387_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-023-00701-w"},{"key":"e_1_3_3_388_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2019.2901308"},{"key":"e_1_3_3_389_1","doi-asserted-by":"publisher","DOI":"10.20517\/ir.2022.20"},{"key":"e_1_3_3_390_1","doi-asserted-by":"crossref","unstructured":"Zhang C Rudin N Hoeller D et al. (2023) Learning agile locomotion on risky terrains. arXiv preprint arXiv:2311.10484.","DOI":"10.1109\/IROS58592.2024.10801909"},{"key":"e_1_3_3_391_1","doi-asserted-by":"crossref","unstructured":"Zhang J Heim S Jeon SH et al. 
(2024) Learning emergent gaits with decentralized phase oscillators: on the role of observations, rewards, and feedback. arXiv preprint arXiv:2402.08662.","DOI":"10.1109\/ICRA57147.2024.10611045"},{"key":"e_1_3_3_392_1","unstructured":"Zhou H Yao X Meng Y et al. (2023) Language-conditioned learning for robotic manipulation: a survey. arXiv preprint arXiv:2312.10807."},{"key":"e_1_3_3_393_1","volume-title":"Design of a Highly Dynamic Humanoid Robot","author":"Zhu T","year":"2023","unstructured":"Zhu T (2023) Design of a Highly Dynamic Humanoid Robot. Los Angeles: University of California."},{"key":"e_1_3_3_394_1","doi-asserted-by":"publisher","DOI":"10.1145\/3618397"},{"key":"e_1_3_3_395_1","unstructured":"Zhuang Z Fu Z Wang J et al. (2023) Robot parkour learning. arXiv preprint arXiv:2309.05665."},{"key":"e_1_3_3_396_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910390537"},{"key":"e_1_3_3_397_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910392608"}],"container-title":["The International Journal of Robotics 
Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/02783649241312698","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/02783649241312698","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/02783649241312698","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T10:17:40Z","timestamp":1777457860000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/02783649241312698"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,22]]},"references-count":396,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.1177\/02783649241312698"],"URL":"https:\/\/doi.org\/10.1177\/02783649241312698","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"value":"0278-3649","type":"print"},{"value":"1741-3176","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,22]]}}}