{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T11:32:48Z","timestamp":1769167968801,"version":"3.49.0"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,8,12]],"date-time":"2020-08-12T00:00:00Z","timestamp":1597190400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2020,8,31]]},"abstract":"<jats:p>Motion synthesis in a dynamic environment has been a long-standing problem for character animation. Methods using motion capture data tend to scale poorly in complex environments because of their larger capture and labeling requirements. Physics-based controllers are effective in this regard, albeit less controllable. In this paper, we present CARL, a quadruped agent that can be controlled with high-level directives and react naturally to dynamic environments. Starting with an agent that can imitate individual animation clips, we use Generative Adversarial Networks to adapt high-level controls, such as speed and heading, to action distributions that correspond to the original animations. Further fine-tuning through deep reinforcement learning enables the agent to recover from unseen external perturbations while producing smooth transitions. It then becomes straightforward to create autonomous agents in dynamic environments by adding navigation modules over the entire process. 
We evaluate our approach by measuring the agent's ability to follow user control and provide a visual analysis of the generated motion to show its effectiveness.<\/jats:p>","DOI":"10.1145\/3386569.3392433","type":"journal-article","created":{"date-parts":[[2020,8,12]],"date-time":"2020-08-12T11:44:27Z","timestamp":1597232667000},"update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":41,"title":["CARL"],"prefix":"10.1145","volume":"39","author":[{"given":"Ying-Sheng","family":"Luo","sequence":"first","affiliation":[{"name":"Inventec Corp., Taiwan"}]},{"given":"Jonathan Hans","family":"Soeseno","sequence":"additional","affiliation":[{"name":"Inventec Corp., Taiwan"}]},{"given":"Trista Pei-Chun","family":"Chen","sequence":"additional","affiliation":[{"name":"Inventec Corp., Taiwan"}]},{"given":"Wei-Chao","family":"Chen","sequence":"additional","affiliation":[{"name":"Skywatch Innovation Inc. and Inventec Corp., Taiwan"}]}],"member":"320","published-online":{"date-parts":[[2020,8,12]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"International Conference on Learning Representations.","author":"Bansal Trapit","year":"2018","unstructured":"Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, and Igor Mordatch. 2018. Emergent Complexity via Multi-Agent Competition. In International Conference on Learning Representations."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356536"},{"key":"e_1_2_2_3_1","volume-title":"arXiv preprint arXiv:1606.01540","author":"Brockman Greg","year":"2016","unstructured":"Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. 
arXiv preprint arXiv:1606.01540 (2016)."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1073204.1073248"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3274247.3274506"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00916"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1661412.1618516"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1833349.1781156"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964921.1964954"},{"key":"e_1_2_2_10_1","unstructured":"Erwin Coumans et al. 2013. Bullet physics library. Open source: bulletphysics.org (2013)."},{"key":"e_1_2_2_11_1","volume-title":"Computer Graphics Forum","author":"Silva Marco Da","unstructured":"Marco Da Silva, Yeuhi Abe, and Jovan Popovi\u0107. 2008. Simulation of human motion data using short-horizon model-predictive control. In Computer Graphics Forum, Vol. 27. Wiley Online Library, 371--380."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAR.2019.8813480"},{"key":"e_1_2_2_13_1","volume-title":"Shuo Chen, Ishaan Gulrajani, Chris Donahue, and Adam Roberts.","author":"Engel Jesse","year":"2019","unstructured":"Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, and Adam Roberts. 2019. Gansynth: Adversarial neural audio synthesis. arXiv preprint arXiv:1902.08710 (2019)."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/2919332.2919834"},{"key":"e_1_2_2_15_1","volume-title":"International Conference on Learning Representations.","author":"Fu Justin","year":"2018","unstructured":"Justin Fu, Katie Luo, and Sergey Levine. 2018. Learning robust rewards with adversarial inverse reinforcement learning. In International Conference on Learning Representations."},{"key":"e_1_2_2_16_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 
2014. Generative Adversarial nets. In Advances in neural information processing systems. 2672--2680."},{"key":"e_1_2_2_17_1","volume-title":"ACM transactions on graphics (TOG)","author":"Grochow Keith","unstructured":"Keith Grochow, Steven L Martin, Aaron Hertzmann, and Zoran Popovi\u0107. 2004. Style-based inverse kinematics. In ACM transactions on graphics (TOG), Vol. 23. ACM, 522--531."},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2767002"},{"key":"e_1_2_2_19_1","unstructured":"Nicolas Heess Srinivasan Sriram Jay Lemmon Josh Merel Greg Wayne Yuval Tassa Tom Erez Ziyu Wang SM Eslami Martin Riedmiller et al. 2017. Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 (2017)."},{"key":"e_1_2_2_20_1","unstructured":"Jonathan Ho and Stefano Ermon. 2016. Generative adversarial imitation learning. In Advances in neural information processing systems. 4565--4573."},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073663"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925975"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1002\/cav.1469"},{"key":"e_1_2_2_24_1","volume-title":"Learning agile and dynamic motor skills for legged robots. Science Robotics 4, 26","author":"Hwangbo Jemin","year":"2019","unstructured":"Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter. 2019. Learning agile and dynamic motor skills for legged robots. Science Robotics 4, 26 (2019), eaau5872."},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1186562.1015760"},{"key":"e_1_2_2_27_1","volume-title":"Motion Graphs. 
In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '02)","author":"Kovar Lucas","year":"2002","unstructured":"Lucas Kovar, Michael Gleicher, and Fr\u00e9d\u00e9ric Pighin. 2002. Motion Graphs. In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '02). ACM, 473--482."},{"key":"e_1_2_2_28_1","volume-title":"Jessica K Hodgins, and Nancy S Pollard.","author":"Lee Jehee","year":"2002","unstructured":"Jehee Lee, Jinxiang Chai, Paul SA Reitsma, Jessica K Hodgins, and Nancy S Pollard. 2002. Interactive control of avatars animated with human motion data. In ACM Transactions on Graphics (TOG), Vol. 21. ACM, 491--500."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275016"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1833349.1781155"},{"key":"e_1_2_2_31_1","volume-title":"Proceedings of the 11th ACM SIGGRAPH\/Eurographics conference on Computer Animation. Eurographics Association, 221--230","author":"Levine Sergey","year":"2012","unstructured":"Sergey Levine and Jovan Popovi\u0107. 2012. Physically Plausible Simulation for Character Animation. In Proceedings of the 11th ACM SIGGRAPH\/Eurographics conference on Computer Animation. Eurographics Association, 221--230."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185524"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3083723"},{"key":"e_1_2_2_34_1","volume-title":"Computer Graphics Forum","author":"Liu Libin","unstructured":"Libin Liu, KangKang Yin, and Baining Guo. 2015. Improving Sampling-based Motion Control. In Computer Graphics Forum, Vol. 34. Wiley Online Library, 415--423."},{"key":"e_1_2_2_35_1","volume-title":"ACM Transactions on Graphics (TOG)","author":"Liu Libin","unstructured":"Libin Liu, KangKang Yin, Michiel van de Panne, Tianjia Shao, and Weiwei Xu. 2010. 
Sampling-based contact-rich motion control. In ACM Transactions on Graphics (TOG), Vol. 29. ACM, 128."},{"key":"e_1_2_2_36_1","volume-title":"Hierarchical Visuomotor Control of Humanoids. In International Conference on Learning Representations.","author":"Merel Josh","year":"2019","unstructured":"Josh Merel, Arun Ahuja, Vu Pham, Saran Tunyasuvunakool, Siqi Liu, Dhruva Tirumala, Nicolas Heess, and Greg Wayne. 2019. Hierarchical Visuomotor Control of Humanoids. In International Conference on Learning Representations."},{"key":"e_1_2_2_37_1","volume-title":"Yee Whye Teh, and Nicolas Heess","author":"Merel Josh","year":"2018","unstructured":"Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, and Nicolas Heess. 2018. Neural probabilistic motor primitives for humanoid control. arXiv preprint arXiv:1811.11711 (2018)."},{"key":"e_1_2_2_38_1","volume-title":"On the Kinematic Motion Primitives (kMPs) - Theory and Application. Frontiers in neurorobotics 6","author":"Moro Federico Lorenzo","year":"2012","unstructured":"Federico Lorenzo Moro, Nikos G Tsagarakis, and Darwin G Caldwell. 2012. On the Kinematic Motion Primitives (kMPs) - Theory and Application. Frontiers in neurorobotics 6 (2012), 10."},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356501"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201311"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766910"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925881"},{"key":"e_1_2_2_43_1","volume-title":"MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies. In NeurIPS.","author":"Peng Xue Bin","year":"2019","unstructured":"Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, and Sergey Levine. 2019. MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies. 
In NeurIPS."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276510"},{"key":"e_1_2_2_45_1","volume-title":"International conference on machine learning. 1889--1897","author":"Schulman John","year":"2015","unstructured":"John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015a. Trust region policy optimization. In International conference on machine learning. 1889--1897."},{"key":"e_1_2_2_46_1","volume-title":"High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438","author":"Schulman John","year":"2015","unstructured":"John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, and Pieter Abbeel. 2015b. High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 (2015)."},{"key":"e_1_2_2_47_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1002\/cav.125"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356505"},{"key":"e_1_2_2_50_1","unstructured":"Richard S Sutton Andrew G Barto et al. 1998. Introduction to reinforcement learning. Vol. 2. MIT press Cambridge."},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386025"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907001"},{"key":"e_1_2_2_53_1","volume-title":"Mujoco: A Physics Engine for Model-Based Control. In 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems. IEEE, 5026--5033","author":"Todorov Emanuel","year":"2012","unstructured":"Emanuel Todorov, Tom Erez, and Yuval Tassa. 2012. 
Mujoco: A Physics Engine for Model-Based Control. In 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems. IEEE, 5026--5033."},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/1576246.1531366"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601192"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356499"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130833"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275023"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778811"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/1275808.1276509"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201366"},{"key":"e_1_2_2_62_1","unstructured":"Yi Zhou Zimo Li Shuangjiu Xiao Chong He Zeng Huang and Hao Li. 2018. Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis. 
(2018)."},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386569.3392433","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3386569.3392433","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T05:41:46Z","timestamp":1750830106000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386569.3392433"}},"subtitle":["controllable agent with reinforcement learning for quadruped locomotion"],"short-title":[],"issued":{"date-parts":[[2020,8,12]]},"references-count":63,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,8,31]]}},"alternative-id":["10.1145\/3386569.3392433"],"URL":"https:\/\/doi.org\/10.1145\/3386569.3392433","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,8,12]]},"assertion":[{"value":"2020-08-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}