{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T08:18:25Z","timestamp":1774685905461,"version":"3.50.1"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T00:00:00Z","timestamp":1543881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003621","name":"Ministry of Science ICT and Future Planning","doi-asserted-by":"publisher","award":["IITP-2017-0536-20170040"],"award-info":[{"award-number":["IITP-2017-0536-20170040"]}],"id":[{"id":"10.13039\/501100003621","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>\n            Flying creatures in animated films often perform highly dynamic aerobatic maneuvers, which demand extreme exercise capacity and skillful control. Designing physics-based controllers (a.k.a. control policies) for aerobatic maneuvers is very challenging because dynamic states remain in unstable equilibrium most of the time during aerobatics. Recently, Deep Reinforcement Learning (DRL) has shown its potential in constructing physics-based controllers. In this paper, we present a new concept,\n            <jats:italic>Self-Regulated Learning (SRL)<\/jats:italic>\n            , which is combined with DRL to address the aerobatics control problem. The key idea of SRL is to allow the agent to take control over its own learning using an additional self-regulation policy. The policy allows the agent to regulate its goals according to the capability of the current control policy. 
The control and self-regulation policies are learned jointly as learning progresses. Self-regulated learning can be viewed as building its own curriculum and seeking compromise on the goals. The effectiveness of our method is demonstrated with physically-simulated creatures performing aerobatic skills of sharp turning, rapid winding, rolling, soaring, and diving.\n          <\/jats:p>","DOI":"10.1145\/3272127.3275023","type":"journal-article","created":{"date-parts":[[2018,11,28]],"date-time":"2018-11-28T19:16:10Z","timestamp":1543432570000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":27,"title":["Aerobatics control of flying creatures via self-regulated learning"],"prefix":"10.1145","volume":"37","author":[{"given":"Jungdam","family":"Won","sequence":"first","affiliation":[{"name":"Seoul National University, South Korea"}]},{"given":"Jungnam","family":"Park","sequence":"additional","affiliation":[{"name":"Seoul National University, South Korea"}]},{"given":"Jehee","family":"Lee","sequence":"additional","affiliation":[{"name":"Seoul National University, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2018,12,4]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"Social Psychology Second Edition: Handbook of Basic Principles."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910371999"},{"key":"e_1_2_2_3_1","volume-title":"Proceedings of the 19th International Conference on Neural Information Processing Systems (NIPS","author":"Abbeel Pieter","year":"2006","unstructured":"Pieter Abbeel, Adam Coates, Morgan Quigley, and Andrew Y. Ng. 2006. An Application of Reinforcement Learning to Aerobatic Helicopter Flight. In Proceedings of the 19th International Conference on Neural Information Processing Systems (NIPS 2006). 1--8. 
"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015430"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2012.325"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1531326.1531359"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2012.325"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1781156"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964954"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185565"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1360612.1360681"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8659.2008.01134.x"},{"key":"e_1_2_2_14_1","volume-title":"Dart: Dynamic Animation and Robotics Toolkit. https:\/\/dartsim.github.io\/.","year":"2012","unstructured":"Dart. 2012. Dart: Dynamic Animation and Robotics Toolkit. https:\/\/dartsim.github.io\/. (2012)."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1781157"},{"key":"e_1_2_2_16_1","volume-title":"Learning Robust Rewards with Adversarial Inverse Reinforcement Learning. CoRR abs\/1710.11248","author":"Fu Justin","year":"2017","unstructured":"Justin Fu, Katie Luo, and Sergey Levine. 2017. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning. CoRR abs\/1710.11248 (2017)."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305381.3305517"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/280814.280816"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2682626"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366174"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601218"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2767002"},{"key":"e_1_2_2_23_1","volume-title":"Shin","author":"Han Daseong","year":"2016","unstructured":"Daseong Han, Haegwang Eom, Junyong Noh, and Joseph S. Shin. 2016. Data-guided Model Predictive Control Based on Smoothed Contact Dynamics. Computer Graphics Forum 35, 2 (2016)."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12323"},{"key":"e_1_2_2_25_1","volume-title":"Proceedings of IEEE International Conference on Evolutionary Computation. 312--317","author":"Hansen N.","unstructured":"N. Hansen and A. Ostermeier. 1996. Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation. In Proceedings of IEEE International Conference on Evolutionary Computation. 312--317."},{"key":"e_1_2_2_26_1","volume-title":"Automatic Goal Generation for Reinforcement Learning Agents. 
CoRR abs\/1705.06366","author":"Held David","year":"2017","unstructured":"David Held, Xinyang Geng, Carlos Florensa, and Pieter Abbeel. 2017. Automatic Goal Generation for Reinforcement Learning Agents. CoRR abs\/1705.06366 (2017)."},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2516971.2516976"},{"key":"e_1_2_2_28_1","volume-title":"Ng","author":"Kim H. J.","year":"2004","unstructured":"H. J. Kim, Michael I. Jordan, Shankar Sastry, and Andrew Y. Ng. 2004. Autonomous Helicopter Flight via Reinforcement Learning. In Advances in Neural Information Processing Systems 16 (NIPS 2003). 799--806."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1921427.1921447"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983616"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1781155"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661233"},{"key":"e_1_2_2_33_1","volume-title":"Proceedings of the 31st International Conference on Machine Learning (ICML","author":"Levine Sergey","year":"2014","unstructured":"Sergey Levine and Vladlen Koltun. 2014. Learning Complex Neural Network Policies with Trajectory Optimization. In Proceedings of the 31st International Conference on Machine Learning (ICML 2014)."},{"key":"e_1_2_2_34_1","volume-title":"Continuous control with deep reinforcement learning. 
CoRR abs\/1509.02971","author":"Lillicrap Timothy P.","year":"2015","unstructured":"Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. CoRR abs\/1509.02971 (2015)."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3083723"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2893476"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366173"},{"key":"e_1_2_2_38_1","volume-title":"Teacher-Student Curriculum Learning. CoRR abs\/1707.00183","author":"Matiisen Tambet","year":"2017","unstructured":"Tambet Matiisen, Avital Oliver, Taco Cohen, and John Schulman. 2017. Teacher-Student Curriculum Learning. CoRR abs\/1707.00183 (2017)."},{"key":"e_1_2_2_39_1","volume-title":"Robotics: Science and Systems (RSS","author":"Mordatch Igor","year":"2014","unstructured":"Igor Mordatch and Emanuel Todorov. 2014. Combining the benefits of function approximation and trajectory optimization. In Robotics: Science and Systems (RSS 2014)."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185539"},{"key":"e_1_2_2_41_1","volume":"199","author":"Ng Andrew Y.","unstructured":"Andrew Y. Ng, Daishi Harada, and Stuart J. Russell. 1999. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. 
In Proceedings of the Sixteenth International Conference on Machine Learning (ICML '99). 278--287."},{"key":"e_1_2_2_42_1","unstructured":"Jeanne Ellis Ormrod. 2009. Essentials of Educational Psychology. Pearson Education."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201311"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925881"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073602"},{"key":"e_1_2_2_46_1","volume-title":"High-Dimensional Continuous Control Using Generalized Advantage Estimation. CoRR abs\/1506.02438","author":"Schulman John","year":"2015","unstructured":"John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. 2015. High-Dimensional Continuous Control Using Generalized Advantage Estimation. CoRR abs\/1506.02438 (2015)."},{"key":"e_1_2_2_47_1","volume-title":"Proximal Policy Optimization Algorithms. CoRR abs\/1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal Policy Optimization Algorithms. CoRR abs\/1707.06347 (2017)."},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276511"},{"key":"e_1_2_2_49_1","volume-title":"Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play. CoRR abs\/1703.05407","author":"Sukhbaatar Sainbayar","year":"2017","unstructured":"Sainbayar Sukhbaatar, Ilya Kostrikov, Arthur Szlam, and Rob Fergus. 2017. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play. CoRR abs\/1703.05407 (2017)."},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964953"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185522"},{"key":"e_1_2_2_52_1","unstructured":"TensorFlow. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). http:\/\/tensorflow.org\/ Software available from tensorflow.org."},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2009.76"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/192161.192170"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ADPRL.2007.368199"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778810"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185521"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3130800.3130833"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/882262.882360"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778811"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276509"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201397"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201366"}],"container-title":["ACM Transactions on 
Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275023","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3272127.3275023","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:44:04Z","timestamp":1750207444000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275023"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,4]]},"references-count":63,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3272127.3275023"],"URL":"https:\/\/doi.org\/10.1145\/3272127.3275023","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,12,4]]},"assertion":[{"value":"2018-12-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}