{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T08:14:28Z","timestamp":1761293668118,"version":"3.38.0"},"reference-count":55,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2017,12,5]],"date-time":"2017-12-05T00:00:00Z","timestamp":1512432000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2018,1]]},"abstract":"<jats:p> We consider the scenario where a robot is demonstrated a manipulation skill once and should then use only a few trials on its own to learn to reproduce, optimize, and generalize that same skill. A manipulation skill is generally a high-dimensional policy. To achieve the desired sample efficiency, we need to exploit the inherent structure in this problem. With our approach, we propose to decompose the problem into analytically known objectives, such as motion smoothness, and black-box objectives, such as trial success or reward, depending on the interaction with the environment. The decomposition allows us to leverage and combine (i) constrained optimization methods to address analytic objectives, (ii) constrained Bayesian optimization to explore black-box objectives, and (iii) inverse optimal control methods to eventually extract a generalizable skill representation. The algorithm is evaluated on a synthetic benchmark experiment and compared with state-of-the-art learning methods. We also demonstrate the performance on real-robot experiments with a PR2. <\/jats:p>","DOI":"10.1177\/0278364917743795","type":"journal-article","created":{"date-parts":[[2017,12,5]],"date-time":"2017-12-05T09:52:17Z","timestamp":1512467537000},"page":"137-154","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":33,"title":["Learning manipulation skills from a single demonstration"],"prefix":"10.1177","volume":"37","author":[{"given":"Peter","family":"Englert","sequence":"first","affiliation":[{"name":"Machine Learning & Robotics Lab, University of Stuttgart, Germany"}]},{"given":"Marc","family":"Toussaint","sequence":"additional","affiliation":[{"name":"Machine Learning & Robotics Lab, University of Stuttgart, Germany"}]}],"member":"179","published-online":{"date-parts":[[2017,12,5]]},"reference":[{"key":"bibr1-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015430"},{"key":"bibr2-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910371999"},{"key":"bibr3-0278364917743795","first-page":"1","volume-title":"Advances in Neural Information Processing Systems","volume":"19","author":"Abbeel P","year":"2007"},{"volume-title":"34th international conference on machine learning","year":"2017","author":"Achiam J","key":"bibr4-0278364917743795"},{"key":"bibr5-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2008.10.024"},{"key":"bibr6-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2017.8246913"},{"key":"bibr7-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ECC.2015.7330913"},{"key":"bibr8-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/MCS.2006.1636313"},{"journal-title":"arXiv","year":"2010","author":"Brochu E","key":"bibr9-0278364917743795"},{"issue":"1","key":"bibr10-0278364917743795","first-page":"5","volume":"76","author":"Calandra R","year":"2015","journal-title":"Annals of Mathematics and Artificial Intelligence"},{"key":"bibr11-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989384"},{"key":"bibr12-0278364917743795","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/978-3-319-60916-4_4","volume-title":"Robotics Research","volume":"3","author":"Englert P","year":"2015"},{"key":"bibr13-0278364917743795","unstructured":"Englert P, Toussaint M (2016) Combined optimization and reinforcement learning for manipulations skills. In: Robotics: science and systems (ed. Hsu D, Amato N, Berman S, et al.), Ann Arbor, MI, 18\u201322 June 2016. Available at: http:\/\/www.roboticsproceedings.org\/rss12\/index.html"},{"key":"bibr14-0278364917743795","first-page":"49","volume-title":"International conference on machine learning","author":"Finn C","year":"2016"},{"key":"bibr15-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2016.7759592"},{"key":"bibr16-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ADPRL.2011.5967356"},{"issue":"2","key":"bibr17-0278364917743795","first-page":"937","volume":"32","author":"Gardner J","year":"2014","journal-title":"Proceedings of Machine Learning Research"},{"key":"bibr18-0278364917743795","first-page":"250","volume-title":"UAI\u201914 proceedings of the thirtieth conference on uncertainty in artificial intelligence","author":"Gelbart MA","year":"2014"},{"key":"bibr19-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199694587.003.0008"},{"key":"bibr20-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1162\/106365601750190398"},{"key":"bibr21-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2013.6630743"},{"key":"bibr22-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6095096"},{"key":"bibr23-0278364917743795","first-page":"849","volume-title":"NIPS\u201908 proceedings of the 21st international conference on neural information processing systems","author":"Kober J","year":"2008"},{"key":"bibr24-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913495721"},{"key":"bibr25-0278364917743795","first-page":"769","volume-title":"NIPS\u201907 Proceedings of the 20th international conference on neural information processing systems","author":"Kolter JZ","year":"2008"},{"key":"bibr26-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2015.7139549"},{"key":"bibr27-0278364917743795","first-page":"1401","volume-title":"AAAI\u201913 proceedings of the twenty-seventh AAAI conference on artificial intelligence","author":"Kupcsik AG","year":"2013"},{"issue":"1","key":"bibr28-0278364917743795","first-page":"97","volume":"86","author":"Kushner HJ","year":"1964","journal-title":"Journal of Fluids Engineering"},{"key":"bibr29-0278364917743795","first-page":"475","volume-title":"ICML\u201912 proceedings of the 29th international conference on machine learning","author":"Levine S","year":"2012"},{"issue":"3","key":"bibr30-0278364917743795","first-page":"1","volume":"28","author":"Levine S","year":"2013","journal-title":"Proceedings of Machine Learning Research"},{"issue":"1","key":"bibr31-0278364917743795","first-page":"1334","volume":"17","author":"Levine S","year":"2016","journal-title":"The Journal of Machine Learning Research"},{"key":"bibr32-0278364917743795","first-page":"19","volume-title":"NIPS\u201911 proceedings of the 24th international conference on neural information processing systems","author":"Levine S","year":"2011"},{"key":"bibr33-0278364917743795","first-page":"944","volume-title":"International joint conference on artificial intelligence","author":"Lizotte DJ","year":"2007"},{"key":"bibr34-0278364917743795","first-page":"117","volume-title":"Towards Global Optimization","volume":"2","author":"Mockus J","year":"1978"},{"key":"bibr35-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1177\/0278364912472380"},{"key":"bibr36-0278364917743795","first-page":"663","volume-title":"ICML\u201900 proceedings of the seventeenth international conference on machine learning","author":"Ng AY","year":"2000"},{"issue":"10","key":"bibr37-0278364917743795","first-page":"2035","volume":"9","author":"Nickisch H","year":"2008","journal-title":"Journal of Machine Learning Research"},{"volume-title":"AAAI\u201910 proceedings of the twenty-fourth AAAI conference on artificial intelligence","year":"2010","author":"Peters J","key":"bibr38-0278364917743795"},{"key":"bibr39-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2012.6225317"},{"volume-title":"Gaussian Processes for Machine Learning","year":"2006","author":"Rasmussen CE","key":"bibr40-0278364917743795"},{"issue":"138","key":"bibr41-0278364917743795","first-page":"97","volume":"6","author":"R\u00fcckert EA","year":"2013","journal-title":"Frontiers in Computational Neuroscience"},{"key":"bibr42-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1214\/lnms\/1215456182"},{"key":"bibr43-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-23461-8_9"},{"key":"bibr44-0278364917743795","doi-asserted-by":"publisher","DOI":"10.2478\/pjbr-2013-0003"},{"volume-title":"Journ\u00e9es Francophones Planification, D\u00e9cision, et Apprentissage pour la conduite de syst\u00e8mes","year":"2013","author":"Stulp F","key":"bibr45-0278364917743795"},{"key":"bibr46-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1613\/jair.3229"},{"key":"bibr47-0278364917743795","first-page":"997","volume-title":"ICML\u201915 proceedings of the 32nd international conference on machine learning","volume":"37","author":"Sui Y","year":"2015"},{"volume-title":"Reinforcement Learning: An Introduction","year":"1998","author":"Sutton RS","key":"bibr48-0278364917743795"},{"key":"bibr49-0278364917743795","first-page":"3137","volume":"11","author":"Theodorou E","year":"2010","journal-title":"Journal of Machine Learning Research"},{"key":"bibr50-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-51547-2_15"},{"key":"bibr51-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2014.6942539"},{"key":"bibr52-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1142\/S0219843615500280"},{"key":"bibr53-0278364917743795","volume-title":"Numerical Optimization","volume":"2","author":"Wright SJ","year":"1999"},{"key":"bibr54-0278364917743795","doi-asserted-by":"publisher","DOI":"10.1108\/17563781211255862"},{"key":"bibr55-0278364917743795","first-page":"1433","volume-title":"AAAI\u201908 proceedings of the 23rd AAAI national conference on artificial intelligence","volume":"3","author":"Ziebart BD","year":"2008"}],"container-title":["The International Journal of Robotics Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364917743795","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0278364917743795","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364917743795","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T23:50:14Z","timestamp":1740873014000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364917743795"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,12,5]]},"references-count":55,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,1]]}},"alternative-id":["10.1177\/0278364917743795"],"URL":"https:\/\/doi.org\/10.1177\/0278364917743795","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"type":"print","value":"0278-3649"},{"type":"electronic","value":"1741-3176"}],"subject":[],"published":{"date-parts":[[2017,12,5]]}}}