{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T05:13:40Z","timestamp":1744953220966,"version":"3.38.0"},"reference-count":76,"publisher":"SAGE Publications","issue":"13-14","license":[{"start":{"date-parts":[[2023,12,5]],"date-time":"2023-12-05T00:00:00Z","timestamp":1701734400000},"content-version":"vor","delay-in-days":399,"URL":"http:\/\/www.sagepub.com\/licence-information-for-chorus"}],"funder":[{"DOI":"10.13039\/100000006","name":"Office of Naval Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000006","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["EECS-0926052, IIS-1017134, IIS-1205249"],"award-info":[{"award-number":["EECS-0926052, IIS-1017134, IIS-1205249"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004189","name":"Max-Planck-Gesellschaft","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004189","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004399","name":"Okawa Foundation for Information and Telecommunications","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004399","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2022,11]]},"abstract":"<jats:p> Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper, we introduce a full framework for learning feedback models for reactive motion planning. 
Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning-from-demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task. <\/jats:p>","DOI":"10.1177\/02783649221143399","type":"journal-article","created":{"date-parts":[[2022,12,5]],"date-time":"2022-12-05T13:02:45Z","timestamp":1670245365000},"page":"1121-1145","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["Supervised learning and reinforcement learning of feedback models for reactive behaviors: Tactile feedback testbed"],"prefix":"10.1177","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4443-5105","authenticated-orcid":false,"given":"Giovanni","family":"Sutanto","sequence":"first","affiliation":[{"name":"Autonomous Motion Department, Max Planck Institute for Intelligent Systems (AMD at MPI-IS), T\u00fcbingen, Germany"},{"name":"Department of Computer Science, University of Southern California (CS at USC), Los Angeles, CA, USA"},{"name":"Intrinsic Innovation LLC., Mountain View, CA, USA"}]},{"given":"Katharina","family":"Rombach","sequence":"additional","affiliation":[{"name":"Chair of Intelligent Maintenance Systems, ETH Zurich, Switzerland"}]},{"given":"Yevgen","family":"Chebotar","sequence":"additional","affiliation":[{"name":"Google Brain, Mountain View, CA, USA"}]},{"given":"Zhe","family":"Su","sequence":"additional","affiliation":[{"name":"Dexterity Inc., Redwood City, CA, USA"}]},{"given":"Stefan","family":"Schaal","sequence":"additional","affiliation":[{"name":"Intrinsic Innovation LLC., Mountain 
View, CA, USA"}]},{"given":"Gaurav S.","family":"Sukhatme","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Southern California (CS at USC), Los Angeles, CA, USA"}]},{"given":"Franziska","family":"Meier","sequence":"additional","affiliation":[{"name":"Meta Artificial Intelligence, Menlo Park, CA, USA"}]}],"member":"179","published-online":{"date-parts":[[2022,12,5]]},"reference":[{"key":"bibr1-02783649221143399","unstructured":"Abadi M, Agarwal A, Barham P, et al. (2015) Tensorflow: large-scale machine learning on heterogeneous distributed systems."},{"key":"bibr2-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-015-9435-2"},{"key":"bibr3-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2010.930"},{"key":"bibr4-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1991.3.4.579"},{"key":"bibr5-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/MCS.2006.1636313"},{"key":"bibr6-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907818"},{"key":"bibr7-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1177\/0278364919871998"},{"key":"bibr8-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793789"},{"key":"bibr9-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989384"},{"key":"bibr10-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2014.6943031"},{"key":"bibr11-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1016\/0262-8856(92)90066-C"},{"key":"bibr12-02783649221143399","unstructured":"Cheng CA, Mukadam M, Issac J, et al. (2018) RMPflow: A computational graph for automatic motion policy generation. In: The 13th International Workshop on the Algorithmic Foundations of Robotics. 
http:\/\/arxiv.org\/abs\/1811.07049"},{"key":"bibr13-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390175"},{"key":"bibr14-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2018.8593661"},{"key":"bibr15-02783649221143399","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2016.XII.036"},{"key":"bibr16-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2015.7353412"},{"key":"bibr17-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1023\/A:1013254724861"},{"key":"bibr18-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2015.7363559"},{"key":"bibr19-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-009-9118-y"},{"key":"bibr20-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2014.2304775"},{"key":"bibr21-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2014.7041354"},{"key":"bibr22-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2016.7803277"},{"key":"bibr23-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2014.02.001"},{"key":"bibr24-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2009.5152423"},{"key":"bibr25-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA40945.2020.9196976"},{"key":"bibr26-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1162\/NECO_a_00393"},{"key":"bibr27-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794127"},{"key":"bibr28-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6095096"},{"key":"bibr29-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2795645"},{"key":"bibr30-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1177\/0278364916648389"},{"key":"bibr31-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-012-9287-y"},{"key":"bibr32-02783649221143399","doi-asserted-by"
:"publisher","DOI":"10.1109\/IROS.2008.4650953"},{"key":"bibr33-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-010-5223-6"},{"key":"bibr34-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2016.7803366"},{"issue":"39","key":"bibr35-02783649221143399","first-page":"1","volume":"17","author":"Levine S","year":"2016","journal-title":"Journal of Machine Learning Research"},{"volume-title":"Proceedings of the 30th International Conference on International Conference on Machine Learning - Volume 28, ICML\u201913","year":"2013","author":"Levine S","key":"bibr36-02783649221143399"},{"key":"bibr37-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2015.7363457"},{"key":"bibr38-02783649221143399","unstructured":"Lillicrap TP, Hunt JJ, Pritzel A, et al. (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971."},{"key":"bibr39-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2015.7363584"},{"key":"bibr40-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6094676"},{"key":"bibr41-02783649221143399","unstructured":"Mnih V, Kavukcuoglu K, Silver D, et al. (2013) Playing atari with deep reinforcement learning. CoRR abs\/1312.5602."},{"key":"bibr42-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS40897.2019.8967695"},{"key":"bibr43-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1177\/0278364918790369"},{"key":"bibr44-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1017\/S0263574711001056"},{"key":"bibr45-02783649221143399","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2013.IX.048"},{"key":"bibr46-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1177\/0278364914554471"},{"key":"bibr47-02783649221143399","unstructured":"Pairet \u00c8, Ard\u00f3n P, Mistry M, et al. 
(2019) Learning generalisable coupling terms for obstacle avoidance via low-dimensional geometric descriptors. CoRR abs\/1906.09941."},{"key":"bibr48-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-017-9648-7"},{"key":"bibr49-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1609\/icaps.v22i1.13513"},{"key":"bibr50-02783649221143399","unstructured":"Park DH, Hoffmann H, Pastor P, et al. (2008) Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields. In: IEEE International Conference on Humanoid Robots. pp. 91\u201398."},{"key":"bibr51-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2009.5152385"},{"key":"bibr52-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2012.09.017"},{"key":"bibr53-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6095059"},{"key":"bibr54-02783649221143399","doi-asserted-by":"crossref","unstructured":"Quei\u00dfer JF, Hammer B, Ishihara H, et al. (2018) Skill memories for parameterized dynamic action primitives on the pneumatically driven humanoid robot child affetto. In: 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob). pp. 39\u201345.","DOI":"10.1109\/DEVLRN.2018.8761040"},{"key":"bibr55-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/HUMANOIDS.2014.7041410"},{"key":"bibr56-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989252"},{"key":"bibr57-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2015.7139778"},{"key":"bibr58-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2009.5152817"},{"key":"bibr59-02783649221143399","unstructured":"Ratliff ND, Issac J, Kappler D, et al. (2018) Riemannian motion policies. CoRR abs\/1801.02854. 
http:\/\/arxiv.org\/abs\/1801.02854"},{"key":"bibr60-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1978.1163055"},{"issue":"1","key":"bibr61-02783649221143399","first-page":"1929","volume":"15","author":"Srivastava N","year":"2014","journal-title":"J. Mach. Learn. Res"},{"key":"bibr62-02783649221143399","first-page":"1547","volume-title":"Proceedings of the 29th International Coference on International Conference on Machine Learning, ICML\u201912","author":"Stulp F","year":"2012"},{"key":"bibr63-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2012.2210294"},{"key":"bibr64-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793502"},{"key":"bibr65-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989326"},{"key":"bibr66-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793520"},{"key":"bibr67-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460986"},{"key":"bibr68-02783649221143399","doi-asserted-by":"crossref","unstructured":"Tamar A, Levine S, Abbeel P (2016) Value iteration networks. CoRR abs\/1602.02867.","DOI":"10.24963\/ijcai.2017\/700"},{"key":"bibr69-02783649221143399","first-page":"3137","volume":"11","author":"Theodorou E","year":"2010","journal-title":"J. Mach. Learn. 
Res"},{"key":"bibr70-02783649221143399","first-page":"828","volume-title":"Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research","volume":"9","author":"Theodorou E","year":"2010"},{"key":"bibr71-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794219"},{"volume-title":"COURSERA: Neural Networks for Machine Learning","year":"2012","author":"Tieleman T","key":"bibr72-02783649221143399"},{"key":"bibr73-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907291"},{"key":"bibr74-02783649221143399","doi-asserted-by":"publisher","DOI":"10.3390\/s20061748"},{"key":"bibr75-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1163\/156855308X314533"},{"key":"bibr76-02783649221143399","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6247812"}],"container-title":["The International Journal of Robotics Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/02783649221143399","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/02783649221143399","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/02783649221143399","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/02783649221143399","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T07:03:43Z","timestamp":1740899023000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/02783649221143399"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11]]},"references-co
unt":76,"journal-issue":{"issue":"13-14","published-print":{"date-parts":[[2022,11]]}},"alternative-id":["10.1177\/02783649221143399"],"URL":"https:\/\/doi.org\/10.1177\/02783649221143399","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"type":"print","value":"0278-3649"},{"type":"electronic","value":"1741-3176"}],"subject":[],"published":{"date-parts":[[2022,11]]}}}