{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T04:19:47Z","timestamp":1777522787362,"version":"3.51.4"},"reference-count":40,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2016,9,20]],"date-time":"2016-09-20T00:00:00Z","timestamp":1474329600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Adaptive Behavior"],"published-print":{"date-parts":[[2016,12]]},"abstract":"<jats:p>In machine learning, learning a task is expensive (many training samples are needed) and it is therefore of general interest to be able to reuse knowledge across tasks. This is the case in aerial robotics applications, where an autonomous aerial robot cannot interact with the environment hazard free. Prototype generation is a well known technique commonly used in supervised learning to help reduce the number of samples needed to learn a task. However, little is known about how such techniques can be used in a reinforcement learning task. In this work we propose an algorithm that, in order to learn a new (target) task, first generates new samples\u2014prototypes\u2014based on samples acquired previously in a known (source) task. The proposed approach uses Gaussian processes to learn a continuous multidimensional transition function, rendering the method capable of reasoning directly in continuous (states and actions) domains. We base the prototype generation on a careful selection of a subset of samples from the source task (based on known filtering techniques) and transforming such samples using the (little) knowledge acquired in the target task. 
Our experimental evidence gathered in known reinforcement learning benchmark tasks, as well as a challenging quadcopter to helicopter transfer task, suggests that prototype generation is feasible and, furthermore, that the filtering technique used is not as important as a correct transformation model.<\/jats:p>","DOI":"10.1177\/1059712316664570","type":"journal-article","created":{"date-parts":[[2016,9,15]],"date-time":"2016-09-15T22:34:23Z","timestamp":1473978863000},"page":"464-478","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":8,"title":["Transfer learning by prototype generation in continuous spaces"],"prefix":"10.1177","volume":"24","author":[{"given":"Enrique","family":"Munoz de Cote","sequence":"first","affiliation":[{"name":"Computer Science Department, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Esteban O.","family":"Garcia","sequence":"additional","affiliation":[{"name":"Computer Science Department, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eduardo F.","family":"Morales","sequence":"additional","affiliation":[{"name":"Computer Science Department, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2016,9,20]]},"reference":[{"key":"bibr1-1059712316664570","first-page":"383","volume-title":"Proceedings of the 11th international conference on autonomous agents and multiagent systems","volume":"1","author":"Ammar H.","year":"2012"},{"key":"bibr2-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1997.606886"},{"key":"bibr3-1059712316664570","volume-title":"Neuro-dynamic programming","author":"Bertsekas 
D.","year":"1996"},{"key":"bibr4-1059712316664570","first-page":"2504","volume-title":"Proceedings of the twenty-ninth AAAI conference on artificial intelligence","author":"Bou Ammar H.","year":"2015"},{"key":"bibr5-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1109\/ACC.2008.4587201"},{"key":"bibr6-1059712316664570","first-page":"465","volume-title":"ICML","author":"Deisenroth M.","year":"2011"},{"key":"bibr7-1059712316664570","volume-title":"Proceedings of robotics: Science and systems","author":"Deisenroth M.","year":"2011"},{"key":"bibr8-1059712316664570","first-page":"19","volume-title":"16th european symposium on artificial neural networks","author":"Deisenroth M.","year":"2008"},{"key":"bibr9-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1613\/jair.639"},{"key":"bibr10-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1613\/jair.904"},{"key":"bibr11-1059712316664570","unstructured":"Engel Y., Mannor S., Meir R. (2003). Bayes meets bellman: The gaussian process approach to temporal difference learning. In Fawcett T., Mishra N. (Eds.),\n                      ICML\n                      (pp. 154\u2013161). 
AAAI Press, Washington D.C."},{"key":"bibr12-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102377"},{"key":"bibr13-1059712316664570","volume-title":"Proceedings of the ICML-06 workshop on structural knowledge transfer for machine learning","author":"Ferguson K.","year":"2006"},{"key":"bibr14-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41822-8_25"},{"key":"bibr15-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001414600076"},{"key":"bibr16-1059712316664570","volume-title":"Reinforcement learning: State of the art","author":"Hasselt H.","year":"2011"},{"key":"bibr17-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1994.6.6.1185"},{"key":"bibr18-1059712316664570","volume-title":"Knowledge transfer in reinforcement learning","author":"Lazaric A.","year":"2008"},{"key":"bibr19-1059712316664570","volume-title":"Advances in neural information processing systems","author":"Lazaric A.","year":"2007"},{"key":"bibr20-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390225"},{"key":"bibr21-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2010.07.027"},{"key":"bibr22-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-008-5061-y"},{"key":"bibr23-1059712316664570","first-page":"21","author":"Murray-Smith R.","year":"2002","journal-title":"In 15th IFAC world congress on automatic control"},{"key":"bibr24-1059712316664570","first-page":"278","volume-title":"Proceedings of the sixteenth international conference on machine learning","author":"Ng A.","year":"1999"},{"key":"bibr25-1059712316664570","volume-title":"Advances in neural information processing systems","volume":"16","author":"Ng A.","year":"2004"},{"key":"bibr26-1059712316664570","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316887","volume-title":"Markov decision processes\u2014discrete stochastic dynamic programming","author":"Puterman 
M.","year":"1994"},{"key":"bibr27-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89722-4_18"},{"key":"bibr28-1059712316664570","first-page":"751","volume":"16","author":"Rasmussen C.","year":"2004","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"2","key":"bibr29-1059712316664570","first-page":"69","volume":"14","author":"Rasmussen C.","year":"2006","journal-title":"International Journal of Neural Systems"},{"key":"bibr30-1059712316664570","unstructured":"Robotics C. (2013). V-rep pro edu, version 3.0.1 [Computer software manual]. Retrieved from http:\/\/www.coppeliarobotics.com\/"},{"key":"bibr31-1059712316664570","volume-title":"Inductive transfer","author":"Silver D.","year":"2005"},{"key":"bibr32-1059712316664570","first-page":"494","author":"Soni V.","year":"2006","journal-title":"AAAI"},{"key":"bibr33-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.1998.712192"},{"key":"bibr34-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87481-2_32"},{"key":"bibr35-1059712316664570","first-page":"1633","volume":"10","author":"Taylor M.","year":"2009","journal-title":"Journal of Machine Learning Research"},{"key":"bibr36-1059712316664570","first-page":"2125","volume":"8","author":"Taylor M.","year":"2007","journal-title":"Journal of Machine Learning Research"},{"key":"bibr37-1059712316664570","first-page":"412","author":"Torrey L.","year":"2005","journal-title":"Joint European Conference on Machine Learning and Knowledge Discovery in Databases"},{"key":"bibr38-1059712316664570","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCC.2010.2103939"},{"key":"bibr39-1059712316664570","unstructured":"van Hasselt H. (2011). Insights in reinforcement learning: formal analysis and empirical evaluation of temporal-difference learning algorithms (Unpublished doctoral dissertation). 
Universiteit Utrecht, The Netherlands."},{"key":"bibr40-1059712316664570","doi-asserted-by":"crossref","DOI":"10.3233\/978-1-58603-969-1-i","volume-title":"The logic of adaptive behavior: Knowledge representation and algorithms for adaptive sequential decision making under uncertainty in first-order and relational domains","author":"van Otterlo M","year":"2009"}],"container-title":["Adaptive Behavior"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1059712316664570","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1059712316664570","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1059712316664570","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T16:18:39Z","timestamp":1777393119000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1059712316664570"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,9,20]]},"references-count":40,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2016,12]]}},"alternative-id":["10.1177\/1059712316664570"],"URL":"https:\/\/doi.org\/10.1177\/1059712316664570","relation":{},"ISSN":["1059-7123","1741-2633"],"issn-type":[{"value":"1059-7123","type":"print"},{"value":"1741-2633","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,9,20]]}}}