{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,23]],"date-time":"2025-11-23T13:31:10Z","timestamp":1763904670813,"version":"3.37.3"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2016,9,22]],"date-time":"2016-09-22T00:00:00Z","timestamp":1474502400000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Knowl Inf Syst"],"published-print":{"date-parts":[[2017,6]]},"DOI":"10.1007\/s10115-016-0992-2","type":"journal-article","created":{"date-parts":[[2016,9,22]],"date-time":"2016-09-22T05:42:59Z","timestamp":1474522979000},"page":"911-940","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Incremental reinforcement learning for multi-objective robotic tasks"],"prefix":"10.1007","volume":"51","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5638-5240","authenticated-orcid":false,"given":"Javier","family":"Garc\u00eda","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Roberto","family":"Iglesias","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miguel A.","family":"Rodr\u00edguez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carlos V.","family":"Regueiro","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2016,9,22]]},"reference":[{"key":"992_CR1","doi-asserted-by":"publisher","unstructured":"Allen BF, Petros F (2009) Complex networks of simple neurons for bipedal locomotion. In: 2009 IEEE\/RSJ international conference on intelligent robots and systems, October 11\u201315, 2009. St. Louis, MO, USA, pp 4457\u20134462","DOI":"10.1109\/IROS.2009.5354077"},{"key":"992_CR2","unstructured":"Anderson C (2000) Approximating a policy can be easier than approximating a value function. Technical report, University of Colorado State"},{"key":"992_CR3","unstructured":"Barrett S, Taylor ME, Stone P (2010) Transfer learning for reinforcement learning on a physical robot. In: Ninth international conference on autonomous agents and multiagent systems\u2014adaptive learning agents workshop (ALA), May 2010"},{"issue":"1\u20132","key":"992_CR4","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1023\/A:1022140919877","volume":"13","author":"AG Barto","year":"2003","unstructured":"Barto AG, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dyn Syst 13(1\u20132):41\u201377","journal-title":"Discrete Event Dyn Syst"},{"key":"992_CR5","unstructured":"Boedecker J (2005) Humanoid robot simulation and walking behaviour development in the spark simulator framework. Technical report, Artificial Intelligence Research University of Koblenz"},{"key":"992_CR6","unstructured":"Castelletti A, Corani G, Rizzoli AE, Soncini Sessa R, Weber E (2002) Reinforcement learning in the operational management of a water system. In: IFAC workshop on modeling and control in environmental issues"},{"key":"992_CR7","unstructured":"Castro DD, Tamar A, Mannor S (2012) Policy gradients with variance related risk criteria. In: Proceedings of the 29th international conference on machine learning, ICML 2012, Edinburgh, Scotland, UK, June 26\u2013July 1, 2012"},{"issue":"1\u20132","key":"992_CR8","first-page":"1","volume":"2","author":"MP Deisenroth","year":"2013","unstructured":"Deisenroth MP, Neumann G, Peters J (2013) A survey on policy search for robotics, foundations and trends in robotics. Found Trends Robot 2(1\u20132):1\u2013142","journal-title":"Found Trends Robot"},{"key":"992_CR9","doi-asserted-by":"publisher","unstructured":"Domingues E, Lau N, Pimentel B, Shafii N, Reis LP, Neves AJR (2011) Humanoid behaviors: from simulation to a real robot. In: Progress in artificial intelligence, 15th Portuguese conference on artificial intelligence, EPIA 2011, Lisbon, Portugal, October 10\u201313, 2011. Proceedings, pp 352\u2013364","DOI":"10.1007\/978-3-642-24769-9_26"},{"issue":"7","key":"992_CR10","doi-asserted-by":"publisher","first-page":"866","DOI":"10.1016\/j.robot.2010.03.007","volume":"58","author":"F Fern\u00e1ndez","year":"2010","unstructured":"Fern\u00e1ndez F, Garc\u00eda J, Veloso MM (2010) Probabilistic policy reuse for inter-task transfer learning. Robot Auton Syst 58(7):866\u2013871","journal-title":"Robot Auton Syst"},{"key":"992_CR11","doi-asserted-by":"publisher","unstructured":"Ferreira L, Bianchi R, Ribeiro C (2012) Multi-agent multi-objective learning using heuristically accelerated reinforcement learning. In: Brazilian robotics symposium and Latin American robotics symposium","DOI":"10.1109\/SBR-LARS.2012.10"},{"key":"992_CR12","unstructured":"Gabor Z, Kalmar Z, Szepesvari C (1998) Multi-criteria reinforcement learning. In: International conference on machine learning (ICML-98), Madison, WI"},{"key":"992_CR13","doi-asserted-by":"publisher","unstructured":"Garg A, Roth D (2001) Understanding probabilistic classifiers. In: EMCL \u201901: proceedings of the 12th European conference on machine learning, London, UK, 2001. Springer, pp 179\u2013191","DOI":"10.1007\/3-540-44795-4_16"},{"key":"992_CR14","unstructured":"Geibel P (2006) In: Proceedings of the 17th European conference on machine learning Berlin, Germany, September 18\u201322, 2006 Proceedings, Berlin, Heidelberg, 2006. Springer, Berlin, Heidelberg, pp 646\u2013653"},{"key":"992_CR15","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1613\/jair.1666","volume":"24","author":"P Geibel","year":"2005","unstructured":"Geibel P, Wysotzki F (2005) Risk-sensitive reinforcement learning applied to control under constraints. JAIR 24:81\u2013108","journal-title":"JAIR"},{"key":"992_CR16","doi-asserted-by":"publisher","unstructured":"Ijspeert AJ, Nakanishi J, Schaal S (2002) Movement imitation with nonlinear dynamical systems in humanoid robots. In: IEEE international conference on robotics and automation (ICRA2002), pp 1398\u20131403","DOI":"10.1109\/ROBOT.2002.1014739"},{"key":"992_CR17","unstructured":"Kalyanakrishnan S, Stone P (2009) An empirical analysis of value function-based and policy search reinforcement learning. In: The eighth international conference on autonomous agents and multiagent systems (AAMAS), Richland, SC, May 2009. International Foundation for Autonomous Agents and Multiagent Systems, pp 749\u2013756"},{"key":"992_CR18","first-page":"579","volume":"12","author":"J Kober","year":"2013","unstructured":"Kober J, Bagnell JA, Peters J (2013) Reinforcement learning in robotics: a survey. Int J Robot Res 12:579\u2013610","journal-title":"Int J Robot Res"},{"issue":"1\u20132","key":"992_CR19","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1007\/s10994-010-5223-6","volume":"84","author":"J Kober","year":"2011","unstructured":"Kober J, Peters J (2011) Policy search for motor primitives in robotics. Mach Learn 84(1\u20132):171\u2013203","journal-title":"Mach Learn"},{"key":"992_CR20","doi-asserted-by":"publisher","unstructured":"Kuhlmann G, Stone P (2007) Graph-based domain mapping for transfer learning in general games. In: Proceedings of the 18th European conference on machine learning, September 2007","DOI":"10.1007\/978-3-540-74958-5_20"},{"key":"992_CR21","unstructured":"Lee H, Shen Y, Yu C-H, Singh G, Ng AY (2006) Quadruped robot obstacle negotiation via reinforcement learning. In: Proceedings of the 2006 IEEE international conference on robotics and automation, ICRA 2006, May 15\u201319, 2006, Orlando, Florida, USA, pp 3003\u20133010"},{"issue":"3","key":"992_CR22","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1109\/TSMC.2014.2358639","volume":"45","author":"C Liu","year":"2015","unstructured":"Liu C, Xu X, Hu D (2015) Multiobjective reinforcement learning: a comprehensive overview. IEEE Trans Syst Man Cybern Syst 45(3):385\u2013398","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"key":"992_CR23","doi-asserted-by":"publisher","unstructured":"Van Moffaert K, Brys T, Now\u00e9 A (2015) Risk-sensitivity through multi-objective reinforcement learning. In: Proceedings of the IEEE congress on evolutionary computation (IEEE CEC)","DOI":"10.1109\/CEC.2015.7257098"},{"key":"992_CR24","doi-asserted-by":"publisher","unstructured":"Van Moffaert K, Drugan MM, Now\u00e9 A (2013) Hypervolume-based multi-objective reinforcement learning. In: Evolutionary multi-criterion optimization\u20147th international conference, EMO 2013, Sheffield, UK, March 19\u201322, 2013. Proceedings, pp 352\u2013366","DOI":"10.1007\/978-3-642-37140-0_28"},{"issue":"1","key":"992_CR25","first-page":"3483","volume":"15","author":"K Moffaert Van","year":"2014","unstructured":"Van Moffaert K, Now\u00e9 A (2014) Multi-objective reinforcement learning using sets of pareto dominating policies. J Mach Learn Res 15(1):3483\u20133512","journal-title":"J Mach Learn Res"},{"key":"992_CR26","unstructured":"Morimoto J, Doya K (2000) Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning. In: Proceedings of the seventeenth international conference on machine learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29\u2013July 2, 2000, pp 623\u2013630"},{"key":"992_CR27","unstructured":"Parisi S, Pirotta M, Smacchia N, Bascetta L, Restelli M (2014) Policy gradient approaches for multi-objective sequential decision making. In: 2014 International joint conference on neural networks, IJCNN 2014, Beijing, China, July 6\u201311, 2014, pp 2323\u20132330"},{"key":"992_CR28","doi-asserted-by":"publisher","unstructured":"Perez J, Germain-Renaud C, K\u00e9gl B, Loomis C (2009) Responsive elastic computing. In: Proceedings of the 6th international conference industry session on grids meets autonomic computing, GMAC\u201909, New York, NY, USA, pp 55\u201364","DOI":"10.1145\/1555301.1555311"},{"key":"992_CR29","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1613\/jair.3987","volume":"48","author":"D Roijers","year":"2013","unstructured":"Roijers D, Vamplew P, Whiteson S, Dazeley R (2013) A survey of multi-objective sequential decision-making. J Artif Intell Res 48:67\u2013113","journal-title":"J Artif Intell Res"},{"key":"992_CR30","doi-asserted-by":"publisher","unstructured":"R\u00fcckstie\u00df T, Felder M, Schmidhuber J (2008) State-dependent exploration for policy gradient methods. In: European conference on machine learning and principles and practice of knowledge discovery in databases 2008, Part II, LNAI 5212, pp 234\u2013249","DOI":"10.1007\/978-3-540-87481-2_16"},{"key":"992_CR31","unstructured":"Shafii N, Reis LP, Lao N (2010) Biped walking using coronal and sagittal movements based on truncated fourier series. In: Sousa AA, Eug\u00e9nio O (eds) Proceedings of the fifth doctoral symposium in informatics engineering, (DSIE 2010), Porto, Portugal, January 2010. Faculdade de Engenharia, Universidade do Porto, pp 79\u201390"},{"key":"992_CR32","unstructured":"Shelton CR (2001) Importance sampling for reinforcement learning with multiple objectives. PhD thesis, Massachusetts Institute of Technology, August 2001"},{"key":"992_CR33","doi-asserted-by":"crossref","DOI":"10.1887\/0750308958\/b386c48","volume-title":"Penalty functions","author":"AE Smith","year":"1997","unstructured":"Smith AE, Coit DW, Baeck T, Fogel D, Michalewicz Z (1997) Penalty functions. Oxford University Press and Institute of Physics Publishing, New York"},{"key":"992_CR34","volume-title":"Introduction to reinforcement learning","author":"S Richard","year":"1998","unstructured":"Richard S, Sutton RS, Andrew G (1998) Introduction to reinforcement learning, 1st edn. MIT Press, Cambridge","edition":"1"},{"issue":"1","key":"992_CR35","first-page":"1633","volume":"10","author":"ME Taylor","year":"2009","unstructured":"Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J Mach Learn Res 10(1):1633\u20131685","journal-title":"J Mach Learn Res"},{"key":"992_CR36","unstructured":"Taylor ME, Whiteson S, Stone P (2007) Temporal difference and policy search methods for reinforcement learning: an empirical comparison. In: Proceedings of the twenty-second conference on artificial intelligence, pp. 1675\u20131678, July 2007. Nectar Track"},{"key":"992_CR37","volume-title":"Constrained reinforcement learning from intrinsic and extrinsic rewards","author":"E Uchibe","year":"2009","unstructured":"Uchibe E, Doya K (2009) Constrained reinforcement learning from intrinsic and extrinsic rewards. INTECH Open Access Publisher, New York"},{"key":"992_CR38","unstructured":"van Hasselt H (2012) Reinforcement learning in continuous state and action spaces, volume 12 of adaptation, learning, and optimization, Chapter 7. Springer, Berlin, Heidelberg, pp 207\u2013251"}],"container-title":["Knowledge and Information Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10115-016-0992-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-016-0992-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-016-0992-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T17:52:53Z","timestamp":1568397173000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10115-016-0992-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,9,22]]},"references-count":38,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,6]]}},"alternative-id":["992"],"URL":"https:\/\/doi.org\/10.1007\/s10115-016-0992-2","relation":{},"ISSN":["0219-1377","0219-3116"],"issn-type":[{"type":"print","value":"0219-1377"},{"type":"electronic","value":"0219-3116"}],"subject":[],"published":{"date-parts":[[2016,9,22]]}}}