{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T10:05:59Z","timestamp":1777716359330,"version":"3.51.4"},"reference-count":75,"publisher":"SAGE Publications","issue":"7","license":[{"start":{"date-parts":[[2013,6,1]],"date-time":"2013-06-01T00:00:00Z","timestamp":1370044800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2013,6]]},"abstract":"<jats:p>We present new global and local policy search algorithms suitable for problems with policy-dependent cost variance (or risk), a property present in many robot control tasks. These algorithms exploit new techniques in non-parametric heteroscedastic regression to directly model the policy-dependent distribution of cost. For local search, the learned cost model can be used as a critic for performing risk-sensitive gradient descent. Alternatively, decision-theoretic criteria can be applied to globally select policies to balance exploration and exploitation in a principled way, or to perform greedy minimization with respect to various risk-sensitive criteria. This separation of learning and policy selection permits variable risk control, where risk-sensitivity can be flexibly adjusted and appropriate policies can be selected at runtime without relearning. We describe experiments in dynamic stabilization and manipulation with a mobile manipulator that demonstrate learning of flexible, risk-sensitive policies in very few trials.<\/jats:p>","DOI":"10.1177\/0278364913476124","type":"journal-article","created":{"date-parts":[[2013,7,1]],"date-time":"2013-07-01T12:14:24Z","timestamp":1372680864000},"page":"806-825","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":25,"title":["Variable risk control via stochastic optimization"],"prefix":"10.1177","volume":"32","author":[{"given":"Scott R","family":"Kuindersma","sequence":"first","affiliation":[{"name":"Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA"},{"name":"Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA"}]},{"given":"Roderic A","family":"Grupen","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA"}]},{"given":"Andrew G","family":"Barto","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA"}]}],"member":"179","published-online":{"date-parts":[[2013,7,1]]},"reference":[{"key":"bibr1-0278364913476124","volume-title":"Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables","author":"Abramowitz M","year":"1972"},{"key":"bibr2-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1162\/089976698300017746"},{"key":"bibr3-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1079\/PNS2002181"},{"key":"bibr4-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1137\/S1052623497331063"},{"key":"bibr5-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1287\/moor.27.2.294.324"},{"key":"bibr6-0278364913476124","doi-asserted-by":"publisher","DOI":"10.3389\/fnhum.2011.00001"},{"key":"bibr7-0278364913476124","volume-title":"A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning","author":"Brochu E","year":"2009"},{"key":"bibr8-0278364913476124","first-page":"2879","volume":"12","author":"Bull AD","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"bibr9-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/ICSMC.1992.271617"},{"key":"bibr10-0278364913476124","volume-title":"Introduction to Robotics: Mechanics and Control","author":"Craig JJ","year":"2005","edition":"3"},{"key":"bibr11-0278364913476124","first-page":"761","volume-title":"Proceedings of the Fifteenth National Conference on Artificial Intelligence","author":"Dearden R","year":"1998"},{"key":"bibr12-0278364913476124","volume-title":"Whole-Body Strategies for Mobility and Manipulation","author":"Deegan P","year":"2010"},{"key":"bibr13-0278364913476124","volume-title":"Efficient Reinforcement Learning using Gaussian Processes","author":"Deisenroth MP","year":"2010"},{"key":"bibr14-0278364913476124","volume-title":"Proceedings of the 28th International Conference on Machine Learning","author":"Deisenroth MP","year":"2011"},{"key":"bibr15-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89378-3_25"},{"key":"bibr16-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1145\/1791212.1791238"},{"key":"bibr17-0278364913476124","first-page":"493","volume-title":"Advances in Neural Information Processing Systems 10 (NIPS)","author":"Goldberg PW","year":"1998"},{"key":"bibr18-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-335-6.50021-0"},{"key":"bibr19-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.18.7.356"},{"key":"bibr20-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/72.105429"},{"key":"bibr21-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.1973.1100265"},{"key":"bibr22-0278364913476124","unstructured":"Johnson SG (2011) The NLopt nonlinear-optimization package. http:\/\/ab-initio.mit.edu\/nlopt."},{"key":"bibr23-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1023\/A:1012771025575"},{"key":"bibr24-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1093\/icb\/36.4.402"},{"key":"bibr25-0278364913476124","volume-title":"Advances in Neural Information Processing Systems 14 (NIPS)","author":"Kakade S","year":"2002"},{"key":"bibr26-0278364913476124","first-page":"393","volume-title":"Proceedings of the International Conference on Machine Learning (ICML)","author":"Kersting K","year":"2010"},{"key":"bibr27-0278364913476124","volume-title":"Advances in Neural Information Processing Systems 21","author":"Kober J","year":"2009"},{"key":"bibr28-0278364913476124","first-page":"611","volume-title":"Proceedings of the Nineteenth National Conference on Artificial Intelligence","author":"Kohl N","year":"2004"},{"key":"bibr29-0278364913476124","doi-asserted-by":"crossref","unstructured":"Kolter JZ, Ng AY (2010) Policy search via the signed derivative. In: Robotics: Science and Systems V (RSS).","DOI":"10.7551\/mitpress\/8727.003.0028"},{"key":"bibr30-0278364913476124","volume-title":"Proceedings of the 10th European Workshop on Reinforcement Learning","author":"Kormushev P","year":"2012"},{"key":"bibr31-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids.2011.6100881"},{"key":"bibr32-0278364913476124","volume-title":"RSS 2012 Workshop on Mobile Manipulation","author":"Kuindersma S","year":"2012"},{"key":"bibr33-0278364913476124","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2012.VIII.026"},{"key":"bibr34-0278364913476124","volume-title":"Proceedings of the 14th International Conference on Advanced Robotics","author":"Kuindersma SR","year":"2009"},{"key":"bibr35-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1115\/1.3653121"},{"key":"bibr36-0278364913476124","volume-title":"Proceedings of the International Conference on Machine Learning (ICML)","author":"L\u00e1zaro-Gredilla M","year":"2011"},{"issue":"3","key":"bibr37-0278364913476124","first-page":"308","volume":"69","author":"Levy H","year":"1979","journal-title":"The American Economic Review"},{"key":"bibr38-0278364913476124","volume-title":"Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI)","author":"Lizotte D","year":"2007"},{"key":"bibr39-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1007\/s10898-011-9732-z"},{"key":"bibr40-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1007\/s10514-009-9130-2"},{"key":"bibr41-0278364913476124","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2007.III.041"},{"key":"bibr42-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1023\/A:1017940631555"},{"key":"bibr43-0278364913476124","volume-title":"Proceedings of the 27th International Conference on Machine Learning (ICML)","author":"Morimura T","year":"2010"},{"key":"bibr44-0278364913476124","volume-title":"Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI 2010)","author":"Morimura T","year":"2010"},{"key":"bibr45-0278364913476124","first-page":"117","volume-title":"Toward Global Optimization","volume":"2","author":"Mo\u010dckus J","year":"1978"},{"key":"bibr46-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1000857"},{"key":"bibr47-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1098\/rspb.2010.2518"},{"key":"bibr48-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1523\/JNEUROSCI.5498-10.2012"},{"key":"bibr49-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1177\/1059-712302-010001-01"},{"key":"bibr50-0278364913476124","volume-title":"Third International Conference on Learning and Intelligent Optimization (LION3)","author":"Osborne MA","year":"2009"},{"key":"bibr51-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2006.282564"},{"key":"bibr52-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1523\/JNEUROSCI.4286-07.2008"},{"key":"bibr53-0278364913476124","volume-title":"Gaussian Processes for Machine Learning","author":"Rasmussen CE","year":"2006"},{"key":"bibr54-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-05181-4_13"},{"key":"bibr55-0278364913476124","volume-title":"Advances of Neural Information Processing Systems 21 (NIPS)","author":"Roberts JW","year":"2009"},{"key":"bibr56-0278364913476124","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)","author":"Rosenstein MT","year":"2001"},{"key":"bibr57-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1214\/lnms\/1215456182"},{"key":"bibr58-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1086\/294846"},{"key":"bibr59-0278364913476124","volume-title":"Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence","author":"Snelson E","year":"2006"},{"key":"bibr60-0278364913476124","volume-title":"Proceedings of the 27th International Conference on Machine Learning (ICML)","author":"Srinivas N","year":"2010"},{"key":"bibr61-0278364913476124","volume-title":"Proceedings of the 29th International Conference on Machine Learning (ICML)","author":"Stulp F","year":"2012"},{"key":"bibr62-0278364913476124","volume-title":"Proceedings of the 29th International Conference on Machine Learning (ICML)","author":"Tamar A","year":"2012"},{"key":"bibr63-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2004.1389841"},{"key":"bibr64-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6095076"},{"key":"bibr65-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2010.5509336"},{"key":"bibr66-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1987.10478466"},{"key":"bibr67-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1152\/jn.00745.2006"},{"key":"bibr68-0278364913476124","first-page":"615","volume-title":"Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI)","author":"van den Broek B","year":"2010"},{"key":"bibr69-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1016\/j.jspi.2010.04.018"},{"key":"bibr70-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1017\/S0001867800036508"},{"key":"bibr71-0278364913476124","volume-title":"Risk-Sensitive Optimal Control","author":"Whittle P","year":"1990"},{"key":"bibr72-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992696"},{"key":"bibr73-0278364913476124","volume-title":"Proceedings of the ICML 2011 Workshop: Planning and Acting with Uncertain Model","author":"Wilson A","year":"2011"},{"key":"bibr74-0278364913476124","volume-title":"Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI)","author":"Wilson A","year":"2011"},{"key":"bibr75-0278364913476124","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0900102106"}],"container-title":["The International Journal of Robotics Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364913476124","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0278364913476124","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364913476124","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T10:18:08Z","timestamp":1777457888000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364913476124"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,6]]},"references-count":75,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2013,6]]}},"alternative-id":["10.1177\/0278364913476124"],"URL":"https:\/\/doi.org\/10.1177\/0278364913476124","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"value":"0278-3649","type":"print"},{"value":"1741-3176","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,6]]}}}