{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:25:16Z","timestamp":1750220716595,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":26,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,15]],"date-time":"2020-10-15T00:00:00Z","timestamp":1602720000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,15]]},"DOI":"10.1145\/3383455.3422519","type":"proceedings-article","created":{"date-parts":[[2021,10,7]],"date-time":"2021-10-07T14:51:48Z","timestamp":1633618308000},"page":"1-9","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Risk-sensitive reinforcement learning"],"prefix":"10.1145","author":[{"given":"Nelson","family":"Vadori","sequence":"first","affiliation":[{"name":"J.P. Morgan AI Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sumitra","family":"Ganesh","sequence":"additional","affiliation":[{"name":"J.P. Morgan AI Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Prashant","family":"Reddy","sequence":"additional","affiliation":[{"name":"J.P. Morgan AI Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Manuela","family":"Veloso","sequence":"additional","affiliation":[{"name":"J.P. Morgan AI Research"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,7]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Marc G Bellemare Will Dabney and R\u00e9mi Munos. 2017. 
A Distributional Perspective on Reinforcement Learning. In ICML."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2009.07.008"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.27.2.294.324"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysconle.2004.08.007"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the Nineteenth International Symposium on Mathematical Theory of Networks and Systems","author":"Borkar Vivek S","year":"2010","unstructured":"Vivek S Borkar. 2010. Learning Algorithms for Risk-Sensitive Control. Proceedings of the Nineteenth International Symposium on Mathematical Theory of Networks and Systems (2010), 1327--1332."},{"key":"e_1_3_2_1_6_1","first-page":"1","article-title":"Risk-Constrained Reinforcement Learning with Percentile Risk Criteria","volume":"18","author":"Chow Yinlam","year":"2018","unstructured":"Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, and Marco Pavone. 2018. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria. J. Mach. Learn. Res. 18, 167 (2018), 1--51.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022632907294"},{"key":"e_1_3_2_1_8_1","volume-title":"Risk-Aware Decision Making and Dynamic Programming. In NIPS workshop.","author":"Defourny Boris","year":"2008","unstructured":"Boris Defourny, Damien Ernst, and Louis Wehenkel. 2008. 
Risk-Aware Decision Making and Dynamic Programming. In NIPS workshop."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00780-005-0159-6"},{"volume-title":"NeurIPS Workshop on Robust AI in Financial Services.","author":"Ganesh S.","key":"e_1_3_2_1_10_1","unstructured":"S. Ganesh, N. Vadori, M. Xu, H. Zheng, P. Reddy, and M. Veloso. 2019. Reinforcement Learning for Market Making in a Multi-agent Dealer Market. In NeurIPS Workshop on Robust AI in Financial Services."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622519.1622522"},{"key":"e_1_3_2_1_12_1","unstructured":"Olivier Gu\u00e9ant and Iuliia Manziuk. 2019. Deep reinforcement learning for market making in corporate bonds: beating the curse of dimensionality. (2019). arXiv:1910.13205 [q-fin.TR]"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1214\/105051604000000116"},{"key":"e_1_3_2_1_14_1","unstructured":"Shie Mannor and John Tsitsiklis. 2011. Mean-Variance Optimization in Markov Decision Processes. In ICML."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1017940631555"},{"key":"e_1_3_2_1_16_1","unstructured":"Tetsuro Morimura Masashi Sugiyama and Hisashi Kashima. 2010. Parametric Return Density Estimation for Reinforcement Learning. 
In UAI."},{"key":"e_1_3_2_1_17_1","unstructured":"Tetsuro Morimura Masashi Sugiyama Hisashi Kashima Hirotaka Hachiya and Toshiyuki Tanaka. 2010. Nonparametric Return Distribution Approximation for Reinforcement Learning. In ICML."},{"key":"e_1_3_2_1_18_1","first-page":"252","article-title":"Actor-Critic Algorithms for Risk-Sensitive MDPs","volume":"26","author":"Prashanth L.a.","year":"2013","unstructured":"L.a. Prashanth and Mohammad Ghavamzadeh. 2013. Actor-Critic Algorithms for Risk-Sensitive MDPs. Advances in Neural Information Processing Systems 26 (2013), 252--260.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_19_1","volume-title":"Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint","author":"Prashanth L A","year":"2018","unstructured":"L A Prashanth and Michael Fu. 2018. Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint. (Oct. 2018). arXiv:1810.09126 [cs.LG]"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-016-5569-5"},{"key":"e_1_3_2_1_21_1","volume-title":"Risk-sensitive Reinforcement Learning","author":"Shen Yun","year":"2014","unstructured":"Yun Shen, Michael J Tobia, Tobias Sommer, and Klaus Obermayer. 2014. Risk-sensitive Reinforcement Learning. 
Neural Computation 26, 7 (2014)."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Sobel MJ. 1982. The Variance of discounted Markov Decision Processes. J. Appl. Probab. (1982).","DOI":"10.1017\/S0021900200023123"},{"key":"e_1_3_2_1_23_1","first-page":"1","article-title":"Learning the Variance of the Reward-To-Go","volume":"17","author":"Tamar Aviv","year":"2016","unstructured":"Aviv Tamar, Dotan Di Castro, and Shie Mannor. 2016. Learning the Variance of the Reward-To-Go. Journal of Machine Learning Research 17, 13 (2016), 1--36.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_24_1","unstructured":"Aviv Tamar Yinlam Chow Mohammad Ghavamzadeh and Shie Mannor. 2015. Policy Gradient for Coherent Risk Measures. In NIPS."},{"key":"e_1_3_2_1_25_1","volume-title":"Policy Gradients with Variance Related Risk Criteria","author":"Tamar Aviv","year":"2012","unstructured":"Aviv Tamar, Dotan Di Castro, and Shie Mannor. 2012. Policy Gradients with Variance Related Risk Criteria. In ICML."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Tamar A. Glassner Y. Mannor S. 2015. Optimizing the CVaR via Sampling. 
In AAAI .","DOI":"10.1609\/aaai.v29i1.9561"}],"event":{"name":"ICAIF '20: ACM International Conference on AI in Finance","sponsor":["ACM Association for Computing Machinery"],"location":"New York New York","acronym":"ICAIF '20"},"container-title":["Proceedings of the First ACM International Conference on AI in Finance"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3383455.3422519","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3383455.3422519","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:33:22Z","timestamp":1750199602000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3383455.3422519"}},"subtitle":["a martingale approach to reward uncertainty"],"short-title":[],"issued":{"date-parts":[[2020,10,15]]},"references-count":26,"alternative-id":["10.1145\/3383455.3422519","10.1145\/3383455"],"URL":"https:\/\/doi.org\/10.1145\/3383455.3422519","relation":{},"subject":[],"published":{"date-parts":[[2020,10,15]]},"assertion":[{"value":"2021-10-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}