{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:31:34Z","timestamp":1740123094598,"version":"3.37.3"},"reference-count":54,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T00:00:00Z","timestamp":1661385600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T00:00:00Z","timestamp":1661385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002241","name":"Japan Science and Technology Agency","doi-asserted-by":"publisher","award":["JPMJAX200O"],"award-info":[{"award-number":["JPMJAX200O"]}],"id":[{"id":"10.13039\/501100002241","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this work, we study a new class of risks defined in terms of the location and deviation of the loss distribution, generalizing far beyond classical mean-variance risk functions. 
The class is easily implemented as a wrapper around any smooth loss, it admits finite-sample stationarity guarantees for stochastic gradient methods, it is straightforward to interpret and adjust, with close links to M-estimators of the loss location, and has a salient effect on the test loss distribution, giving us control over symmetry and deviations that are not possible under naive ERM.<\/jats:p>","DOI":"10.1007\/s10994-022-06217-5","type":"journal-article","created":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T16:03:48Z","timestamp":1661443428000},"page":"4679-4718","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning with risks based on M-location"],"prefix":"10.1007","volume":"111","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6704-1769","authenticated-orcid":false,"given":"Matthew J.","family":"Holland","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,8,25]]},"reference":[{"issue":"3","key":"6217_CR1","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1111\/1467-9965.00068","volume":"9","author":"P Artzner","year":"1999","unstructured":"Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203\u2013228.","journal-title":"Mathematical Finance"},{"key":"6217_CR2","volume-title":"Probability and measure theory","author":"RB Ash","year":"2000","unstructured":"Ash, R. B., & Dol\u00e9ans-Dade, C. A. (2000). Probability and measure theory (2nd ed.). NP: Academic Press.","edition":"2"},{"key":"6217_CR3","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-2247-7","volume-title":"Convexity and optimization in banach spaces","author":"V Barbu","year":"2012","unstructured":"Barbu, V., & Precupanu, T. (2012). Convexity and optimization in banach spaces (4th ed.). 
Berlin: Springer Science & Business Media.","edition":"4"},{"issue":"2","key":"6217_CR4","doi-asserted-by":"publisher","first-page":"596","DOI":"10.1137\/S0363012902407120","volume":"42","author":"HH Bauschke","year":"2003","unstructured":"Bauschke, H. H., Borwein, J. M., & Combettes, P. L. (2003). Bregman monotone optimization algorithms. SIAM Journal on Control and Optimization, 42(2), 596\u2013636.","journal-title":"SIAM Journal on Control and Optimization"},{"key":"6217_CR5","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48311-5","volume-title":"Convex analysis and monotone operator theory in Hilbert spaces","author":"HH Bauschke","year":"2017","unstructured":"Bauschke, H. H., & Combettes, P. L. (2017). Convex analysis and monotone operator theory in Hilbert spaces (2nd ed.). Berlin: Springer.","edition":"2"},{"key":"6217_CR6","volume-title":"Convex optimization algorithms","author":"DP Bertsekas","year":"2015","unstructured":"Bertsekas, D. P. (2015). Convex optimization algorithms. Nashua: Athena Scientific."},{"key":"6217_CR7","unstructured":"Bhat, S.P., & Prashanth, L.A. (2020). Concentration of risk measures: A Wasserstein distance approach. Advances in Neural Information Processing Systems 32 (NeurIPS 2019)."},{"key":"6217_CR8","unstructured":"Bottou, L., Curtis, F.E., Nocedal, J. (2016). Optimization methods for large-scale machine learning. arXiv preprint arXiv:1606.04838."},{"issue":"7","key":"6217_CR9","doi-asserted-by":"publisher","first-page":"1493","DOI":"10.1162\/089976699300016106","volume":"11","author":"L Breiman","year":"1999","unstructured":"Breiman, L. (1999). Prediction games and arcing algorithms. Neural Computation, 11(7), 1493\u20131517.","journal-title":"Neural Computation"},{"issue":"6","key":"6217_CR10","doi-asserted-by":"publisher","first-page":"2507","DOI":"10.1214\/15-AOS1350","volume":"43","author":"C Brownlees","year":"2015","unstructured":"Brownlees, C., Joly, E., & Lugosi, G. (2015). 
Empirical risk minimization for heavy-tailed losses. Annals of Statistics, 43(6), 2507\u20132536.","journal-title":"Annals of Statistics"},{"issue":"2","key":"6217_CR11","first-page":"315","volume":"12","author":"A Daniilidis","year":"2005","unstructured":"Daniilidis, A., & Malick, J. (2005). Filling the gap between lower-$$C^{1}$$ and lower-$$C^{2}$$ functions. Journal of Convex Analysis, 12(2), 315\u2013329.","journal-title":"Journal of Convex Analysis"},{"issue":"1","key":"6217_CR12","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1137\/18M1178244","volume":"29","author":"D Davis","year":"2019","unstructured":"Davis, D., & Drusvyatskiy, D. (2019). Stochastic model-based minimization of weakly convex functions. SIAM Journal on Optimization, 29(1), 207\u2013239.","journal-title":"SIAM Journal on Optimization"},{"key":"6217_CR13","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1007\/s10107-018-1311-3","volume":"178","author":"D Drusvyatskiy","year":"2019","unstructured":"Drusvyatskiy, D., & Paquette, C. (2019). Efficiency of minimizing compositions of convex functions and smooth maps. Mathematical Programming, 178, 503\u2013558.","journal-title":"Mathematical Programming"},{"issue":"1","key":"6217_CR14","first-page":"2450","volume":"20","author":"J Duchi","year":"2019","unstructured":"Duchi, J., & Namkoong, H. (2019). Variance-based regularization with convex objectives. Journal of Machine Learning Research, 20(1), 2450\u20132504.","journal-title":"Journal of Machine Learning Research"},{"issue":"4","key":"6217_CR15","doi-asserted-by":"publisher","first-page":"3229","DOI":"10.1137\/17M1135086","volume":"28","author":"JC Duchi","year":"2018","unstructured":"Duchi, J. C., & Ruan, F. (2018). Stochastic methods for composite and weakly convex optimization problems. 
SIAM Journal on Optimization, 28(4), 3229\u20133259.","journal-title":"SIAM Journal on Optimization"},{"issue":"4","key":"6217_CR16","doi-asserted-by":"publisher","first-page":"2341","DOI":"10.1137\/120880811","volume":"23","author":"S Ghadimi","year":"2013","unstructured":"Ghadimi, S., & Lan, G. (2013). Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4), 2341\u20132368.","journal-title":"SIAM Journal on Optimization"},{"key":"6217_CR17","unstructured":"Hashimoto, T.B., Srivastava, M., Namkoong, H., Liang, P. (2018). Fairness without demographics in repeated loss minimization. In Proceedings of the 35th International Conference on Machine Learning (ICML) Vol.\u00a080, pp. 1929\u20131938."},{"issue":"1","key":"6217_CR18","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1016\/0890-5401(92)90010-D","volume":"100","author":"D Haussler","year":"1992","unstructured":"Haussler, D. (1992). Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1), 78\u2013150.","journal-title":"Information and Computation"},{"key":"6217_CR19","unstructured":"Holland, M.J., & Haress, E.M. (2021a). Learning with risk-averse feedback under potentially heavy tails. In 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021) Vol.\u00a0130."},{"key":"6217_CR20","unstructured":"Holland, M.J., & Haress, E.M. (2021b). Spectral risk-based learning using unbounded losses. arXiv preprint arXiv:2105.04816."},{"issue":"1","key":"6217_CR21","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1214\/aoms\/1177703732","volume":"35","author":"PJ Huber","year":"1964","unstructured":"Huber, P. J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics, 35(1), 73\u2013101.","journal-title":"Annals of Mathematical Statistics"},{"key":"6217_CR22","unstructured":"Johnson, R., & Zhang, T. (2014). 
Accelerating stochastic gradient descent using predictive variance reduction. Advances in Neural Information Processing Systems 26 (NIPS 2013) pp. 315\u2013323."},{"issue":"3","key":"6217_CR23","doi-asserted-by":"publisher","first-page":"1185","DOI":"10.1016\/j.jfa.2013.11.008","volume":"266","author":"A Jourani","year":"2014","unstructured":"Jourani, A., Thibault, L., & Zagrodny, D. (2014). Differential properties of the Moreau envelope. Journal of Functional Analysis, 266(3), 1185\u20131237.","journal-title":"Journal of Functional Analysis"},{"key":"6217_CR24","unstructured":"Kall, P., & Mayer, J. (2005). Stochastic linear programming. International Series in Operations Research and Management Science."},{"key":"6217_CR25","unstructured":"Khim, J., Leqi, L., Prasad, A., Ravikumar, P. (2020). Uniform convergence of rank-weighted learning. In 37th International Conference on Machine Learning (ICML) Vol.\u00a0119, pp. 5254\u20135263."},{"key":"6217_CR26","unstructured":"Lee, J., Park, S., Shin, J. (2020). Learning bounds for risk-sensitive learning. Advances in Neural Information Processing Systems 33 (NeurIPS 2020) pp. 13867\u201313879."},{"key":"6217_CR27","unstructured":"Leqi, L., Prasad, A., Ravikumar, P.K. (2019). On human-aligned risk minimization. Advances in Neural Information Processing Systems 32 (NeurIPS 2019)."},{"key":"6217_CR28","unstructured":"Le\u00a0Roux, N., Schmidt, M., Bach, F.R. (2013). A stochastic gradient method with an exponential convergence rate for finite training sets. Advances in Neural Information Processing Systems 25 (NIPS 2012) pp. 2663\u20132671."},{"key":"6217_CR29","volume-title":"Optimization by vector space methods","author":"DG Luenberger","year":"1969","unstructured":"Luenberger, D. G. (1969). Optimization by vector space methods. New Jersey: Wiley."},{"issue":"1","key":"6217_CR30","first-page":"77","volume":"7","author":"H Markowitz","year":"1952","unstructured":"Markowitz, H. (1952). Portfolio selection. 
Journal of Finance, 7(1), 77\u201391.","journal-title":"Journal of Finance"},{"key":"6217_CR31","unstructured":"Maurer, A., & Pontil, M. (2009). Empirical bernstein bounds and sample variance penalization. In Proceedings of the 22nd Conference on Learning Theory (COLT)."},{"key":"6217_CR32","unstructured":"Namkoong, H., & Duchi, J.C. (2016). Stochastic gradient methods for distributionally robust optimization with $$f$$-divergences. Advances in Neural Information Processing Systems 29 (NIPS 2016) Vol.\u00a029, pp. 2208\u20132216."},{"key":"6217_CR33","volume-title":"Problem complexity and method efficiency in optimization","author":"AS Nemirovsky","year":"1983","unstructured":"Nemirovsky, A. S., & Yudin, D. B. (1983). Problem complexity and method efficiency in optimization. New Jersey: Wiley."},{"key":"6217_CR34","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-8853-9","volume-title":"Introductory lectures on convex optimization: a basic course","author":"Y Nesterov","year":"2004","unstructured":"Nesterov, Y. (2004). Introductory lectures on convex optimization: a basic course. Berlin: Springer."},{"key":"6217_CR35","volume-title":"Calculus without derivatives","author":"JP Penot","year":"2012","unstructured":"Penot, J. P. (2012). Calculus without derivatives. Berlin: Springer."},{"issue":"5","key":"6217_CR36","doi-asserted-by":"publisher","first-page":"1805","DOI":"10.1090\/S0002-9947-96-01544-9","volume":"348","author":"RA Poliquin","year":"1996","unstructured":"Poliquin, R. A., & Rockafellar, R. T. (1996). Prox-regular functions in variational analysis. Transactions of the American Mathematical Society, 348(5), 1805\u20131838.","journal-title":"Transactions of the American Mathematical Society"},{"key":"6217_CR37","unstructured":"Prashanth, L.A., Jagannathan, K., Kolla, R.K. (2020). Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions. 
In 37th International Conference on Machine Learning (ICML) Vol.\u00a0119, pp. 5577\u20135586."},{"key":"6217_CR38","doi-asserted-by":"crossref","unstructured":"Reyzin, L., & Schapire, R.E. (2006). How boosting the margin can also boost classifier complexity. In Proceedings of the 23rd International Conference on Machine Learning (ICML 2006) (pp. 753\u2013760).","DOI":"10.1145\/1143844.1143939"},{"issue":"3","key":"6217_CR39","doi-asserted-by":"publisher","first-page":"525","DOI":"10.2140\/pjm.1968.24.525","volume":"24","author":"RT Rockafellar","year":"1968","unstructured":"Rockafellar, R. T. (1968). Integrals which are convex functionals. Pacific Journal of Mathematics, 24(3), 525\u2013539.","journal-title":"Pacific Journal of Mathematics"},{"issue":"1","key":"6217_CR40","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1007\/s00780-005-0165-8","volume":"10","author":"RT Rockafellar","year":"2006","unstructured":"Rockafellar, R. T., Uryasev, S., & Zabarankin, M. (2006). Generalized deviations in risk analysis. Finance and Stochastics, 10(1), 51\u201374.","journal-title":"Finance and Stochastics"},{"issue":"3","key":"6217_CR41","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1080\/17442508208833217","volume":"7","author":"RT Rockafellar","year":"1982","unstructured":"Rockafellar, R. T., & Wets, R. J. B. (1982). On the interchange of subdifferentiation and conditional expectation for convex functionals. Stochastics: An International Journal of Probability and Stochastic Processes, 7(3), 173\u2013182.","journal-title":"Stochastics: An International Journal of Probability and Stochastic Processes"},{"key":"6217_CR42","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-02431-3","volume-title":"Variational Analysis","author":"RT Rockafellar","year":"1998","unstructured":"Rockafellar, R. T., & Wets, R. J. B. (1998). Variational Analysis. 
Berlin: Springer."},{"key":"6217_CR43","first-page":"1","volume":"10","author":"A Ruszczy\u0144ski","year":"2003","unstructured":"Ruszczy\u0144ski, A., & Shapiro, A. (2003). Stochastic programming models. Handbooks in Operations Research and Management Science, 10, 1\u201364.","journal-title":"Handbooks in Operations Research and Management Science"},{"issue":"3","key":"6217_CR44","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1287\/moor.1050.0186","volume":"31","author":"A Ruszczy\u0144ski","year":"2006","unstructured":"Ruszczy\u0144ski, A., & Shapiro, A. (2006). Optimization of convex risk functions. Mathematics of Operations Research, 31(3), 433\u2013452.","journal-title":"Mathematics of Operations Research"},{"key":"6217_CR45","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1007\/1-84628-095-8_4","volume-title":"Probabilistic and randomized methods for design under uncertainty","author":"A Ruszczy\u0144ski","year":"2006","unstructured":"Ruszczy\u0144ski, A., & Shapiro, A. (2006). Optimization of risk measures. Probabilistic and randomized methods for design under uncertainty (pp. 119\u2013157). Berlin: Springer."},{"key":"6217_CR46","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781107298019","volume-title":"Understanding machine learning: From theory to algorithms","author":"S Shalev-Shwartz","year":"2014","unstructured":"Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge: Cambridge University Press."},{"key":"6217_CR47","first-page":"567","volume":"14","author":"S Shalev-Shwartz","year":"2013","unstructured":"Shalev-Shwartz, S., & Zhang, T. (2013). Stochastic dual coordinate ascent methods for regularized loss minimization. Journal of Machine Learning Research, 14, 567\u2013599.","journal-title":"Journal of Machine Learning Research"},{"key":"6217_CR48","unstructured":"Shamir, O., & Zhang, T. (2013). Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes. 
In Proceedings of the 30th International Conference on Machine Learning pp. 71\u201379."},{"issue":"1\u20133","key":"6217_CR49","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1007\/BF01582215","volume":"67","author":"A Shapiro","year":"1994","unstructured":"Shapiro, A. (1994). Quantitative stability in stochastic programming. Mathematical Programming, 67(1\u20133), 99\u2013108.","journal-title":"Mathematical Programming"},{"key":"6217_CR50","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1007\/s10957-020-01655-4","volume":"185","author":"M Soueycatt","year":"2020","unstructured":"Soueycatt, M., Mohammad, Y., & Hamwi, Y. (2020). Regularization in banach spaces with respect to the bregman distance. Journal of Optimization Theory and Applications, 185, 327\u2013342.","journal-title":"Journal of Optimization Theory and Applications"},{"issue":"2","key":"6217_CR51","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1214\/aoms\/1177700153","volume":"36","author":"V Strassen","year":"1965","unstructured":"Strassen, V. (1965). The existence of probability measures with given marginals. Annals of Mathematical Statistics, 36(2), 423\u2013439.","journal-title":"Annals of Mathematical Statistics"},{"key":"6217_CR52","volume-title":"The nature of statistical learning theory","author":"VN Vapnik","year":"1999","unstructured":"Vapnik, V. N. (1999). The nature of statistical learning theory (2nd ed.). Berlin: Springer.","edition":"2"},{"key":"6217_CR53","unstructured":"Williamson, R.C., & Menon, A.K. (2019). Fairness risk measures. In Proceedings of the 36th International Conference on Machine Learning (ICML) pp. 6786\u20136797."},{"key":"6217_CR54","unstructured":"Zhai, R., Dan, C., Kolter, J.Z., Ravikumar, P. (2021). DORO: Distributional and outlier robust optimization. In 38th International Conference on Machine Learning (ICML) Vol.\u00a0139, pp. 
12345\u201312355."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06217-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-022-06217-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06217-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,11,29]],"date-time":"2022-11-29T22:25:26Z","timestamp":1669760726000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-022-06217-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,25]]},"references-count":54,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["6217"],"URL":"https:\/\/doi.org\/10.1007\/s10994-022-06217-5","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"type":"print","value":"0885-6125"},{"type":"electronic","value":"1573-0565"}],"subject":[],"published":{"date-parts":[[2022,8,25]]},"assertion":[{"value":"6 December 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 April 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 July 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 August 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The author declares no competing 
interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not applicable.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}