{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,2,4]],"date-time":"2023-02-04T20:10:41Z","timestamp":1675541441394},"reference-count":32,"publisher":"MIT Press","issue":"6","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,5,19]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>van Rooyen, Menon, and Williamson (2015) introduced a notion of convex loss functions being robust to random classification noise and established that the \u201cunhinged\u201d loss function is robust in this sense. In this letter, we study the accuracy of binary classifiers obtained by minimizing the unhinged loss and observe that even for simple linearly separable data distributions, minimizing the unhinged loss may only yield a binary classifier with accuracy no better than random guessing.<\/jats:p>","DOI":"10.1162\/neco_a_01502","type":"journal-article","created":{"date-parts":[[2022,5,9]],"date-time":"2022-05-09T23:29:36Z","timestamp":1652138976000},"page":"1488-1499","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":0,"title":["The Perils of Being Unhinged: On the Accuracy of Classifiers Minimizing a Noise-Robust Convex Loss"],"prefix":"10.1162","volume":"34","author":[{"given":"Philip M.","family":"Long","sequence":"first","affiliation":[{"name":"Google, Mountain View, CA 94043, U.S.A. plong@google.com"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rocco A.","family":"Servedio","sequence":"additional","affiliation":[{"name":"Columbia University, New York, NY 10027, U.S.A. rocco@cs.columbia.edu"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2022,5,19]]},"reference":[{"issue":"473","key":"2022051920100271400_B1","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1198\/016214505000000907","article-title":"Convexity, classification, and risk bounds","volume":"101","author":"Bartlett","year":"2006","journal-title":"Journal of the American Statistical Association"},{"key":"2022051920100271400_B2","first-page":"961","article-title":"On symmetric losses for learning from corrupted labels","volume-title":"Proceedings of the 36th International Conference on Machine Learning Research, 97","author":"Charoenphakdee","year":"2019"},{"key":"2022051920100271400_B3","first-page":"961","article-title":"On symmetric losses for learning from corrupted labels","volume-title":"Proceedings of the International Conference on Machine Learning Research","author":"Charoenphakdee","year":"2019"},{"issue":"2","key":"2022051920100271400_B4","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1023\/A:1007607513941","article-title":"An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization","volume":"40","author":"Dietterich","year":"2000","journal-title":"Machine Learning"},{"key":"2022051920100271400_B5","first-page":"180","article-title":"MadaBoost: A modified version of AdaBoost","volume-title":"Proceedings of the Thirteenth Annual Conference on Computational Learning Theory","author":"Domingo","year":"2000"},{"issue":"2","key":"2022051920100271400_B6","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1023\/A:1007413511361","article-title":"On the optimality of the simple Bayesian classifier under zero-one loss","volume":"29","author":"Domingos","year":"1997","journal-title":"Machine Learning"},{"issue":"594\u2013604","key":"2022051920100271400_B7","first-page":"309","article-title":"On the mathematical foundations of theoretical statistics","volume":"222","author":"Fisher","year":"1922","journal-title":"Philosophical Transactions of the Royal Society of London"},{"key":"2022051920100271400_B8","first-page":"148","article-title":"Experiments with a new boosting algorithm","volume-title":"Proceedings of the Thirteenth International Conference on Machine Learning","author":"Freund","year":"1996"},{"issue":"1","key":"2022051920100271400_B9","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1006\/jcss.1997.1504","article-title":"A decision-theoretic generalization of on-line learning and an application to boosting","volume":"55","author":"Freund","year":"1997","journal-title":"Journal of Computer and System Sciences"},{"issue":"2","key":"2022051920100271400_B10","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1214\/aos\/1016218223","article-title":"Additive logistic regression: A statistical view of boosting","volume":"28","author":"Friedman","year":"1998","journal-title":"Annals of Statistics"},{"key":"2022051920100271400_B11","first-page":"225","volume-title":"Advances in neural information processing systems, 11","author":"Gentile","year":"1998"},{"key":"2022051920100271400_B12","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v31i1.10894","article-title":"Robust loss functions under label noise for deep neural networks","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ghosh","year":"2017"},{"key":"2022051920100271400_B13","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1016\/j.neucom.2014.09.081","article-title":"Making risk minimization tolerant to label noise","volume":"160","author":"Ghosh","year":"2015","journal-title":"Neurocomputing"},{"key":"2022051920100271400_B14","first-page":"1772","article-title":"The implicit bias of gradient descent on nonseparable data","volume-title":"Proceedings of the Conference on Learning Theory","author":"Ji","year":"2019"},{"issue":"3","key":"2022051920100271400_B15","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1007\/s10994-009-5165-z","article-title":"Random classification noise defeats all convex potential boosters","volume":"78","author":"Long","year":"2010","journal-title":"Machine Learning"},{"key":"2022051920100271400_B16","first-page":"801","article-title":"Consistency versus realizable H-consistency for multiclass classification","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Long","year":"2013"},{"key":"2022051920100271400_B17","first-page":"546","article-title":"An empirical evaluation of bagging and boosting","volume-title":"Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference","author":"Maclin","year":"1997"},{"key":"2022051920100271400_B18","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511809071","volume-title":"Introduction to information retrieval","author":"Manning","year":"2008"},{"issue":"3","key":"2022051920100271400_B19","doi-asserted-by":"publisher","first-page":"1146","DOI":"10.1109\/TSMCB.2012.2223460","article-title":"Noise tolerance under risk minimization","volume":"43","author":"Manwani","year":"2013","journal-title":"IEEE Trans. Cybern."},{"key":"2022051920100271400_B20","first-page":"512","volume-title":"Advances in neural information processing systems, 7","author":"Mason","year":"1999"},{"key":"2022051920100271400_B21","first-page":"1196","article-title":"Learning with noisy labels","volume-title":"Advances in neural information processing systems, 26","author":"Natarajan","year":"2013"},{"key":"2022051920100271400_B22","volume-title":"Advances in neural information processing systems","author":"Ng","year":"2001"},{"key":"2022051920100271400_B23","first-page":"1944","article-title":"Making deep neural networks robust to label noise: A loss correction approach","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Patrini","year":"2017"},{"key":"2022051920100271400_B24","volume-title":"Convex analysis","author":"Rockafellar","year":"2015"},{"issue":"5","key":"2022051920100271400_B25","doi-asserted-by":"publisher","first-page":"1358","DOI":"10.1137\/S0097539798340928","article-title":"Perceptron, Winnow, and PAC learning","volume":"31","author":"Servedio","year":"2002","journal-title":"SIAM J. Comput."},{"key":"2022051920100271400_B26","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511809682","volume-title":"Kernel methods for pattern analysis","author":"Shawe-Taylor","year":"2004"},{"issue":"1","key":"2022051920100271400_B27","first-page":"2822","article-title":"The implicit bias of gradient descent on separable data","volume":"19","author":"Soudry","year":"2018","journal-title":"Journal of Machine Learning Research"},{"key":"2022051920100271400_B28","first-page":"307","article-title":"Margins, shrinkage, and boosting","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Telgarsky","year":"2013"},{"issue":"10","key":"2022051920100271400_B29","doi-asserted-by":"publisher","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"2022051920100271400_B30","article-title":"Learning with symmetric label noise: The importance of being unhinged","volume-title":"Advances in neural information processing systems, 28","author":"Rooyen","year":"2015"},{"issue":"1","key":"2022051920100271400_B31","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1214\/aos\/1079120130","article-title":"Statistical behavior and consistency of classification methods based on convex risk minimization","volume":"32","author":"Zhang","year":"2004","journal-title":"Annals of Statistics"},{"key":"2022051920100271400_B32","volume-title":"Advances in neural information processing systems","author":"Zhang","year":"2018"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/6\/1488\/2023463\/neco_a_01502.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/6\/1488\/2023463\/neco_a_01502.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,4]],"date-time":"2023-02-04T19:33:24Z","timestamp":1675539204000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/34\/6\/1488\/110643\/The-Perils-of-Being-Unhinged-On-the-Accuracy-of"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,19]]},"references-count":32,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2022,5,19]]},"published-print":{"date-parts":[[2022,5,19]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01502","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,6]]},"published":{"date-parts":[[2022,5,19]]}}}