{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:54:46Z","timestamp":1758272086673},"reference-count":37,"publisher":"MIT Press - Journals","issue":"12","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,11,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Ordinal regression is aimed at predicting an ordinal class label. In this letter, we consider its semisupervised formulation, in which we have unlabeled data along with ordinal-labeled data to train an ordinal regressor. There are several metrics to evaluate the performance of ordinal regression, such as the mean absolute error, mean zero-one error, and mean squared error. However, the existing studies do not take the evaluation metric into account, restrict model choice, and have no theoretical guarantee. To overcome these problems, we propose a novel generic framework for semisupervised ordinal regression based on the empirical risk minimization principle that is applicable to optimizing all of the metrics mentioned above. In addition, our framework has flexible choices of models, surrogate losses, and optimization algorithms without the common geometric assumption on unlabeled data such as the cluster assumption or manifold assumption. We provide an estimation error bound to show that our risk estimator is consistent. Finally, we conduct experiments to show the usefulness of our framework.<\/jats:p>","DOI":"10.1162\/neco_a_01445","type":"journal-article","created":{"date-parts":[[2021,10,28]],"date-time":"2021-10-28T22:30:09Z","timestamp":1635460209000},"page":"3361-3412","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":3,"title":["Semisupervised Ordinal Regression Based on Empirical Risk Minimization"],"prefix":"10.1162","volume":"33","author":[{"given":"Taira","family":"Tsuchiya","sequence":"first","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan, and RIKEN AIP: Chuo-ku, Tokyo 103-0027, Japan tsuchiya@sys.i.kyoto-u.ac.jp"}]},{"given":"Nontawat","family":"Charoenphakdee","sequence":"additional","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan, and RIKEN AIP: Chuo-ku, Tokyo 103-0027, Japan nontawat@ms.k.u-tokyo.ac.jp"}]},{"given":"Issei","family":"Sato","sequence":"additional","affiliation":[{"name":"University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan sato@k.u-tokyo.ac.jp"}]},{"given":"Masashi","family":"Sugiyama","sequence":"additional","affiliation":[{"name":"RIKEN AIP: Chuo-ku, Tokyo 103-0027, Japan, and University of Tokyo, Bunkyo-ku, Tokyo, 113-0333, Japan sugi@k.u-tokyo.ac.jp"}]}],"member":"281","published-online":{"date-parts":[[2021,11,12]]},"reference":[{"key":"2021112221525027900_B1","first-page":"452","article-title":"Classification from pairwise similarity and unlabeled data.","author":"Bao","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"2021112221525027900_B2","first-page":"463","article-title":"Rademacher and Gaussian complexities: Risk bounds and structural results","volume":"3","author":"Bartlett","year":"2002","journal-title":"Journal of Machine Learning Research"},{"key":"2021112221525027900_B3","first-page":"2399","article-title":"Manifold regularization: A geometric framework for learning from labeled and unlabeled examples","volume":"7","author":"Belkin","year":"2006","journal-title":"Journal of Machine Learning Research"},{"issue":"3","key":"2021112221525027900_B4","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1016\/S0022-0000(03)00038-2","article-title":"On the difficulty of approximately maximizing agreements","volume":"66","author":"Ben-David","year":"2003","journal-title":"Journal of Computer and System Sciences"},{"issue":"5","key":"2021112221525027900_B5","first-page":"546","article-title":"Ordinal logistic regression in medical research","volume":"31","author":"Bender","year":"1997","journal-title":"Journal of the Royal College of Physicians of London"},{"issue":"10","key":"2021112221525027900_B6","doi-asserted-by":"publisher","first-page":"809","DOI":"10.1016\/S0895-4356(98)00066-3","article-title":"Using binary logistic regression models for ordinal data with non-proportional odds.","volume":"51","author":"Bender","year":"1998","journal-title":"Journal of Clinical Epidemiology"},{"key":"2021112221525027900_B7","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511804441","article-title":"Convex optimization","author":"Boyd","year":"2004"},{"key":"2021112221525027900_B8","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/9780262033589.001.0001","author":"Chapelle","year":"2006","journal-title":"Semi-supervised learning"},{"key":"2021112221525027900_B9","first-page":"1019","article-title":"Gaussian processes for ordinal regression","volume":"6","author":"Chu","year":"2005","journal-title":"Journal of Machine Learning Research"},{"key":"2021112221525027900_B10","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1145\/1102351.1102370","article-title":"New approaches to support vector ordinal regression.","author":"Chu","year":"2005","journal-title":"Proceedings of the 22nd International Conference on Machine Learning"},{"issue":"6","key":"2021112221525027900_B11","article-title":"Support vector machines in ordinal classification: An application to corporate credit scoring","volume":"15","author":"Dikkers","year":"2005","journal-title":"Neural Network World"},{"key":"2021112221525027900_B12","first-page":"1386","article-title":"Convex formulation for learning from positive and unlabeled data.","author":"du Plessis","year":"2015","journal-title":"Proceedings of the 32nd International Conference On Machine Learning"},{"issue":"6","key":"2021112221525027900_B13","doi-asserted-by":"publisher","first-page":"1558","DOI":"10.1137\/120865094","article-title":"Agnostic learning of monomials by halfspaces is hard","volume":"41","author":"Feldman","year":"2012","journal-title":"SIAM Journal on Computing"},{"issue":"1","key":"2021112221525027900_B14","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1016\/j.ssresearch.2011.09.003","article-title":"The proportional odds with partial proportionality constraints model for ordinal response variables","volume":"41","author":"Fullerton","year":"2012","journal-title":"Social Science Research"},{"issue":"1","key":"2021112221525027900_B15","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1109\/TKDE.2015.2457911","article-title":"Ordinal regression methods: Survey and experimental study","volume":"28","author":"Gutierrez","year":"2016","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"2021112221525027900_B16","doi-asserted-by":"publisher","first-page":"1022","DOI":"10.3724\/SP.J.1087.2010.01022","article-title":"Towards semi-supervised ordinal regression with nearest neighbor","volume":"30","author":"Hua-fu","year":"2010","journal-title":"Journal of Computer Applications"},{"issue":"8","key":"2021112221525027900_B17","doi-asserted-by":"publisher","first-page":"1800","DOI":"10.1016\/j.cor.2011.06.023","article-title":"A corporate credit rating model using multi-class support vector machines with an ordinal pairwise partitioning approach","volume":"39","author":"Kim","year":"2012","journal-title":"Computers and Operations Research"},{"key":"2021112221525027900_B18","first-page":"1675","volume-title":"Advances in neural information processing systems","author":"Kiryo","year":"2017"},{"issue":"1","key":"2021112221525027900_B19","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1109\/TPAMI.2014.2299812","article-title":"Towards making unlabeled data never hurt","volume":"37","author":"Li","year":"2015","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2021112221525027900_B20","doi-asserted-by":"crossref","first-page":"1393","DOI":"10.1145\/2072298.2072023","article-title":"Semi-supervised manifold ordinal regression for image ranking.","author":"Liu","year":"2011","journal-title":"Proceedings of the 19th ACM International Conference on Multimedia"},{"key":"2021112221525027900_B21","first-page":"1115","article-title":"Mitigating overfitting in supervised classification from two unlabeled datasets: A consistent risk correction approach.","volume":"108","author":"Lu","year":"2020","journal-title":"Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics"},{"issue":"8","key":"2021112221525027900_B22","doi-asserted-by":"publisher","first-page":"3797","DOI":"10.1109\/TIT.2008.926323","article-title":"Lower bounds for the empirical minimization algorithm","volume":"54","author":"Mendelson","year":"2008","journal-title":"IEEE Transactions on Information Theory"},{"key":"2021112221525027900_B23","author":"Mohri","year":"2018","journal-title":"Foundations of machine learning"},{"key":"2021112221525027900_B24","first-page":"1199","volume-title":"Advances in neural information processing systems","author":"Niu","year":"2016"},{"key":"2021112221525027900_B25","first-page":"3235","volume-title":"Advances in neural information processing systems","author":"Oliver","year":"2018"},{"key":"2021112221525027900_B26","first-page":"708","article-title":"Loss factorization, weakly supervised learning and label noise robustness.","author":"Patrini","year":"2016","journal-title":"Proceedings of the 33rd International Conference on Machine Learning"},{"key":"2021112221525027900_B27","first-page":"1","article-title":"On the consistency of ordinal regression methods","volume":"18","author":"Pedregosa","year":"2017","journal-title":"Journal of Machine Learning Research"},{"key":"2021112221525027900_B28","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.neunet.2016.08.004","article-title":"Semi-supervised learning for ordinal kernel discriminant analysis","volume":"84","author":"P\u00e9rez-Ortiz","year":"2016","journal-title":"Neural Networks"},{"key":"2021112221525027900_B29","first-page":"180","article-title":"Loss functions for preference levels: Regression with discrete ordered labels.","author":"Rennie","year":"2005","journal-title":"Proceedings of the IJCAI Multidisciplinary Workshop on Advances in Preference Handling"},{"key":"2021112221525027900_B30","first-page":"2998","article-title":"Semi-supervised classification based on classification from positive and unlabeled data.","author":"Sakai","year":"2017","journal-title":"Proceedings of the 34th International Conference on Machine Learning"},{"issue":"4","key":"2021112221525027900_B31","doi-asserted-by":"publisher","first-page":"767","DOI":"10.1007\/s10994-017-5678-9","article-title":"Semi-supervised AUC optimization based on positive-unlabeled learning","volume":"107","author":"Sakai","year":"2018","journal-title":"Machine Learning"},{"issue":"7","key":"2021112221525027900_B32","doi-asserted-by":"publisher","first-page":"1074","DOI":"10.1109\/TNNLS.2012.2198240","article-title":"Transductive ordinal regression","volume":"23","author":"Seah","year":"2012","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"2021112221525027900_B33","article-title":"Learning with labeled and unlabeled data","author":"Seeger","year":"2000"},{"key":"2021112221525027900_B34","author":"Shalev-Shwartz","year":"2014","journal-title":"Understanding machine learning: From theory to algorithms"},{"key":"2021112221525027900_B35","first-page":"144","article-title":"Semi-supervised gaussian process ordinal regression.","author":"Srijith","year":"2013","journal-title":"Proceedings of the 2013 European Conference on Machine Learning and Knowledge Discovery in Databases"},{"key":"2021112221525027900_B36","doi-asserted-by":"crossref","first-page":"2002","DOI":"10.1145\/3292500.3330756","article-title":"Chainer: A deep learning framework for accelerating the research cycle.","author":"Tokui","year":"2019","journal-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining"},{"issue":"suppl. 3","key":"2021112221525027900_B37","doi-asserted-by":"publisher","first-page":"S16","DOI":"10.2337\/diabetes.53.suppl_3.S16","article-title":"Five stages of evolving beta-cell dysfunction during progression to diabetes","volume":"53","author":"Weir","year":"2004","journal-title":"Diabetes"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/33\/12\/3361\/1972015\/neco_a_01445.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/33\/12\/3361\/1972015\/neco_a_01445.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,11,22]],"date-time":"2021-11-22T21:54:19Z","timestamp":1637618059000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/33\/12\/3361\/107676\/Semisupervised-Ordinal-Regression-Based-on"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,12]]},"references-count":37,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,11,12]]},"published-print":{"date-parts":[[2021,11,12]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01445","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,12]]},"published":{"date-parts":[[2021,11,12]]}}}