{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,22]],"date-time":"2025-12-22T21:40:06Z","timestamp":1766439606519,"version":"3.48.0"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2025,12,1]],"date-time":"2025-12-01T00:00:00Z","timestamp":1764547200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T00:00:00Z","timestamp":1764633600000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004462","name":"Consiglio Nazionale Delle Ricerche","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004462","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The standard technique for predicting the accuracy that a classifier will have on unseen data (\n                    <jats:italic>classifier accuracy prediction<\/jats:italic>\n                    \u2014CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is often violated in many real-world scenarios. When such violations occur (i.e., in the presence of\n                    <jats:italic>dataset shift<\/jats:italic>\n                    ), the estimates returned by CV are unreliable. The contribution of this paper is three-fold. 
First, we propose a CAP method specifically designed to work under\n                    <jats:italic>prior probability shift<\/jats:italic>\n                    (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. This method estimates the\n                    <jats:inline-formula>\n                      <jats:tex-math>$$n^2$$<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    entries of the contingency table of the test data (thus allowing one to estimate the value of any specific evaluation measure) by solving a system of\n                    <jats:inline-formula>\n                      <jats:tex-math>$$n^2$$<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    independent linear equations, with\n                    <jats:italic>n<\/jats:italic>\n                    the number of classes. Second, we show that the equations that the cells of the contingency table must satisfy are actually more than\n                    <jats:inline-formula>\n                      <jats:tex-math>$$n^2$$<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    , which gives rise to an overconstrained problem, and present a family of methods each based on a different selection of\n                    <jats:inline-formula>\n                      <jats:tex-math>$$n^2$$<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    such equations. Third, we observe that, since a key step of the above methods involves predicting the class priors of the test data, one can exploit intuitions from the field of class prior estimation (a.k.a. \u201cquantification\u201d). 
Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our methods tend to outperform existing CAP methods.\n                  <\/jats:p>","DOI":"10.1007\/s10994-025-06878-y","type":"journal-article","created":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T14:26:58Z","timestamp":1764685618000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["LEAP: Linear equations for classifier accuracy prediction under prior probability shift"],"prefix":"10.1007","volume":"114","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-0851-8041","authenticated-orcid":false,"given":"Lorenzo","family":"Volpi","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0377-1025","authenticated-orcid":false,"given":"Alejandro","family":"Moreo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4221-6427","authenticated-orcid":false,"given":"Fabrizio","family":"Sebastiani","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,12,2]]},"reference":[{"key":"6878_CR1","doi-asserted-by":"crossref","unstructured":"Storkey, A. (2009). When training and test sets are different: Characterizing learning transfer. In Qui\u00f1onero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (Eds.) Dataset shift in machine learning (pp. 3\u201328). The MIT Press, Cambridge.","DOI":"10.7551\/mitpress\/7921.003.0004"},{"issue":"1","key":"6878_CR2","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1016\/j.patcog.2011.06.019","volume":"45","author":"JG Moreno-Torres","year":"2012","unstructured":"Moreno-Torres, J. G., Raeder, T., Ala\u00edz-Rodr\u00edguez, R., Chawla, N. V., & Herrera, F. (2012). A unifying view on dataset shift in classification. Pattern Recognition, 45(1), 521\u2013530. 
https:\/\/doi.org\/10.1016\/j.patcog.2011.06.019","journal-title":"Pattern Recognition"},{"key":"6878_CR3","unstructured":"Sch\u00f6lkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., & Mooij, J.M. (2012). On causal and anticausal learning. In Proceedings of the 29th international conference on machine learning (ICML 2012), Edinburgh, UK."},{"key":"6878_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.spl.2023.109946","volume":"205","author":"PN Patrone","year":"2024","unstructured":"Patrone, P. N., & Kearsley, A. J. (2024). Minimizing uncertainty in prevalence estimates. Statistics & Probability Letters, 205, Article 109946. https:\/\/doi.org\/10.1016\/j.spl.2023.109946","journal-title":"Statistics & Probability Letters"},{"key":"6878_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/J.MEDIA.2025.103504","volume":"102","author":"P Godau","year":"2025","unstructured":"Godau, P., Kalinowski, P., Christodoulou, E., Reinke, A., Tizabi, M., Ferrer, L., J\u00e4ger, P., & Maier-Hein, L. (2025). Navigating prevalence shifts in image analysis algorithm deployment. Medical Image Analysis, 102, Article 103504. https:\/\/doi.org\/10.1016\/J.MEDIA.2025.103504","journal-title":"Medical Image Analysis"},{"issue":"1","key":"6878_CR6","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1109\/4235.585893","volume":"1","author":"DH Wolpert","year":"1997","unstructured":"Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67\u201382.","journal-title":"IEEE Transactions on Evolutionary Computation"},{"key":"6878_CR7","unstructured":"Garg, S., Balakrishnan, S., Lipton, Z.C., Neyshabur, B., & Sedghi, H. (2022). Leveraging unlabeled data to predict out-of-distribution performance. In Proceedings of the 10th international conference on learning representations (ICLR 2022), Virtual event."},{"key":"6878_CR8","doi-asserted-by":"publisher","unstructured":"Forman, G. (2005). 
Counting positives accurately despite inaccurate classification. In Proceedings of the 16th European conference on machine learning (ECML 2005) (pp. 564\u2013575). Porto, PT. https:\/\/doi.org\/10.1007\/11564096_55","DOI":"10.1007\/11564096_55"},{"issue":"5","key":"6878_CR9","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1145\/3117807","volume":"50","author":"P Gonz\u00e1lez","year":"2017","unstructured":"Gonz\u00e1lez, P., Casta\u00f1o, A., Chawla, N. V., & Coz, J. J. (2017). A review on quantification learning. ACM Computing Surveys, 50(5), 74\u201317440. https:\/\/doi.org\/10.1145\/3117807","journal-title":"ACM Computing Surveys"},{"key":"6878_CR10","doi-asserted-by":"publisher","unstructured":"Esuli, A., Fabris, A., Moreo, A., & Sebastiani, F. (2023). Learning to quantify. Springer, Cham. https:\/\/doi.org\/10.1007\/978-3-031-20467-8","DOI":"10.1007\/978-3-031-20467-8"},{"key":"6878_CR11","unstructured":"Garg, S., Balakrishnan, S., Lipton, Z.C., Neyshabur, B., & Sedghi, H. (2022). Leveraging unlabeled data to predict out-of-distribution performance. arXiv:2201.04234 [cs.LG]."},{"key":"6878_CR12","doi-asserted-by":"crossref","unstructured":"Guillory, D., Shankar, V., Ebrahimi, S., Darrell, T., & Schmidt, L. (2021). Predicting with confidence on unseen distributions. In Proceedings of the IEEE\/CVF international conference on computer vision (ICCV 2021), Montreal, CA (pp. 1134\u20131144).","DOI":"10.1109\/ICCV48922.2021.00117"},{"key":"6878_CR13","doi-asserted-by":"publisher","unstructured":"Volpi, L., Moreo, A., & Sebastiani, F. (2024). A simple method for classifier accuracy prediction under prior probability shift. In Proceedings of the 27th international conference on discovery science (DS 2024), Pisa, IT (pp. 267\u2013283). https:\/\/doi.org\/10.1007\/978-3-031-78980-9_17","DOI":"10.1007\/978-3-031-78980-9_17"},{"key":"6878_CR14","unstructured":"Xie, R., Wei, H., Feng, L., Cao, Y., & An, B. (2023). 
On the importance of feature separability in predicting out-of-distribution error. In Proceedings of the 37th annual conference on neural information processing systems (NeurIPS 2023), New Orleans, US (pp. 27783\u201327800)."},{"key":"6878_CR15","unstructured":"Deng, W., Suh, Y., Gould, S., & Zheng, L. (2023). Confidence and dispersity speak: Characterizing prediction matrix for unsupervised accuracy estimation. In Proceedings of the 40th international conference on machine learning (ICML 2023), Honolulu, US (pp. 7658\u20137674)."},{"key":"6878_CR16","unstructured":"Lu, Y., Qin, Y., Zhai, R., Shen, A., Chen, K., Wang, Z., Kolouri, S., Stepputtis, S., Campbell, J., & Sycara, K. P. (2023). Characterizing out-of-distribution error via optimal transport. In Proceedings of the 37th annual conference on neural information processing systems (NeurIPS 2023), New Orleans, US."},{"key":"6878_CR17","unstructured":"Kivim\u00e4ki, J., Bialek, J., Kuberski, W., & Nurminen, J. K. (2025). Performance estimation in binary classification using calibrated confidence. arXiv:2505.05295 [cs.LG]"},{"key":"6878_CR18","doi-asserted-by":"publisher","unstructured":"Elsahar, H., & Gall\u00e9, M. (2019). To annotate or not? Predicting performance drop under domain shift. In Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP 2019), Hong Kong, CN (pp. 2163\u20132173). https:\/\/doi.org\/10.18653\/v1\/D19-1222","DOI":"10.18653\/v1\/D19-1222"},{"key":"6878_CR19","unstructured":"Chen, M.F., Goel, K., Sohoni, N.S., Poms, F., Fatahalian, K., & R\u00e9, C. (2021). Mandoline: Model evaluation under distribution shift. In Proceedings of the 38th international conference on machine learning (ICML 2021), virtual event (pp. 1617\u20131629)."},{"key":"6878_CR20","unstructured":"Chen, J., Liu, F., Avci, B., Wu, X., Liang, Y., & Jha, S. (2021). 
Detecting errors and estimating accuracy on unlabeled data with self-training ensembles. In Proceedings of the 35th conference on neural information processing systems (NeurIPS 2021), virtual event (pp. 14980\u201314992)."},{"key":"6878_CR21","unstructured":"Jiang, Y., Nagarajan, V., Baek, C., & Kolter, J.Z. (2022). Assessing generalization of SGD via disagreement. In Proceedings of the international conference on learning representations (ICLR 2022), virtual event."},{"key":"6878_CR22","unstructured":"Sugiyama, M., Nakajima, S., Kashima, H., Buenau, P., & Kawanabe, M. (2007). Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of the 21st conference on advances in neural information processing systems (NIPS 2007), Vancouver, CA (pp. 1433\u20131440)."},{"key":"6878_CR23","unstructured":"Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J., Lakshminarayanan, B., & Snoek, J. (2019). Can you trust your model\u2019s uncertainty? Evaluating predictive uncertainty under dataset shift. In Proceedings of the 33rd annual conference on neural information processing systems (NeurIPS 2019), Vancouver, CA (pp. 13969\u201313980)."},{"key":"6878_CR24","unstructured":"Zhang, K., Sch\u00f6lkopf, B., Muandet, K., & Wang, Z. (2013). Domain adaptation under target and conditional shift. In Proceedings of the 30th international conference on machine learning (ICML 2013) (pp. 819\u2013827). Atlanta, US."},{"key":"6878_CR25","unstructured":"Lipton, Z.C., Wang, Y., & Smola, A.J. (2018). Detecting and correcting for label shift with black box predictors. In Proceedings of the 35th international conference on machine learning (ICML 2018), Stockholm, SE (pp. 3128\u20133136)."},{"key":"6878_CR26","unstructured":"Tasche, D. (2024). Comments on Friedman\u2019s method for class distribution estimation. 
arXiv:2405.16666 [cs.LG]"},{"issue":"2","key":"6878_CR27","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1007\/s10618-008-0097-y","volume":"17","author":"G Forman","year":"2008","unstructured":"Forman, G. (2008). Quantifying counts and costs via classification. Data Mining and Knowledge Discovery, 17(2), 164\u2013206. https:\/\/doi.org\/10.1007\/s10618-008-0097-y","journal-title":"Data Mining and Knowledge Discovery"},{"key":"6878_CR28","doi-asserted-by":"publisher","unstructured":"Bella, A., Ferri, C., Hern\u00e1ndez-Orallo, J., & Ram\u00edrez-Quintana, M.J. (2010). Quantification via probability estimators. In Proceedings of the 10th IEEE international conference on data mining (ICDM 2010), Sydney, AU (pp. 737\u2013742). https:\/\/doi.org\/10.1109\/icdm.2010.75","DOI":"10.1109\/icdm.2010.75"},{"key":"6878_CR29","unstructured":"Bunse, M. (2022). On multi-class extensions of adjusted classify and count. In Proceedings of the 2nd international workshop on learning to quantify (LQ 2022), Grenoble, FR (pp. 43\u201350)."},{"key":"6878_CR30","doi-asserted-by":"publisher","unstructured":"Moreo, A., Gonz\u00e1lez, P., & del Coz, J.J. (2025). Kernel density estimation for multiclass quantification. Machine Learning 114(4). https:\/\/doi.org\/10.1007\/s10994-024-06726-5","DOI":"10.1007\/s10994-024-06726-5"},{"key":"6878_CR31","unstructured":"Smith, N.A., & Tromble, R.W. (2004). Sampling uniformly from the unit simplex [Technical report], Johns Hopkins University. https:\/\/www.cs.cmu.edu\/~nasmith\/papers\/smith+tromble.tr04.pdf"},{"key":"6878_CR32","unstructured":"Kelly, M., Longjohn, R., & Nottingham, K. The UCI machine learning repository. 
https:\/\/archive.ics.uci.edu"},{"key":"6878_CR33","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825\u20132830.","journal-title":"Journal of Machine Learning Research"},{"key":"6878_CR34","doi-asserted-by":"crossref","unstructured":"Popordanoska, T., Radevski, G., Tuytelaars, T., & Blaschko, M. (2024). Lascal: Label-shift calibration without target labels. In Proceedings of the 38th annual conference on neural information processing systems (NeurIPS 2024), Vancouver, CA (pp. 65386\u201365414).","DOI":"10.52202\/079017-2088"},{"key":"6878_CR35","doi-asserted-by":"publisher","unstructured":"Moreo, A., Esuli, A., & Sebastiani, F. (2021). QuaPy: A Python-based framework for quantification. In Proceedings of the 30th ACM international conference on information and knowledge management (CIKM 2021), Gold Coast, AU (pp. 4534\u20134543). https:\/\/doi.org\/10.1145\/3459637.3482015","DOI":"10.1145\/3459637.3482015"},{"issue":"83","key":"6878_CR36","first-page":"1","volume":"17","author":"S Diamond","year":"2016","unstructured":"Diamond, S., & Boyd, S. (2016). CVXPY: A Python-embedded modeling language for convex optimization. 
Journal of Machine Learning Research, 17(83), 1\u20135.","journal-title":"Journal of Machine Learning Research"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-025-06878-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-025-06878-y","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-025-06878-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,22]],"date-time":"2025-12-22T21:30:03Z","timestamp":1766439003000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-025-06878-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12]]},"references-count":36,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["6878"],"URL":"https:\/\/doi.org\/10.1007\/s10994-025-06878-y","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"type":"print","value":"0885-6125"},{"type":"electronic","value":"1573-0565"}],"subject":[],"published":{"date-parts":[[2025,12]]},"assertion":[{"value":"7 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 July 2025","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 August 2025","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 December 2025","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The code to reproduce all the experiments is available at\n                      \n                      .","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}},{"value":"The authors declare no Conflict of interest.","order":6,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"293"}}