{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,27]],"date-time":"2025-08-27T16:03:44Z","timestamp":1756310624689},"reference-count":19,"publisher":"Oxford University Press (OUP)","issue":"18","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Classification algorithms for high-dimensional biological data like gene expression profiles or metabolomic fingerprints are typically evaluated by the number of misclassifications across a test dataset. However, to judge the classification of a single case in the context of clinical diagnosis, we need to assess the uncertainties associated with that individual case rather than the average accuracy across many cases. Reliability of individual classifications can be expressed in terms of class probabilities. While classification algorithms are a well-developed area of research, the estimation of class probabilities is considerably less progressed in biology, with only a few classification algorithms that provide estimated class probabilities.<\/jats:p>\n               <jats:p>Results: We compared several probability estimators in the context of classification of metabolomics profiles. Evaluation criteria included sparseness biases, calibration of the estimator, the variance of the estimator and its performance in identifying highly reliable classifications. We observed that several of them display artifacts that compromise their use in practice. Classification probabilities based on a combination of local cross-validation error rates and monotone regression prove superior in metabolomic profiling.<\/jats:p>\n               <jats:p>Availability: The source code written in R is freely available at http:\/\/compdiag.uni-regensburg.de\/software\/probEstimation.shtml.<\/jats:p>\n               <jats:p>Contact: \u00a0inka.appel@klinik.uni-regensburg.de<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr434","type":"journal-article","created":{"date-parts":[[2011,7,23]],"date-time":"2011-07-23T04:30:30Z","timestamp":1311395430000},"page":"2563-2570","source":"Crossref","is-referenced-by-count":5,"title":["Estimating classification probabilities in high-dimensional diagnostic studies"],"prefix":"10.1093","volume":"27","author":[{"given":"Inka J.","family":"Appel","sequence":"first","affiliation":[{"name":"Institute of Functional Genomics, University of Regensburg, 93053 Regensburg, Germany"}]},{"given":"Wolfram","family":"Gronwald","sequence":"additional","affiliation":[{"name":"Institute of Functional Genomics, University of Regensburg, 93053 Regensburg, Germany"}]},{"given":"Rainer","family":"Spang","sequence":"additional","affiliation":[{"name":"Institute of Functional Genomics, University of Regensburg, 93053 Regensburg, Germany"}]}],"member":"286","published-online":{"date-parts":[[2011,7,22]]},"reference":[{"key":"2023012512001220800_B1","doi-asserted-by":"crossref","first-page":"6562","DOI":"10.1073\/pnas.102102699","article-title":"Selection bias in gene extraction on the basis of microarray gene-expression data","volume":"99","author":"Ambroise","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512001220800_B2","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1214\/aoms\/1177728423","article-title":"An empirical distribution function for sampling with incomplete information","volume":"26","author":"Ayer","year":"1955","journal-title":"Ann. Math. Stat."},{"key":"2023012512001220800_B3","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1080\/01621459.1982.10477856","article-title":"The well-calibrated Bayesian","volume":"77","author":"Dawid","year":"1982","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012512001220800_B4","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012512001220800_B5","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1158\/1078-0432.CCR-09-1815","article-title":"DNA microarrays are predictive of cancer prognosis: a re-evaluation","volume":"16","author":"Fan","year":"2010","journal-title":"Clin. Cancer Res."},{"key":"2023012512001220800_B6","doi-asserted-by":"crossref","first-page":"1244","DOI":"10.1038\/ki.2011.30","article-title":"Detection of autosomal dominant polycystic kidney disease by NMR spectroscopic fingerprinting of urine","volume":"79","author":"Gronwald","year":"2011","journal-title":"Kidney Int."},{"issue":"Suppl. 1","key":"2023012512001220800_B7","doi-asserted-by":"crossref","first-page":"i101","DOI":"10.1093\/bioinformatics\/bth927","article-title":"Predicting gene regulation by sigma factors in Bacillus subtilis from genome-wide data","volume":"20","author":"de Hoon","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012512001220800_B8","doi-asserted-by":"crossref","first-page":"827","DOI":"10.1038\/nbt.1665","article-title":"The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models","volume":"28","author":"MAQC Consortium","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012512001220800_B9","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1038\/sj.bjc.6603673","article-title":"Interpretation of microarray data in cancer","volume":"96","author":"Michiels","year":"2007","journal-title":"Br. J. Cancer"},{"key":"2023012512001220800_B10","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1145\/1102351.1102430","article-title":"Predicting good probabilities with supervised learning","author":"Niculescu-Mizil","year":"2005","journal-title":"ICML'05: Proceedings of the 22nd International Conference on Machine Learning."},{"key":"2023012512001220800_B11","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1186\/1471-2105-8-234","article-title":"Improved classification accuracy in 1- and 2-dimensional NMR metabolomics data using the variance stabilising generalised logarithm transformation","volume":"8","author":"Parsons","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012512001220800_B12","first-page":"61","article-title":"Advances in large margin classifiers","volume-title":"Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods.","author":"Platt","year":"2000"},{"key":"2023012512001220800_B13","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1038\/nrc2173","article-title":"Taking gene-expression profiling to the clinic: when will molecular signatures become relevant to patient care?","volume":"7","author":"Sotiriou","year":"2007","journal-title":"Nat. Rev. Cancer"},{"key":"2023012512001220800_B14","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512001220800_B15","doi-asserted-by":"crossref","first-page":"3755","DOI":"10.1093\/bioinformatics\/bti429","article-title":"A protocol for building and evaluating predictors of disease state based on microarray data","volume":"21","author":"Wessels","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512001220800_B16","doi-asserted-by":"crossref","first-page":"11462","DOI":"10.1073\/pnas.201162998","article-title":"Predicting the clinical status of human breast cancer by using gene expression profiles","volume":"98","author":"West","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512001220800_B17","doi-asserted-by":"crossref","first-page":"9991","DOI":"10.1073\/pnas.1732008100","article-title":"A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma","volume":"100","author":"Wright","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512001220800_B18","first-page":"694","article-title":"Transforming classifier scores into accurate multiclass probability estimates","volume-title":"SIGKDD'02","author":"Zadrozny","year":"2002"},{"key":"2023012512001220800_B19","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1186\/1471-2105-10-53","article-title":"Outcome prediction based on microarray analysis: a critical perspective on methods","volume":"10","author":"Zervakis","year":"2009","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/18\/2563\/48865329\/bioinformatics_27_18_2563.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/18\/2563\/48865329\/bioinformatics_27_18_2563.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T13:43:21Z","timestamp":1674654201000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/18\/2563\/181494"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,7,22]]},"references-count":19,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2011,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr434","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,9,15]]},"published":{"date-parts":[[2011,7,22]]}}}