{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,9]],"date-time":"2026-07-09T14:59:33Z","timestamp":1783609173031,"version":"3.55.0"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2018,11,29]],"date-time":"2018-11-29T00:00:00Z","timestamp":1543449600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Deichmann foundation"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Clinical decision support systems have been applied in numerous fields, ranging from cancer survival toward drug resistance prediction. Nevertheless, clinical decision support systems typically have a caveat: many of them are perceived as black-boxes by non-experts and, unfortunately, the obtained scores cannot usually be interpreted as class probability estimates. In probability-focused medical applications, it is not sufficient to perform well with regards to discrimination and, consequently, various calibration methods have been developed to enable probabilistic interpretation. The aims of this study were (i) to develop a tool for fast and comparative analysis of different calibration methods, (ii) to demonstrate their limitations for the use on clinical data and (iii) to introduce our novel method GUESS.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We compared the performances of two different state-of-the-art calibration methods, namely histogram binning and Bayesian Binning in Quantiles, as well as our novel method GUESS on both, simulated and real-world datasets. GUESS demonstrated calibration performance comparable to the state-of-the-art methods and always retained accurate class discrimination. GUESS showed superior calibration performance in small datasets and therefore may be an optimal calibration method for typical clinical datasets. Moreover, we provide a framework (CalibratR) for R, which can be used to identify the most suitable calibration method for novel datasets in a timely and efficient manner. Using calibrated probability estimates instead of original classifier scores will contribute to the acceptance and dissemination of machine learning based classification models in cost-sensitive applications, such as clinical research.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>GUESS as part of CalibratR can be downloaded at CRAN.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty984","type":"journal-article","created":{"date-parts":[[2018,11,28]],"date-time":"2018-11-28T12:10:08Z","timestamp":1543407008000},"page":"2458-2465","source":"Crossref","is-referenced-by-count":44,"title":["GUESS: projecting machine learning scores to well-calibrated probability estimates for clinical decision-making"],"prefix":"10.1093","volume":"35","author":[{"given":"Johanna","family":"Schwarz","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dominik","family":"Heider","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2018,11,29]]},"reference":[{"key":"2023062712310913700_bty984-B1","doi-asserted-by":"crossref","first-page":"e151.","DOI":"10.1093\/nar\/gkx642","article-title":"De novo pathway-based biomarker identification","volume":"45","author":"Alcaraz","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023062712310913700_bty984-B2","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1186\/1471-2164-9-184","article-title":"Linking cytoscape and the corynebacterial reference database coryneregnet","volume":"9","author":"Baumbach","year":"2008","journal-title":"BMC Genomics"},{"key":"2023062712310913700_bty984-B3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1089\/sysm.2017.28999.jba","article-title":"The end of medicine as we know it","volume":"1","author":"Baumbach","year":"2018","journal-title":"Syst. Med"},{"key":"2023062712310913700_bty984-B4","doi-asserted-by":"crossref","first-page":"1296","DOI":"10.1055\/s-0042-119529","article-title":"The GALAD scoring algorithm based on AFP, AFP-l3, and DCP significantly improves detection of BCLC early stage hepatocellular carcinoma","volume":"54","author":"Best","year":"2016","journal-title":"Z. Gastroenterol"},{"key":"2023062712310913700_bty984-B5","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1016\/j.canlet.2016.05.033","article-title":"Big data and machine learning in radiation oncology: state of the art and future prospects","volume":"382","author":"Bibault","year":"2016","journal-title":"Cancer Lett"},{"key":"2023062712310913700_bty984-B6","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1001\/jama.2017.7797","article-title":"Unintended consequences of machine learning in medicine","volume":"318","author":"Cabitza","year":"2017","journal-title":"JAMA"},{"key":"2023062712310913700_bty984-B7","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1186\/s12874-015-0015-0","article-title":"Predictive modeling in pediatric traumatic brain injury using machine learning","volume":"15","author":"Chong","year":"2015","journal-title":"BMC Med. Res. Methodol"},{"key":"2023062712310913700_bty984-B8","doi-asserted-by":"crossref","first-page":"1559","DOI":"10.1038\/s41591-018-0177-5","article-title":"Classification and mutation prediction from non\u2013small cell lung cancer histopathology images using deep learning","volume":"24","author":"Coudray","year":"2018","journal-title":"Nat. Med"},{"key":"2023062712310913700_bty984-B9","first-page":"41","author":"Czerniak","year":"2003"},{"key":"2023062712310913700_bty984-B10","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1016\/j.gie.2014.02.1028","article-title":"Endoscopic management is the treatment of choice for bile leaks after liver resection","volume":"80","author":"Dech\u00eane","year":"2014","journal-title":"Gastrointest. Endosc"},{"key":"2023062712310913700_bty984-B11","first-page":"1","article-title":"Statistical comparisons of classifiers over multiple data sets","volume":"7","author":"Demsar","year":"2006","journal-title":"J. Mach. Learn. Res"},{"key":"2023062712310913700_bty984-B12","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1186\/1756-0381-4-26","article-title":"Improved Bevirimat resistance prediction by combination of structural and sequence-based classifiers","volume":"4","author":"Dybowski","year":"2011","journal-title":"BioData Min"},{"key":"2023062712310913700_bty984-B13","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1080\/01621459.1937.10503522","article-title":"The use of ranks to avoid the assumption of normality implicit in the analysis of variance","volume":"32","author":"Friedman","year":"1937","journal-title":"J. Am. Stat. Assoc"},{"key":"2023062712310913700_bty984-B14","first-page":"104","author":"Haberman","year":"1976"},{"key":"2023062712310913700_bty984-B15","first-page":"65","article-title":"A simple sequentially rejective multiple test procedure","volume":"6","author":"Holm","year":"1979","journal-title":"Scand. J. Stat"},{"key":"2023062712310913700_bty984-B16","doi-asserted-by":"crossref","first-page":"790","DOI":"10.1038\/nrmicro1477","article-title":"Bioinformatics-assisted anti-HIV therapy","volume":"4","author":"Lengauer","year":"2006","journal-title":"Nat. Rev. Microbiol"},{"key":"2023062712310913700_bty984-B17","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1038\/nrg3920","article-title":"Machine learning applications in genetics and genomics","volume":"16","author":"Libbrecht","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023062712310913700_bty984-B18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1515\/jib-2014-236","article-title":"Classification of breast cancer subtypes by combining gene expression and DNA methylation data","volume":"11","author":"List","year":"2014","journal-title":"J. Integr. Bioinform"},{"key":"2023062712310913700_bty984-B19","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.media.2016.06.037","article-title":"Image analysis and machine learning in digital pathology: challenges and opportunities","volume":"33","author":"Madabhushi","year":"2016","journal-title":"Med. Image Anal"},{"key":"2023062712310913700_bty984-B20","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1007\/s10115-017-1133-2","article-title":"Binary classifier calibration using an ensemble of piecewise linear regression models","volume":"54","author":"Naeini","year":"2018","journal-title":"Knowl. Inf. Syst"},{"key":"2023062712310913700_bty984-B21","first-page":"2901","article-title":"Obtaining well calibrated probabilities using Bayesian binning","volume":"2015","author":"Naeini","year":"2015","journal-title":"Proc. Conf. AAAI Artif. Intell"},{"key":"2023062712310913700_bty984-B22","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1056\/NEJMp1606181","article-title":"Predicting the future - big data, machine learning, and clinical medicine","volume":"375","author":"Obermeyer","year":"2016","journal-title":"New Engl. J. Med"},{"key":"2023062712310913700_bty984-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1542\/peds.142.1MA2.116","article-title":"A machine-learning approach to predicting need for hospitalization for pediatric asthma exacerbation at the time of emergency department triage","volume":"142","author":"Patel","year":"2018","journal-title":"Pediatrics"},{"key":"2023062712310913700_bty984-B24","doi-asserted-by":"crossref","first-page":"3010","DOI":"10.1002\/hbm.22121","article-title":"Baseline activity predicts working memory load of preceding task condition","volume":"34","author":"Pyka","year":"2013","journal-title":"Hum. Brain Mapp"},{"key":"2023062712310913700_bty984-B25","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1089\/sysm.2018.0002","article-title":"Data science for molecular diagnostics applications: from academia to clinic to industry","volume":"1","author":"Riemenschneider","year":"2018","journal-title":"Syst. Med"},{"key":"2023062712310913700_bty984-B26","doi-asserted-by":"crossref","first-page":"77.","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023062712310913700_bty984-B27","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nm0102-68","article-title":"Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning","volume":"8","author":"Shipp","year":"2002","journal-title":"Nat. Med"},{"key":"2023062712310913700_bty984-B28","doi-asserted-by":"crossref","first-page":"3940","DOI":"10.1093\/bioinformatics\/bti623","article-title":"ROCR: visualizing classifier performance in R","volume":"21","author":"Sing","year":"2005","journal-title":"Bioinformatics"},{"key":"2023062712310913700_bty984-B29","doi-asserted-by":"crossref","first-page":"e101444.","DOI":"10.1371\/journal.pone.0101444","article-title":"Non-invasive separation of alcoholic and non-alcoholic liver disease with predictive modeling","volume":"9","author":"Sowa","year":"2014","journal-title":"PLoS One"},{"key":"2023062712310913700_bty984-B30","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1097\/EDE.0b013e3181c30fb2","article-title":"Assessing the performance of prediction models: a framework for traditional and novel measures","volume":"21","author":"Steyerberg","year":"2010","journal-title":"Epidemiology"},{"key":"2023062712310913700_bty984-B31","first-page":"695","author":"Wallace","year":"2012"},{"key":"2023062712310913700_bty984-B32","doi-asserted-by":"crossref","first-page":"9193","DOI":"10.1073\/pnas.87.23.9193","article-title":"Multisurface method of pattern separation for medical diagnosis applied to breast cytology","volume":"87","author":"Wolberg","year":"1990","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062712310913700_bty984-B33","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1007\/s10549-016-4035-1","article-title":"Using machine learning to parse breast pathology reports","volume":"161","author":"Yala","year":"2017","journal-title":"Breast Cancer Res. Treat"},{"key":"2023062712310913700_bty984-B34","first-page":"609","author":"Zadrozny","year":"2001"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/14\/2458\/50720522\/bioinformatics_35_14_2458.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/14\/2458\/50720522\/bioinformatics_35_14_2458.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T12:31:50Z","timestamp":1687869110000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/14\/2458\/5216311"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Philipps-University of Marburg, Marburg, Germany"}],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2018,11,29]]},"references-count":34,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2019,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty984","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,7]]},"published":{"date-parts":[[2018,11,29]]}}}