{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T10:41:29Z","timestamp":1778668889226,"version":"3.51.4"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,8,15]],"date-time":"2019-08-15T00:00:00Z","timestamp":1565827200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,8,15]],"date-time":"2019-08-15T00:00:00Z","timestamp":1565827200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000070","name":"U.S. Department of Health & Human Services | NIH | National Institute of Biomedical Imaging and Bioengineering","doi-asserted-by":"publisher","award":["EB017205"],"award-info":[{"award-number":["EB017205"]}],"id":[{"id":"10.13039\/100000070","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Illness severity scores are regularly employed for quality improvement and benchmarking in the intensive care unit, but poor generalization performance, particularly with respect to probability calibration, has limited their use for decision support. These models tend to perform worse in patients at a high risk for mortality. We hypothesized that a sequential modeling approach wherein an initial regression model assigns risk and all patients deemed <jats:italic>high risk<\/jats:italic> then have their risk quantified by a second, high-risk-specific, regression model would result in a model with superior calibration across the risk spectrum. We compared this approach to a logistic regression model and a sophisticated machine learning approach, the gradient boosting machine. The sequential approach did not have an effect on the receiver operating characteristic curve or the precision-recall curve but resulted in improved reliability curves. The gradient boosting machine achieved a small improvement in discrimination performance and was similarly calibrated to the sequential models.<\/jats:p>","DOI":"10.1038\/s41746-019-0153-6","type":"journal-article","created":{"date-parts":[[2019,8,15]],"date-time":"2019-08-15T10:02:31Z","timestamp":1565863351000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["Developing well-calibrated illness severity scores for decision support in the critically ill"],"prefix":"10.1038","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1980-241X","authenticated-orcid":false,"given":"Christopher V.","family":"Cosgriff","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6712-6626","authenticated-orcid":false,"given":"Leo Anthony","family":"Celi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephanie","family":"Ko","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tejas","family":"Sundaresan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7012-2973","authenticated-orcid":false,"given":"Miguel \u00c1ngel","family":"Armengol de la Hoz","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Aaron Russell","family":"Kaufman","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David J.","family":"Stone","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8518-9006","authenticated-orcid":false,"given":"Omar","family":"Badawi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rodrigo Octavio","family":"Deliberato","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,8,15]]},"reference":[{"key":"153_CR1","doi-asserted-by":"publisher","first-page":"518","DOI":"10.1378\/chest.11-0331","volume":"141","author":"MJ Breslow","year":"2012","unstructured":"Breslow, M. J. & Badawi, O. Severity scoring in the critically ill: Part 2: Maximizing value from outcome prediction scoring systems. Chest 141, 518\u2013527 (2012).","journal-title":"Chest"},{"key":"153_CR2","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1378\/chest.11-0330","volume":"141","author":"MJ Breslow","year":"2012","unstructured":"Breslow, M. J. & Badawi, O. Severity scoring in the critically ill: part 1\u2013interpretation and accuracy of outcome prediction scoring systems. Chest 141, 245\u2013252 (2012).","journal-title":"Chest"},{"key":"153_CR3","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.1097\/01.CCM.0000215112.84523.F0","volume":"34","author":"JE Zimmerman","year":"2006","unstructured":"Zimmerman, J. E., Kramer, A. A., McNair, D. S. & Malila, F. M. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today\u2019s critically ill patients. Crit. Care Med. 34, 1297\u20131310 (2006).","journal-title":"Crit. Care Med."},{"key":"153_CR4","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1007\/s00134-005-2763-5","volume":"31","author":"RP Moreno","year":"2005","unstructured":"Moreno, R. P. et al. SAPS 3\u2013From evaluation of the patient to evaluation of the intensive care unit. Part 2: Development of a prognostic model for hospital mortality at ICU admission. Intensive Care Med. 31, 1345\u20131355 (2005).","journal-title":"Intensive Care Med."},{"key":"153_CR5","doi-asserted-by":"publisher","first-page":"707","DOI":"10.1007\/BF01709751","volume":"22","author":"JL Vincent","year":"1996","unstructured":"Vincent, J. L. et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction\/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. 22, 707\u2013710 (1996).","journal-title":"Intensive Care Med."},{"key":"153_CR6","doi-asserted-by":"publisher","first-page":"2478","DOI":"10.1001\/jama.1993.03510200084037","volume":"270","author":"S Lemeshow","year":"1993","unstructured":"Lemeshow, S. et al. Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients. JAMA 270, 2478\u20132486 (1993).","journal-title":"JAMA"},{"key":"153_CR7","doi-asserted-by":"publisher","first-page":"802","DOI":"10.1378\/chest.115.3.802","volume":"115","author":"JV Pappachan","year":"1999","unstructured":"Pappachan, J. V., Millar, B., Bennett, E. D. & Smith, G. B. Comparison of outcome from intensive care admission after adjustment for case mix by the APACHE III prognostic system. Chest 115, 802\u2013810 (1999).","journal-title":"Chest"},{"key":"153_CR8","doi-asserted-by":"publisher","first-page":"1392","DOI":"10.1097\/00003246-199409000-00007","volume":"22","author":"KM Rowan","year":"1994","unstructured":"Rowan, K. M. et al. Intensive Care Society\u2019s Acute Physiology and Chronic Health Evaluation (APACHE II) study in Britain and Ireland: a prospective, multicenter, cohort study comparing two methods for predicting outcome for adult intensive care patients. Crit. Care Med. 22, 1392\u20131401 (1994).","journal-title":"Crit. Care Med."},{"key":"153_CR9","doi-asserted-by":"publisher","first-page":"977","DOI":"10.1136\/bmj.307.6910.977","volume":"307","author":"KM Rowan","year":"1993","unstructured":"Rowan, K. M. et al. Intensive Care Society\u2019s APACHE II study in Britain and Ireland\u2013II: Outcome comparisons of intensive care units after adjustment for case mix by the American APACHE II method. BMJ (Clin. Res. Ed.) 307, 977\u2013981 (1993).","journal-title":"BMJ (Clin. Res. Ed.)"},{"key":"153_CR10","doi-asserted-by":"publisher","first-page":"1317","DOI":"10.1001\/jama.2017.18391","volume":"319","author":"AL Beam","year":"2018","unstructured":"Beam, A. L. & Kohane, I. S. Big data and machine learning in health care. JAMA 319, 1317\u20131318 (2018).","journal-title":"JAMA"},{"key":"153_CR11","doi-asserted-by":"publisher","first-page":"162","DOI":"10.1177\/0272989X14547233","volume":"35","author":"B Van Calster","year":"2015","unstructured":"Van Calster, B. & Vickers, A. J. Calibration of risk prediction models: impact on decision-analytic performance. Med. Decis. Making 35, 162\u2013169 (2015).","journal-title":"Med. Decis. Making"},{"key":"153_CR12","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1097\/CCM.0000000000000694","volume":"43","author":"AA Kramer","year":"2015","unstructured":"Kramer, A. A., Higgins, T. L. & Zimmerman, J. E. Comparing observed and predicted mortality among ICUs using different prognostic systems: why do performance assessments differ? Crit. Care Med. 43, 261\u2013269 (2015).","journal-title":"Crit. Care Med."},{"key":"153_CR13","doi-asserted-by":"publisher","first-page":"544","DOI":"10.1097\/CCM.0b013e3182a66a49","volume":"42","author":"AA Kramer","year":"2014","unstructured":"Kramer, A. A., Higgins, T. L. & Zimmerman, J. E. Comparison of the Mortality Probability Admission Model III, National Quality Forum, and Acute Physiology and Chronic Health Evaluation IV hospital mortality models: implications for national benchmarking. Crit. Care Med. 42, 544\u2013553 (2014).","journal-title":"Crit. Care Med."},{"key":"153_CR14","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1046\/j.1365-2044.2002.02362.x","volume":"57","author":"DH Beck","year":"2002","unstructured":"Beck, D. H., Smith, G. B. & Taylor, B. L. The impact of low-risk intensive care unit admissions on mortality probabilities by SAPS II, APACHE II and APACHE III. Anaesthesia 57, 21\u201326 (2002).","journal-title":"Anaesthesia"},{"key":"153_CR15","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1016\/j.jclinepi.2019.02.004","volume":"110","author":"E Christodoulou","year":"2019","unstructured":"Christodoulou, E. et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J. Clin. Epidemiol. 110, 12\u201322 (2019).","journal-title":"J. Clin. Epidemiol."},{"key":"153_CR16","doi-asserted-by":"publisher","first-page":"2517","DOI":"10.1097\/01.CCM.0000240233.01711.D9","volume":"34","author":"JE Zimmerman","year":"2006","unstructured":"Zimmerman, J. E., Kramer, A. A., McNair, D. S., Malila, F. M. & Shaffer, V. L. Intensive care unit length of stay: benchmarking based on Acute Physiology and Chronic Health Evaluation (APACHE) IV. Crit. Care Med. 34, 2517\u20132529 (2006).","journal-title":"Crit. Care Med."},{"key":"153_CR17","first-page":"625","volume":"2017","author":"SE Davis","year":"2018","unstructured":"Davis, S. E., Lasko, T. A., Chen, G. & Matheny, M. E. Calibration drift among regression and machine learning models for hospital mortality. AMIA. Annu. Symp. Proc. 2017, 625\u2013634 (2018).","journal-title":"AMIA. Annu. Symp. Proc."},{"key":"153_CR18","first-page":"994","volume":"2017","author":"AEW Johnson","year":"2018","unstructured":"Johnson, A. E. W. & Mark, R. G. Real-time mortality prediction in the Intensive Care Unit. AMIA. Annu. Symp. Proc. 2017, 994\u20131003 (2018).","journal-title":"AMIA. Annu. Symp. Proc."},{"key":"153_CR19","doi-asserted-by":"publisher","first-page":"1070","DOI":"10.1097\/CCM.0000000000003123","volume":"46","author":"JL Koyner","year":"2018","unstructured":"Koyner, J. L., Carey, K. A., Edelson, D. P. & Churpek, M. M. The development of a machine learning inpatient acute kidney injury prediction model. Crit. Care Med. 46, 1070\u20131077 (2018).","journal-title":"Crit. Care Med."},{"key":"153_CR20","doi-asserted-by":"publisher","first-page":"846","DOI":"10.1513\/AnnalsATS.201710-787OC","volume":"15","author":"JC Rojas","year":"2018","unstructured":"Rojas, J. C. et al. Predicting intensive care unit readmission with machine learning using electronic health record data. Ann. Am. Thorac. Soc. 15, 846\u2013853 (2018).","journal-title":"Ann. Am. Thorac. Soc."},{"key":"153_CR21","doi-asserted-by":"publisher","first-page":"829","DOI":"10.1038\/nbt.4233","volume":"36","author":"M Wainberg","year":"2018","unstructured":"Wainberg, M., Merico, D., Delong, A. & Frey, B. J. Deep learning in biomedicine. Nat. Biotechnol. 36, 829\u2013838 (2018).","journal-title":"Nat. Biotechnol."},{"key":"153_CR22","doi-asserted-by":"publisher","first-page":"1099","DOI":"10.1001\/jama.2018.11103","volume":"320","author":"CD Naylor","year":"2018","unstructured":"Naylor, C. D. On the prospects for a (deep) learning health care system. JAMA 320, 1099\u20131100 (2018).","journal-title":"JAMA"},{"key":"153_CR23","doi-asserted-by":"publisher","first-page":"1101","DOI":"10.1001\/jama.2018.11100","volume":"320","author":"G Hinton","year":"2018","unstructured":"Hinton, G. Deep learning-a technology with the potential to transform health care. JAMA 320, 1101\u20131102 (2018).","journal-title":"JAMA"},{"key":"153_CR24","doi-asserted-by":"publisher","first-page":"55","DOI":"10.7326\/M14-0697","volume":"162","author":"GS Collins","year":"2015","unstructured":"Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. M. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): The TRIPOD Statement. Ann. Intern. Med. 162, 55\u201363 (2015).","journal-title":"Ann. Intern. Med."},{"key":"153_CR25","doi-asserted-by":"crossref","unstructured":"Cosgriff, C. V. et al. Developing well calibrated illness severity scores for decision support in the critically ill. https:\/\/github.com\/cosgriffc\/seq-severityscore (2019).","DOI":"10.1038\/s41746-019-0153-6"},{"key":"153_CR26","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2018.178","volume":"5","author":"TJ Pollard","year":"2018","unstructured":"Pollard, T. J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci. Data 5, 180178 (2018).","journal-title":"Sci. Data"},{"key":"153_CR27","doi-asserted-by":"crossref","unstructured":"Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785\u2013794 (ACM, San Francisco, CA, 2016).","DOI":"10.1145\/2939672.2939785"},{"key":"153_CR28","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825\u20132830 (2011).","journal-title":"J. Mach. Learn. Res."},{"key":"153_CR29","unstructured":"Niculescu-Mizil, A. & Caruana, R. Obtaining calibrated probabilities from boosting. In Proc. Twenty-First Conference on Uncertainty in Artificial Intelligence 413\u2013420 (AUAI Press, Edinburgh, 2005)."},{"key":"153_CR30","doi-asserted-by":"publisher","first-page":"1377","DOI":"10.1001\/jama.2017.12126","volume":"318","author":"AC Alba","year":"2017","unstructured":"Alba, A. C. et al. Discrimination and calibration of clinical prediction models: users\u2019 guides to the medical literature. JAMA 318, 1377\u20131384 (2017).","journal-title":"JAMA"}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0153-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0153-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0153-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,17]],"date-time":"2022-12-17T18:31:50Z","timestamp":1671301910000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0153-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,15]]},"references-count":30,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,12]]}},"alternative-id":["153"],"URL":"https:\/\/doi.org\/10.1038\/s41746-019-0153-6","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,15]]},"assertion":[{"value":"11 December 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 July 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 August 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"O.B. is employed by Philips Healthcare. The other authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"76"}}