{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T00:52:36Z","timestamp":1775868756430,"version":"3.50.1"},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,1,23]],"date-time":"2020-01-23T00:00:00Z","timestamp":1579737600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,1,23]],"date-time":"2020-01-23T00:00:00Z","timestamp":1579737600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The ability to identify patients who are likely to have an adverse outcome is an essential component of good clinical care. Therefore, predictive risk stratification models play an important role in clinical decision making. Determining whether a given predictive model is suitable for clinical use usually involves evaluating the model\u2019s performance on large patient datasets using standard statistical measures of success (e.g., accuracy, discriminatory ability). However, as these metrics correspond to averages over patients who have a range of different characteristics, it is difficult to discern whether an individual prediction on a given patient should be trusted using these measures alone. In this paper, we introduce a new method for identifying patient subgroups where a predictive model is expected to be poor, thereby highlighting when a given prediction is misleading and should not be trusted. The resulting \u201cunreliability score\u201d can be computed for any clinical risk model and is suitable in the setting of large class imbalance, a situation often encountered in healthcare settings. Using data from more than 40,000 patients in the Global Registry of Acute Coronary Events (GRACE), we demonstrate that patients with high unreliability scores form a subgroup in which the predictive model has both decreased accuracy and decreased discriminatory ability.<\/jats:p>","DOI":"10.1038\/s41746-019-0209-7","type":"journal-article","created":{"date-parts":[[2020,1,23]],"date-time":"2020-01-23T11:02:58Z","timestamp":1579777378000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Identifying unreliable predictions in clinical risk models"],"prefix":"10.1038","volume":"3","author":[{"given":"Paul D.","family":"Myers","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0792-070X","authenticated-orcid":false,"given":"Kenney","family":"Ng","sequence":"additional","affiliation":[]},{"given":"Kristen","family":"Severson","sequence":"additional","affiliation":[]},{"given":"Uri","family":"Kartoun","sequence":"additional","affiliation":[]},{"given":"Wangzhi","family":"Dai","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Frederick A.","family":"Anderson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3415-242X","authenticated-orcid":false,"given":"Collin M.","family":"Stultz","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,1,23]]},"reference":[{"key":"209_CR1","doi-asserted-by":"publisher","first-page":"1276","DOI":"10.1016\/j.ahj.2005.02.037","volume":"150","author":"ED Michos","year":"2005","unstructured":"Michos, E. D. et al. Women with a low Framingham risk score and a family history of premature coronary heart disease have a high prevalence of subclinical coronary atherosclerosis. Am. Heart J. 150, 1276\u20131281 (2005).","journal-title":"Am. Heart J."},{"key":"209_CR2","doi-asserted-by":"publisher","first-page":"385","DOI":"10.3233\/IDA-2009-0371","volume":"13","author":"Z Bosni\u0107","year":"2009","unstructured":"Bosni\u0107, Z. & Kononenko, I. An overview of advances in reliability estimation of individual predictions in machine learning. Intell. Data Anal. 13, 385\u2013401 (2009).","journal-title":"Intell. Data Anal."},{"key":"209_CR3","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1016\/S0893-6080(99)00080-5","volume":"13","author":"I Rivals","year":"2000","unstructured":"Rivals, I. & Personnaz, L. Construction of confidence intervals for neural networks based on least squares estimation. Neural Netw. 13, 463\u2013484 (2000).","journal-title":"Neural Netw."},{"key":"209_CR4","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1109\/72.478409","volume":"7","author":"G Chryssolouris","year":"1996","unstructured":"Chryssolouris, G., Lee, M. & Ramsey, A. Confidence interval prediction for neural network models. IEEE Trans. Neural Netw. 7, 229\u2013232 (1996).","journal-title":"IEEE Trans. Neural Netw."},{"key":"209_CR5","doi-asserted-by":"crossref","unstructured":"Dybowski, R. & Gant, V. Clinical Applications Of Artificial Neural Networks (Cambridge University Press, 2001).","DOI":"10.1017\/CBO9780511543494"},{"key":"209_CR6","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1111\/rssb.12026","volume":"76","author":"CH Zhang","year":"2014","unstructured":"Zhang, C. H. & Zhang, S. S. Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc. B. 76, 217\u2013242 (2014).","journal-title":"J. R. Stat. Soc. B"},{"key":"209_CR7","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1016\/0098-1354(92)80035-8","volume":"16","author":"JA Leonard","year":"1992","unstructured":"Leonard, J. A., Kramer, M. A. & Ungar, L. H. A neural network architecture that computes its own reliability. Comput. Chem. Eng. 16, 819\u2013835 (1992).","journal-title":"Comput. Chem. Eng."},{"key":"209_CR8","doi-asserted-by":"publisher","first-page":"842","DOI":"10.1016\/j.neunet.2011.05.008","volume":"24","author":"H Papadopoulos","year":"2011","unstructured":"Papadopoulos, H. & Haralambous, H. Reliable prediction intervals with regression neural networks. Neural Netw. 24, 842\u2013851 (2011).","journal-title":"Neural Netw."},{"key":"209_CR9","doi-asserted-by":"crossref","unstructured":"Kukar, M. & Kononenko, I. Reliable Classifications with Machine Learning. Machine Learning: ECML 2002. ECML 2002. Lecture Notes in Computer Science Vol. 2430 (eds Mannila, H. et al.) 219\u2013231 (Springer Verlag, 2002).","DOI":"10.1007\/3-540-36755-1_19"},{"key":"209_CR10","unstructured":"Jiang, H., Bachas, K., Guan, M. & Gupta, M. To Trust Or Not To Trust A Classifier. Advances in Neural Information Processing Systems 31 (NIPS 2018) (eds Bengio, S. et al.) 5541\u20135552 (Curran Associates, Inc., 2018)."},{"key":"209_CR11","doi-asserted-by":"publisher","first-page":"2345","DOI":"10.1001\/archinte.163.19.2345","volume":"163","author":"CB Granger","year":"2003","unstructured":"Granger, C. B. et al. Predictors of hospital mortality in the global registry of acute coronary events. Arch. Intern. Med. 163, 2345\u20132353 (2003).","journal-title":"Arch. Intern. Med."},{"key":"209_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1175\/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2","volume":"78","author":"GW Brier","year":"1950","unstructured":"Brier, G. W. Verification of forecasts expressed in terms of probability. Monthly Weather Rev. 78, 1\u20133 (1950).","journal-title":"Monthly Weather Rev."},{"key":"209_CR13","first-page":"2813","volume":"13","author":"J Hernandez-Orallo","year":"2012","unstructured":"Hernandez-Orallo, J., Flach, P. & Ferri, C. A unified view of performance metrics: translating threshold choice into expected classification loss. J. Mach. Learn. Res. 13, 2813\u20132869 (2012).","journal-title":"J. Mach. Learn. Res."},{"key":"209_CR14","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1097\/EDE.0b013e3181c30fb2","volume":"21","author":"EW Steyerberg","year":"2010","unstructured":"Steyerberg, E. W. et al. Assessing the performance of prediction models a framework for traditional and novel measures. Epidemiology 21, 128\u2013138 (2010).","journal-title":"Epidemiology"},{"key":"209_CR15","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0091249","volume":"9","author":"YC Wu","year":"2014","unstructured":"Wu, Y. C. & Lee, W. C. Alternative performance measures for prediction models. PLoS ONE 9, e91249 (2014).","journal-title":"PLoS ONE"},{"key":"209_CR16","doi-asserted-by":"publisher","first-page":"971","DOI":"10.1093\/aje\/kwq223","volume":"172","author":"Y Vergouwe","year":"2010","unstructured":"Vergouwe, Y., Moons, K. G. M. & Steyerberg, E. W. External validity of risk models: Use of benchmark values to disentangle a case-mix effect from incorrect coefficients. Am. J. Epidemiol. 172, 971\u2013980 (2010).","journal-title":"Am. J. Epidemiol."},{"key":"209_CR17","doi-asserted-by":"publisher","first-page":"4136","DOI":"10.1002\/sim.6997","volume":"35","author":"D van Klaveren","year":"2016","unstructured":"van Klaveren, D., G\u00f6nen, M., Steyerberg, E. W. & Vergouwe, Y. A new concordance measure for risk prediction models in external validation settings. Stat. Med. 35, 4136\u20134152 (2016).","journal-title":"Stat. Med."},{"key":"209_CR18","doi-asserted-by":"publisher","first-page":"1879","DOI":"10.1056\/NEJM200106213442501","volume":"344","author":"CP Cannon","year":"2001","unstructured":"Cannon, C. P. et al. Comparison of early invasive and conservative strategies in patients with unstable coronary syndromes treated with the glycoprotein IIb\/IIIa inhibitor tirofiban. N. Engl. J. Med. 344, 1879\u20131887 (2001).","journal-title":"N. Engl. J. Med."},{"key":"209_CR19","doi-asserted-by":"publisher","first-page":"835","DOI":"10.1001\/jama.284.7.835","volume":"284","author":"EM Antman","year":"2000","unstructured":"Antman, E. M. et al. The TIMI risk score for unstable angina\/non-ST elevation MI: A method for prognostication and therapeutic decision making. JAMA 284, 835\u2013842 (2000).","journal-title":"JAMA"},{"key":"209_CR20","doi-asserted-by":"publisher","first-page":"1356","DOI":"10.1001\/jama.286.11.1356","volume":"286","author":"DA Morrow","year":"2001","unstructured":"Morrow, D. A. et al. Application of the TIMI risk score for ST-Elevation MI in the National Registry of Myocardial Infarction 3. JAMA 286, 1356\u20131359 (2001).","journal-title":"JAMA"},{"key":"209_CR21","doi-asserted-by":"crossref","unstructured":"GRACE Investigators. Rationale and design of the GRACE (Global Registry of Acute Coronary Events) Project: a multinational registry of patients hospitalized with acute coronary syndromes. Am. Heart J. 141, 190\u2013199 (2001).","DOI":"10.1067\/mhj.2001.112404"},{"key":"209_CR22","doi-asserted-by":"publisher","first-page":"2727","DOI":"10.1001\/jama.291.22.2727","volume":"291","author":"KA Eagle","year":"2004","unstructured":"Eagle, K. A. et al. A validated prediction model for all forms of acute coronary syndrome: estimating the risk of 6-month postdischarge death in an international registry. JAMA 291, 2727\u20132733 (2004).","journal-title":"JAMA"},{"key":"209_CR23","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.1136\/bmj.38985.646481.55","volume":"333","author":"KA Fox","year":"2006","unstructured":"Fox, K. A. et al. Prediction of risk of death and myocardial infarction in the six months after presentation with acute coronary syndrome: prospective multinational observational study (GRACE). BMJ 333, 1091 (2006).","journal-title":"BMJ"},{"key":"209_CR24","doi-asserted-by":"crossref","unstructured":"Fox, K. A. A. et al. Should patients with acute coronary disease be stratified for management according to their risk? Derivation, external validation and outcomes using the updated GRACE risk score. BMJ Open 4, e004425 (2014).","DOI":"10.1136\/bmjopen-2013-004425"}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0209-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0209-7","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0209-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T01:49:23Z","timestamp":1670377763000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0209-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,23]]},"references-count":24,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["209"],"URL":"https:\/\/doi.org\/10.1038\/s41746-019-0209-7","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,23]]},"assertion":[{"value":"30 July 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 December 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 January 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"8"}}