{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:00:44Z","timestamp":1760148044386,"version":"build-2065373602"},"reference-count":48,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T00:00:00Z","timestamp":1679875200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>(1) One in four hospital readmissions is potentially preventable. Machine learning (ML) models have been developed to predict hospital readmissions and risk-stratify patients, but thus far they have been limited in clinical applicability, timeliness, and generalizability. (2) Methods: Using deidentified clinical data from the University of California, San Francisco (UCSF) between January 2016 and November 2021, we developed and compared four supervised ML models (logistic regression, random forest, gradient boosting, and XGBoost) to predict 30-day readmissions for adults admitted to a UCSF hospital. (3) Results: Of 147,358 inpatient encounters, 20,747 (13.9%) patients were readmitted within 30 days of discharge. The final model selected was XGBoost, which had an area under the receiver operating characteristic curve of 0.783 and an area under the precision-recall curve of 0.434. The most important features by Shapley Additive Explanations were days since last admission, discharge department, and inpatient length of stay. (4) Conclusions: We developed and internally validated a supervised ML model to predict 30-day readmissions in a US-based healthcare system. This model has several advantages including state-of-the-art performance metrics, the use of clinical data, the use of features available within 24 h of discharge, and generalizability to multiple disease states.<\/jats:p>","DOI":"10.3390\/informatics10020033","type":"journal-article","created":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T03:01:14Z","timestamp":1679886074000},"page":"33","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Development and Internal Validation of an Interpretable Machine Learning Model to Predict Readmissions in a United States Healthcare System"],"prefix":"10.3390","volume":"10","author":[{"given":"Amanda L.","family":"Luo","sequence":"first","affiliation":[{"name":"Master of Science in Data Science Program, University of San Francisco, San Francisco, CA 94117, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4346-0555","authenticated-orcid":false,"given":"Akshay","family":"Ravi","sequence":"additional","affiliation":[{"name":"Division of Hospital Medicine, Department of Medicine, University of California, San Francisco, CA 94143, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1540-9186","authenticated-orcid":false,"given":"Simone","family":"Arvisais-Anhalt","sequence":"additional","affiliation":[{"name":"Department of Laboratory Medicine, University of California, San Francisco, CA 94143, USA"}]},{"given":"Anoop N.","family":"Muniyappa","sequence":"additional","affiliation":[{"name":"Division of Hospital Medicine, Department of Medicine, University of California, San Francisco, CA 94143, USA"}]},{"given":"Xinran","family":"Liu","sequence":"additional","affiliation":[{"name":"Division of Hospital Medicine, Department of Medicine, University of California, San Francisco, CA 94143, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5442-1009","authenticated-orcid":false,"given":"Shan","family":"Wang","sequence":"additional","affiliation":[{"name":"Master of Science in Data Science Program, University of San Francisco, San Francisco, CA 94117, USA"},{"name":"Department of Mathematics and Statistics, University of San Francisco, San Francisco, CA 94117, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,27]]},"reference":[{"key":"ref_1","unstructured":"(2022, July 20). Hospital Readmissions Reduction Program (HRRP)|CMS, Available online: https:\/\/www.cms.gov\/Medicare\/Medicare-."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1001\/jamainternmed.2015.7863","article-title":"Preventability and causes of readmissions in a national cohort of general medicine patients","volume":"176","author":"Auerbach","year":"2016","journal-title":"JAMA Intern. Med."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"e2119346","DOI":"10.1001\/jamanetworkopen.2021.19346","article-title":"Interventions to Improve Communication at Hospital Discharge and Rates of Readmission: A Systematic Review and Meta-analysis","volume":"4","author":"Becker","year":"2021","journal-title":"JAMA Netw. Open"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1146\/annurev-med-022613-090415","article-title":"Reducing hospital readmission rates: Current strategies and future directions","volume":"65","author":"Kripalani","year":"2014","journal-title":"Annu. Rev. Med."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lo, Y.-T., Liao, J.C., Chen, M.-H., Chang, C.-M., and Li, C.-T. (2021). Predictive modeling for 14-day unplanned hospital readmission risk by using machine learning algorithms. BMC Med. Inf. Decis. Mak., 21.","DOI":"10.1186\/s12911-021-01639-y"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1307","DOI":"10.1016\/j.jval.2020.06.009","article-title":"How Good Is Machine Learning in Predicting All-Cause 30-Day Hospital Readmission? Evidence From Administrative Data","volume":"23","author":"Li","year":"2020","journal-title":"Value Health"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"9277","DOI":"10.1038\/s41598-019-45685-z","article-title":"Neural networks versus Logistic regression for 30\u2009days all-cause readmission prediction","volume":"9","author":"Allam","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1097\/ALN.0000000000003140","article-title":"Machine learning prediction of postoperative emergency department hospital readmission","volume":"132","author":"Gabel","year":"2020","journal-title":"Anesthesiology"},{"key":"ref_9","unstructured":"(2022, November 14). AHA Guide. Available online: https:\/\/guide.prod.iam.aha.org\/guide\/hospitalProfile\/6930043."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Chen, Q., Peng, Y., and Lu, Z. (2019, January 10\u201313). BioSentVec: Creating sentence embeddings for biomedical texts. Proceedings of the 2019 IEEE International Conference on Healthcare Informatics (ICHI), Xi\u2019an, China.","DOI":"10.1109\/ICHI.2019.8904728"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1038\/s41597-019-0055-0","article-title":"BioWordVec, improving biomedical word embeddings with subword information and MeSH","volume":"6","author":"Zhang","year":"2019","journal-title":"Sci. Data"},{"key":"ref_12","unstructured":"Parr, T., Turgutlu, K., Csiszar, C., and Howard, J. (2022, October 27). Beware Default Random Forest Importances. Available online: https:\/\/explained.ai\/rf-importance\/."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1055\/s-0041-1729752","article-title":"Rethinking PICO in the Machine Learning Era: ML-PICO","volume":"12","author":"Liu","year":"2021","journal-title":"Appl. Clin. Inform."},{"key":"ref_15","unstructured":"Hyndman, R.J., and Athanasopoulos, G. (2022, October 27). Forecasting: Principles and Practice, 2nd ed. Available online: https:\/\/otexts.com\/fpp2\/."},{"key":"ref_16","unstructured":"(2022, October 27). Omphalos. Uber\u2019s Parallel and Language-Extensible Time Series Backtesting Tool|Uber Blog. Available online: https:\/\/www.uber.com\/blog\/omphalos\/."},{"key":"ref_17","unstructured":"Lundberg, S.M., and Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst., Available online: https:\/\/papers.nips.cc\/paper\/2017\/hash\/8a20a8621978632d76c43dfd28b67767-Abstract.html."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Slack, D., Hilgard, S., Jia, E., Singh, S., and Lakkaraju, H. (2019). Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods. arXiv.","DOI":"10.1145\/3375627.3375830"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Roth, A.E. (1988). The Shapley Value: Essays in Honor of Lloyd S. Shapley, Cambridge University Press.","DOI":"10.1017\/CBO9780511528446"},{"key":"ref_20","unstructured":"Sundararajan, M., and Najmi, A. (2019). The many Shapley values for model explanation. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1007\/s10115-013-0679-x","article-title":"Explaining prediction models and individual predictions with feature contributions","volume":"41","author":"Kononenko","year":"2014","journal-title":"Knowl. Inf. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Huang, Y., Talwar, A., Chatterjee, S., and Aparasu, R.R. (2021). Application of machine learning in predicting hospital readmissions: A scoping review of the literature. BMC Med. Res. Methodol., 21.","DOI":"10.1186\/s12874-021-01284-z"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1177\/1062860618798715","article-title":"Lessons Learned in Providing Claims-Based Data to Participants in Health Care Innovation Models","volume":"34","author":"Cohen","year":"2019","journal-title":"Am. J. Med. Qual."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1055\/s-0039-1688553","article-title":"Development and Prospective Validation of a Machine Learning-Based Risk of Readmission Model in a Large Military Hospital","volume":"10","author":"Eckert","year":"2019","journal-title":"Appl. Clin. Inform."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"103826","DOI":"10.1016\/j.jbi.2021.103826","article-title":"Improving hospital readmission prediction using individualized utility analysis","volume":"119","author":"Ko","year":"2021","journal-title":"J. Biomed. Inform."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1007\/s11606-020-05982-0","article-title":"Impact of instrumental activities of daily living limitations on hospital readmission: An observational study using machine learning","volume":"35","author":"Schiltz","year":"2020","journal-title":"J. Gen. Intern. Med."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1347","DOI":"10.1111\/1475-6773.13735","article-title":"Differences in health outcomes for high-need high-cost patients across high-income countries","volume":"56","author":"Papanicolas","year":"2021","journal-title":"Health Serv. Res."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"e227","DOI":"10.1016\/j.wneu.2021.05.080","article-title":"Prediction of Major Complications and Readmission After Lumbar Spinal Fusion: A Machine Learning-Driven Approach","volume":"152","author":"Shah","year":"2021","journal-title":"World Neurosurg."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"918","DOI":"10.1097\/XCS.0000000000000141","article-title":"Novel Machine Learning Approach for the Prediction of Hernia Recurrence, Surgical Complication, and 30-Day Readmission after Abdominal Wall Reconstruction","volume":"234","author":"Hassan","year":"2022","journal-title":"J. Am. Coll. Surg."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"910688","DOI":"10.3389\/fmolb.2022.910688","article-title":"Machine learning prediction of postoperative unplanned 30-day hospital readmission in older adult","volume":"9","author":"Li","year":"2022","journal-title":"Front. Mol. Biosci."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"638267","DOI":"10.3389\/fneur.2021.638267","article-title":"Machine Learning-Enabled 30-Day Readmission Model for Stroke Patients","volume":"12","author":"Darabi","year":"2021","journal-title":"Front. Neurol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"649521","DOI":"10.3389\/fneur.2021.649521","article-title":"Prediction of 30-Day Readmission After Stroke Using Machine Learning and Natural Language Processing","volume":"12","author":"Lineback","year":"2021","journal-title":"Front. Neurol."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"100250","DOI":"10.1016\/j.ajogmf.2020.100250","article-title":"A machine learning algorithm for predicting maternal readmission for hypertensive disorders of pregnancy","volume":"3","author":"Hoffman","year":"2021","journal-title":"Am. J. Obstet. Gynecol. MFM"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1001\/jamacardio.2016.3956","article-title":"Prediction of 30-Day All-Cause Readmissions in Patients Hospitalized for Heart Failure: Comparison of Machine Learning and Other Statistical Approaches","volume":"2","author":"Frizzell","year":"2017","journal-title":"JAMA Cardiol."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1161\/CIRCOUTCOMES.116.003039","article-title":"Analysis of machine learning techniques for heart failure readmissions","volume":"9","author":"Mortazavi","year":"2016","journal-title":"Circ. Cardiovasc. Qual. Outcomes"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1005","DOI":"10.1016\/j.jfma.2020.08.043","article-title":"The COPD-readmission (CORE) score: A novel prediction model for one-year chronic obstructive pulmonary disease readmissions","volume":"120","author":"Wu","year":"2021","journal-title":"J. Formos. Med. Assoc."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1080\/15412555.2019.1688278","article-title":"Machine Learning-Based Prediction Models for 30-Day Readmission after Hospitalization for Chronic Obstructive Pulmonary Disease","volume":"16","author":"Goto","year":"2019","journal-title":"COPD J. Chronic Obstr. Pulm. Dis."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1347","DOI":"10.1056\/NEJMra1814259","article-title":"Machine learning in medicine","volume":"380","author":"Rajkomar","year":"2019","journal-title":"N. Engl. J. Med."},{"key":"ref_39","unstructured":"Burkov, A. (2019). The Hundred-Page Machine Learning Book, Anton Burkov."},{"key":"ref_40","first-page":"1683","article-title":"Arcing the Edge","volume":"26","author":"Breiman","year":"1998","journal-title":"Ann. Prob."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: A gradient boosting machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann. Statist."},{"key":"ref_42","unstructured":"Mason, L., Baxter, J., Bartlett, P., and Frean, M. (1990). Boosting Algorithms as Gradient Descent, MIT Press."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning, Springer. [2nd ed.].","DOI":"10.1007\/978-0-387-84858-7"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/BF00116251","article-title":"Induction of decision trees","volume":"1","author":"Quinlan","year":"1986","journal-title":"Mach. Learn."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"899","DOI":"10.7326\/M20-5475","article-title":"Preventing hospital readmission for patients with comorbid substance use disorder: A randomized trial","volume":"174","author":"Gryczynski","year":"2021","journal-title":"Ann. Intern. Med."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1002\/hpm.2648","article-title":"Predictors of hospital readmissions in internal medicine patients: Application of Andersen\u2019s Model","volume":"34","author":"Kaya","year":"2019","journal-title":"Int. J. Health Plann. Manag."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1038\/s41430-021-00937-y","article-title":"Clinical and nutritional predictors of hospital readmission within 30 days","volume":"76","author":"Cruz","year":"2022","journal-title":"Eur. J. Clin. Nutr."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Arnaud, \u00c9., Elbattah, M., Gignon, M., and Dequen, G. (2020, January 10\u201313). Deep Learning to Predict Hospitalization at Triage: Integration of Structured Data and Unstructured Text. Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA.","DOI":"10.1109\/BigData50022.2020.9378073"}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/10\/2\/33\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:03:46Z","timestamp":1760123026000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/10\/2\/33"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,27]]},"references-count":48,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["informatics10020033"],"URL":"https:\/\/doi.org\/10.3390\/informatics10020033","relation":{},"ISSN":["2227-9709"],"issn-type":[{"type":"electronic","value":"2227-9709"}],"subject":[],"published":{"date-parts":[[2023,3,27]]}}}