{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T23:59:23Z","timestamp":1773446363626,"version":"3.50.1"},"reference-count":15,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,4,17]],"date-time":"2025-04-17T00:00:00Z","timestamp":1744848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"United States Veterans Administration Health Services Research Department","award":["1I21HX003278-01A1"],"award-info":[{"award-number":["1I21HX003278-01A1"]}]},{"name":"United States Veterans Administration Health Services Research Department","award":["R01 HS28450-01A1"],"award-info":[{"award-number":["R01 HS28450-01A1"]}]},{"DOI":"10.13039\/100000133","name":"Agency for Healthcare Research and Quality","doi-asserted-by":"publisher","award":["1I21HX003278-01A1"],"award-info":[{"award-number":["1I21HX003278-01A1"]}],"id":[{"id":"10.13039\/100000133","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000133","name":"Agency for Healthcare Research and Quality","doi-asserted-by":"publisher","award":["R01 HS28450-01A1"],"award-info":[{"award-number":["R01 HS28450-01A1"]}],"id":[{"id":"10.13039\/100000133","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>We used machine learning (ML) to characterize 894,154 medical records of outpatient visits from the Veterans Administration Central Data Warehouse (VA CDW) by the likelihood of assignment of 200 International Classification of Diseases (ICD) code blocks. Using four different predictive models, we found the ML-derived predictions for the code blocks were consistently more effective in predicting death or 90-day rehospitalization than the assigned code block in the record. We reviewed records of ICD chapter assignments. The review revealed that the ML-predicted chapter assignments were consistently better than those humanly assigned. Impact factor analysis, a method of explanation of AI findings that was developed in our group, demonstrated little effect on any one assigned ICD code block but a marked impact on the ML-derived code blocks of kidney disease as well as several other morbidities. In this study, machine learning was much better than human code assignment at predicting the relatively rare outcomes of death or rehospitalization. Future work will address generalizability using other datasets, as well as addressing coding that is more nuanced than that of the categorization provided by code blocks.<\/jats:p>","DOI":"10.3390\/make7020036","type":"journal-article","created":{"date-parts":[[2025,4,17]],"date-time":"2025-04-17T06:53:46Z","timestamp":1744872826000},"page":"36","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Machine-Learned Codes from EHR Data Predict Hard Outcomes Better than Human-Assigned ICD Codes"],"prefix":"10.3390","volume":"7","author":[{"given":"Ying","family":"Yin","sequence":"first","affiliation":[{"name":"Biomedical Informatics Center, George Washington University, Washington, DC 20052, USA"},{"name":"Veterans Administration Hospital, Washington, DC 20422, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yijun","family":"Shao","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Center, George Washington University, Washington, DC 20052, USA"},{"name":"Veterans Administration Hospital, Washington, DC 20422, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8302-5021","authenticated-orcid":false,"given":"Phillip","family":"Ma","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Center, George Washington University, Washington, DC 20052, USA"},{"name":"Veterans Administration Hospital, Washington, DC 20422, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qing","family":"Zeng-Treitler","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Center, George Washington University, Washington, DC 20052, USA"},{"name":"Veterans Administration Hospital, Washington, DC 20422, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stuart J.","family":"Nelson","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Center, George Washington University, Washington, DC 20052, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,4,17]]},"reference":[{"key":"ref_1","unstructured":"Office of the National Coordinator for Health Information Technology (2024, June 22). National Trends in Hospital and Physician Adoption of Electronic Health Records. (Health IT Quick-Stat #61), Available online: http:\/\/www.HealthIT.gov."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1146\/annurev-publhealth-032315-021353","article-title":"Using Electronic Health Records for Population Health Research: A Review of Methods and Applications","volume":"37","author":"Casey","year":"2016","journal-title":"Annu. Rev. Public Health"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00392-016-1025-6","article-title":"Electronic health records to facilitate clinical research","volume":"106","author":"Cowie","year":"2017","journal-title":"Clin. Res. Cardiol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"e80","DOI":"10.1002\/cphg.80","article-title":"Using Electronic Health Records To Generate Phenotypes For Research","volume":"100","author":"Pendergrass","year":"2019","journal-title":"Curr. Protoc. Hum. Genet."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"596","DOI":"10.3174\/ajnr.A4696","article-title":"ICD-10: History and Context","volume":"37","author":"Hirsch","year":"2016","journal-title":"AJNR Am. J. Neuroradiol."},{"key":"ref_6","first-page":"19","article-title":"A qualitative evaluation of clinically coded data quality from health information manager perspectives","volume":"49","author":"Doktorchik","year":"2020","journal-title":"Health Inf. Manag. J."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"20552076241297056","DOI":"10.1177\/20552076241297056","article-title":"Are ICD codes reliable for observational studies? Assessing coding consistency for data quality","volume":"10","author":"Nelson","year":"2024","journal-title":"Digit. Health"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"e42","DOI":"10.1097\/MLR.0000000000001010","article-title":"ICD-10 Coding Will Challenge Researchers: Caution and Collaboration may Reduce Measurement Error and Improve Comparability Over Time","volume":"57","author":"Mainor","year":"2019","journal-title":"Med. Care"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"14","DOI":"10.5334\/egems.281","article-title":"Impact of ICD-10-CM Transition on Mental Health Diagnoses Recording","volume":"7","author":"Stewart","year":"2019","journal-title":"EGEMS"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1007\/s10916-020-1532-x","article-title":"Problems and Barriers during the Process of Clinical Coding: A Focus Group Study of Coders\u2019 Perceptions","volume":"44","author":"Alonso","year":"2020","journal-title":"J. Med. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"e59","DOI":"10.1016\/j.jsurg.2016.07.017","article-title":"Evaluating Coding Accuracy in General Surgery Residents\u2019 Accreditation Council for Graduate Medical Education Procedural Case Logs","volume":"73","author":"Balla","year":"2016","journal-title":"J. Surg. Educ."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.acap.2015.01.008","article-title":"The Accuracy of ICD Codes: Identifying Physical Abuse in 4 Children\u2019s Hospitals","volume":"15","author":"Hooft","year":"2015","journal-title":"Acad. Pediatr."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1093\/jamia\/ocac216","article-title":"Machine learning approaches for electronic health records phenotyping: A methodical review","volume":"30","author":"Yang","year":"2023","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_14","first-page":"1","article-title":"A Unified Review of Deep Learning for Automated Medical Coding","volume":"56","author":"Ji","year":"2024","journal-title":"ACM Comput. Surv."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1007\/s10916-020-01701-8","article-title":"Shedding Light on the Black Box: Explaining Deep Neural Network Prediction of Clinical Outcomes","volume":"45","author":"Shao","year":"2021","journal-title":"J. Med. Syst."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/2\/36\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:16:24Z","timestamp":1760030184000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/2\/36"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,17]]},"references-count":15,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["make7020036"],"URL":"https:\/\/doi.org\/10.3390\/make7020036","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,17]]}}}