{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T22:56:11Z","timestamp":1780354571849,"version":"3.54.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T00:00:00Z","timestamp":1770768000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"name":"Mayo Clinic Center for Digital Health AI\/ML Enablement"},{"name":"Dalio Philanthropies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objectives<\/jats:title>\n                    <jats:p>To comprehensively evaluate the validity of International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes for both prevalent diagnoses and less common diseases, and to assess the performance of a large language model (LLM)-based system in validating these codes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>This retrospective study analyzed hospital admissions from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. We developed a validated LLM-based system using GPT-4o, refined through iterative prompt engineering, to assess ICD-10-CM code validity. We measured the positive predictive value (PPV) of ICD-10-CM codes, PPV of principal and secondary diagnoses, and the performance of an LLM-based system in code validation.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Among 865\u00a0079 assigned codes, the PPV was 84.6% (95% CI, 84.5%-84.6%). Principal diagnoses had a PPV of 93.9% (95% CI, 93.7%-94.1%), while secondary diagnoses had a PPV of 83.8% (95% CI, 83.7%-83.9%). The LLM system demonstrated high performance in validating ICD codes, achieving 93.6% accuracy, 95.4% sensitivity, and 85.2% specificity. Among correctly assigned secondary diagnoses, the majority (67.9%) represented historical or baseline conditions, while 32.1% reflected active conditions that deviated from baseline status; 22.3% of these emerged after hospital admission. PPV decreases with later diagnosis positions, with the largest decline occurring between principal and secondary diagnoses.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion and Conclusion<\/jats:title>\n                    <jats:p>In this large-scale evaluation, ICD-10-CM codes exhibited generally high accuracy, though variability existed by position and condition type. A validated LLM system performed comparably to physician review and offers a scalable means to improve coding accuracy. These findings support the potential for integrating LLM-based auditing into routine workflows to strengthen the quality of administrative and research data.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocag008","type":"journal-article","created":{"date-parts":[[2026,1,14]],"date-time":"2026-01-14T12:45:05Z","timestamp":1768394705000},"page":"947-956","source":"Crossref","is-referenced-by-count":3,"title":["Validation of 13\u00a0102 International Classification of Diseases, Tenth Revision, Clinical Modification codes using a large language model-based system"],"prefix":"10.1093","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0723-5913","authenticated-orcid":false,"given":"Yichen","family":"Wang","sequence":"first","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]},{"name":"Division of Hospital Medicine, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania , Philadelphia, PA 19104,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yilin","family":"Song","sequence":"additional","affiliation":[{"name":"Department of Medicine, University of Maryland Medical Center Midtown Campus , Baltimore, MD 21201,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rex","family":"Siu","sequence":"additional","affiliation":[{"name":"Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Induja R","family":"Nimma","sequence":"additional","affiliation":[{"name":"Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yan","family":"Yan","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Thomas R","family":"Savage","sequence":"additional","affiliation":[{"name":"Division of Hospital Medicine, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania , Philadelphia, PA 19104,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yiming","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computing, Xi\u2019an Jiaotong-Liverpool University , Suzhou, Jiangsu 215123,","place":["China"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhichen","family":"Li","sequence":"additional","affiliation":[{"name":"Ponte Vedra High School , Ponte Vedra, FL 32081,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Daryl","family":"Ramai","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology, Hepatology, and Endoscopy, Department of Medicine, Brigham and Women\u2019s Hospital , Boston, MA 02115,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jiale","family":"Wang","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dilhana","family":"Badurdeen","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Cui","family":"Tao","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence and Informatics, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Vivek","family":"Kumbhari","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9986-5124","authenticated-orcid":false,"given":"Yuting","family":"Huang","sequence":"additional","affiliation":[{"name":"Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic , Jacksonville, FL 32224,","place":["United States"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2026,2,11]]},"reference":[{"key":"2026042913021474500_ocag008-B1","author":"World Health Organization"},{"key":"2026042913021474500_ocag008-B2","volume-title":"International Statistical Classification of Diseases and Related Health Problems: Tenth Revision","author":"World Health Organization","year":"2004","edition":"2nd"},{"key":"2026042913021474500_ocag008-B3","author":"Centers for Medicare & Medicaid Services","year":"2023"},{"key":"2026042913021474500_ocag008-B4","author":"Centers for Medicare & Medicaid Services","year":"2022"},{"key":"2026042913021474500_ocag008-B5","volume-title":"Healthcare Cost and Utilization Project (HCUP)","author":"Agency for Healthcare Research and Quality"},{"key":"2026042913021474500_ocag008-B6","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1001\/jamaneurol.2024.2044","article-title":"Derivation and validation of ICD-10 codes for identifying incident stroke","volume":"81","author":"Columbo","year":"2024","journal-title":"JAMA Neurol"},{"key":"2026042913021474500_ocag008-B7","doi-asserted-by":"publisher","first-page":"e248255","DOI":"10.1001\/jamanetworkopen.2024.8255","article-title":"Accuracy of influenza ICD-10 diagnosis codes in identifying influenza illness in children","volume":"7","author":"Antoon","year":"2024","journal-title":"JAMA Netw Open"},{"key":"2026042913021474500_ocag008-B8","doi-asserted-by":"publisher","first-page":"e009952","DOI":"10.1136\/bmjopen-2015-009952","article-title":"Systematic review of validated case definitions for diabetes in ICD-9-coded and ICD-10-coded data in adult populations","volume":"6","author":"Khokhar","year":"2016","journal-title":"BMJ Open"},{"key":"2026042913021474500_ocag008-B9","doi-asserted-by":"publisher","first-page":"2011","DOI":"10.1001\/jama.2017.17653","article-title":"Adherence to methodological standards in research using the national inpatient sample","volume":"318","author":"Khera","year":"2017","journal-title":"JAMA"},{"key":"2026042913021474500_ocag008-B10","doi-asserted-by":"publisher","first-page":"210","DOI":"10.7326\/M23-2772","article-title":"Large language models in medicine: the potentials and pitfalls: a narrative review","volume":"177","author":"Omiye","year":"2024","journal-title":"Ann Intern Med"},{"key":"2026042913021474500_ocag008-B11","author":"OpenAI"},{"key":"2026042913021474500_ocag008-B12","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1038\/s41746-022-00705-7","article-title":"Automated clinical coding: what, why, and where we are?","volume":"5","author":"Dong","year":"2022","journal-title":"NPJ Digit Med"},{"key":"2026042913021474500_ocag008-B13","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1093\/pubmed\/fdr054","article-title":"Systematic review of discharge coding accuracy","volume":"34","author":"Burns","year":"2012","journal-title":"J Public Health (Oxf)"},{"key":"2026042913021474500_ocag008-B14","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1038\/s41597-022-01899-x","article-title":"MIMIC-IV, a freely accessible electronic health record dataset","volume":"10","author":"Johnson","year":"2023","journal-title":"Sci Data"},{"key":"2026042913021474500_ocag008-B15","doi-asserted-by":"publisher","first-page":"1130","DOI":"10.1097\/01.mlr.0000182534.19832.83","article-title":"Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data","volume":"43","author":"Quan","year":"2005","journal-title":"Med Care"},{"key":"2026042913021474500_ocag008-B16","doi-asserted-by":"publisher","first-page":"194","DOI":"10.1038\/s41746-022-00742-2","article-title":"A large language model for electronic health records","volume":"5","author":"Yang","year":"2022","journal-title":"NPJ Digit Med"},{"key":"2026042913021474500_ocag008-B17","doi-asserted-by":"publisher","first-page":"1535","DOI":"10.1111\/jgh.16561","article-title":"Validation of GPT-4 for clinical event classification: a comparative analysis with ICD codes and human reviewers","volume":"39","author":"Wang","year":"2024","journal-title":"J Gastroenterol Hepatol"},{"key":"2026042913021474500_ocag008-B18","doi-asserted-by":"publisher","first-page":"1753","DOI":"10.1097\/HEP.0000000000001115","article-title":"Evaluating the positive predictive value of code-based identification of cirrhosis and its complications utilizing GPT-4","volume":"81","author":"Far","year":"2025","journal-title":"Hepatology"},{"key":"2026042913021474500_ocag008-B19","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1038\/s41746-023-00989-3","article-title":"DRG-LLaMA: tuning LLaMA model to predict diagnosis-related group for hospitalized patients","volume":"7","author":"Wang","year":"2024","journal-title":"NPJ Digit Med"},{"key":"2026042913021474500_ocag008-B20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1056\/AIdbp2300040","article-title":"Large language models are poor medical coders\u2014benchmarking of medical code querying","volume":"1","author":"Soroush","year":"2024","journal-title":"NEJM AI"},{"key":"2026042913021474500_ocag008-B21","author":"Kwan","year":"2024"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/5\/947\/66851284\/ocag008.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/5\/947\/66851284\/ocag008.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T17:02:25Z","timestamp":1777482145000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/33\/5\/947\/8472666"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,11]]},"references-count":21,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2026,2,11]]},"published-print":{"date-parts":[[2026,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocag008","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,5]]},"published":{"date-parts":[[2026,2,11]]}}}