{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T05:20:34Z","timestamp":1778044834018,"version":"3.51.4"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2024,2,27]],"date-time":"2024-02-27T00:00:00Z","timestamp":1708992000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1945764"],"award-info":[{"award-number":["1945764"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["2327143"],"award-info":[{"award-number":["2327143"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Office of Research Computing"},{"DOI":"10.13039\/100006369","name":"George Mason University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006369","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1625039"],"award-info":[{"award-number":["1625039"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["2018631"],"award-info":[{"award-number":["2018631"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>The use of electronic health records (EHRs) for clinical risk prediction is on the rise. However, in many practical settings, the limited availability of task-specific EHR data can restrict the application of standard machine learning pipelines. In this study, we investigate the potential of leveraging language models (LMs) as a means to\u00a0incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Methods<\/jats:title>\n                  <jats:p>We propose two novel LM-based methods, namely \u201cLLaMA2-EHR\u201d and \u201cSent-e-Med.\u201d Our focus is on utilizing the textual descriptions within structured EHRs to make risk predictions about future diagnoses. We conduct a comprehensive comparison with previous approaches across various data types and sizes.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Experiments across 6 different methods and 3 separate risk prediction tasks reveal that employing LMs to represent structured EHRs, such as diagnostic histories, results in significant performance improvements when evaluated using standard metrics such as area under the receiver operating characteristic (ROC) curve and precision-recall (PR) curve. Additionally, they offer benefits such as few-shot learning, the ability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. However, it is noteworthy that outcomes may exhibit sensitivity to a specific prompt.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>LMs encompass extensive embedded knowledge, making them valuable for the analysis of EHRs in the context of risk prediction. Nevertheless, it is important to exercise caution in their application, as ongoing safety concerns related to LMs persist and require continuous consideration.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocae030","type":"journal-article","created":{"date-parts":[[2024,2,27]],"date-time":"2024-02-27T20:21:34Z","timestamp":1709065294000},"page":"1856-1864","source":"Crossref","is-referenced-by-count":17,"title":["Clinical risk prediction using language models: benefits and considerations"],"prefix":"10.1093","volume":"31","author":[{"given":"Angeela","family":"Acharya","sequence":"first","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]},{"given":"Sulabh","family":"Shrestha","sequence":"additional","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]},{"given":"Anyi","family":"Chen","sequence":"additional","affiliation":[{"name":"Staten Island Performing Provider System , Staten Island, NY,","place":["United States"]}]},{"given":"Joseph","family":"Conte","sequence":"additional","affiliation":[{"name":"Staten Island Performing Provider System , Staten Island, NY,","place":["United States"]}]},{"given":"Sanja","family":"Avramovic","sequence":"additional","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]},{"given":"Siddhartha","family":"Sikdar","sequence":"additional","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]},{"given":"Antonios","family":"Anastasopoulos","sequence":"additional","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]},{"given":"Sanmay","family":"Das","sequence":"additional","affiliation":[{"name":"George Mason University , Fairfax, VA,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2024,2,27]]},"reference":[{"issue":"1","key":"2025030418553859100_ocae030-B1","doi-asserted-by":"crossref","first-page":"e80","DOI":"10.1002\/cphg.80","article-title":"Using electronic health records to generate phenotypes for research","volume":"100","author":"Pendergrass","year":"2018","journal-title":"Curr Protoc Hum Genet"},{"issue":"1","key":"2025030418553859100_ocae030-B2","doi-asserted-by":"crossref","first-page":"198\u2013","DOI":"10.1093\/jamia\/ocw042","article-title":"Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review","volume":"24","author":"Goldstein","year":"2017","journal-title":"J Am Med Inform Assoc"},{"key":"2025030418553859100_ocae030-B3","first-page":"787","author":"Choi","year":"2017"},{"key":"2025030418553859100_ocae030-B4","author":"Shang","year":"2019"},{"key":"2025030418553859100_ocae030-B5","first-page":"596"},{"issue":"1","key":"2025030418553859100_ocae030-B6","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/s41746-021-00455-y","article-title":"Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction","volume":"4","author":"Rasmy","year":"2021","journal-title":"NPJ Digit Med"},{"issue":"1","key":"2025030418553859100_ocae030-B7","doi-asserted-by":"crossref","first-page":"453\u2013","DOI":"10.1609\/aaai.v35i1.16122","article-title":"RareBERT: transformer architecture for rare disease patient identification using administrative claims","volume":"35","author":"Prakash","year":"2021","journal-title":"AAAI"},{"issue":"1","key":"2025030418553859100_ocae030-B8","doi-asserted-by":"crossref","first-page":"7155","DOI":"10.1038\/s41598-020-62922-y","article-title":"BEHRT: transformer for electronic health records","volume":"10","author":"Li","year":"2020","journal-title":"Sci Rep"},{"key":"2025030418553859100_ocae030-B9","first-page":"239","author":"Pang","year":"2021"},{"key":"2025030418553859100_ocae030-B10","author":"Devlin"},{"issue":"15","key":"2025030418553859100_ocae030-B11","doi-asserted-by":"crossref","first-page":"541\u2013","DOI":"10.15585\/mmwr.mm7015a1","article-title":"State-level economic costs of opioid use disorder and fatal opioid overdose\u2013United States, 2017","volume":"70","author":"Luo","year":"2021","journal-title":"MMWR Morb Mortal Wkly Rep"},{"key":"2025030418553859100_ocae030-B12","first-page":"685","author":"Acharya","year":"2022"},{"issue":"12","key":"2025030418553859100_ocae030-B13","doi-asserted-by":"crossref","first-page":"e0269509","DOI":"10.1371\/journal.pone.0269509","article-title":"Exploring county-level spatio-temporal patterns in opioid overdose related emergency department visits","volume":"17","author":"Acharya","year":"2022","journal-title":"PLoS One"},{"key":"2025030418553859100_ocae030-B14","first-page":"112\u2013","article-title":"Substance misuse and substance use disorders: Why do they matter in healthcare?","volume":"128","author":"McLellan","year":"2017","journal-title":"Trans Am Clin Climatol Assoc"},{"key":"2025030418553859100_ocae030-B15","author":"Huang","year":"2019"},{"key":"2025030418553859100_ocae030-B16","author":"Yang","year":"5:194."},{"key":"2025030418553859100_ocae030-B17","author":"Touvron","year":"2023"},{"key":"2025030418553859100_ocae030-B18","first-page":"1","author":"Chowdhery","year":". 2023"},{"key":"2025030418553859100_ocae030-B19","first-page":"1877","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"key":"2025030418553859100_ocae030-B20","author":"Johnson"},{"issue":"3","key":"2025030418553859100_ocae030-B21","doi-asserted-by":"crossref","first-page":"230\u2013","DOI":"10.1093\/jamia\/ocx079","article-title":"Synthea: an approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record","volume":"25","author":"Walonoski","year":"2017","journal-title":"J Am Med Inf Assoc"},{"issue":"1","key":"2025030418553859100_ocae030-B22","doi-asserted-by":"crossref","first-page":"e24594","DOI":"10.2196\/24594","article-title":"Use of the systematized nomenclature of medicine clinical terms (SNOMED CT) for processing free text in health care: systematic scoping review","volume":"23","author":"Gaudet-Blavignac","year":"2021","journal-title":"J Med Internet Res"},{"key":"2025030418553859100_ocae030-B23","author":"Shickel","year":"2017"},{"key":"2025030418553859100_ocae030-B24","article-title":"A deep learning method to detect opioid prescription and opioid use disorder from electronic health records","author":"Kashyap","journal-title":"Int J Med Inf"},{"key":"2025030418553859100_ocae030-B25","first-page":"3982","author":"Reimers"},{"key":"2025030418553859100_ocae030-B26","author":"Naveed","year":"2023"},{"key":"2025030418553859100_ocae030-B27","author":"Li","year":"2022"},{"key":"2025030418553859100_ocae030-B28","author":"Kaushik","year":"2021"},{"key":"2025030418553859100_ocae030-B29","first-page":"625","author":"Shrestha","year":"2024"},{"key":"2025030418553859100_ocae030-B30","first-page":"2758","author":"McKenna"},{"key":"2025030418553859100_ocae030-B31","author":"Peng","year":"2023"},{"key":"2025030418553859100_ocae030-B32","author":"Chen","year":"2023"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/31\/9\/1856\/58868302\/ocae030.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/31\/9\/1856\/58868302\/ocae030.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T20:51:35Z","timestamp":1741121495000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/31\/9\/1856\/7614964"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,27]]},"references-count":32,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,2,27]]},"published-print":{"date-parts":[[2024,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocae030","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,9]]},"published":{"date-parts":[[2024,2,27]]}}}