{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T17:54:07Z","timestamp":1770746047794,"version":"3.49.0"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health Research","doi-asserted-by":"publisher","award":["DC0190GP"],"award-info":[{"award-number":["DC0190GP"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objectives<\/jats:title>\n                  <jats:p>Adverse event detection from Electronic Medical Records (EMRs) is challenging due to the low incidence of the event, variability in clinical documentation, and the complexity of data formats. Pulmonary embolism as an adverse event (PEAE) is particularly difficult to identify using existing approaches. This study aims to develop and evaluate a Large Language Model (LLM)-based framework for detecting PEAE from unstructured narrative data in EMRs.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We conducted a chart review of adult patients (aged 18-100) admitted to tertiary-care hospitals in Calgary, Alberta, Canada, between 2017-2022. We developed an LLM-based detection framework consisting of three modules: evidence extraction (implementing both keyword-based and semantic similarity-based filtering methods), discharge information extraction (focusing on six key clinical sections), and PEAE detection. Four open-source LLMs (Llama3, Mistral-7B, Gemma, and Phi-3) were evaluated using positive predictive value, sensitivity, specificity, and F1-score. Model performance for population-level surveillance was assessed at yearly, quarterly, and monthly granularities.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The chart review included 10 066 patients, with 40 cases of PEAE identified (0.4% prevalence). All four LLMs demonstrated high sensitivity (87.5-100%) and specificity (94.9-98.9%) across different experimental conditions. Gemma achieved the highest F1-score (28.11%) using keyword-based retrieval with discharge summary inclusion, along with 98.4% specificity, 87.5% sensitivity, and 99.95% negative predictive value. Keyword-based filtering reduced the median chunks per patient from 789 to 310, while semantic filtering further reduced this to 9 chunks. Including discharge summaries improved performance metrics across most models. For population-level surveillance, all models showed strong correlation with actual PEAE trends at yearly granularity (r=0.92-0.99), with Llama3 achieving the highest correlation (0.988).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>The results of our method for PEAE detection using EMR notes demonstrate high sensitivity and specificity across all four tested LLMs, indicating strong performance in distinguishing PEAE from non-PEAE cases. However, the low incidence rate of PEAE contributed to a lower PPV. The keyword-based chunking approach consistently outperformed semantic similarity-based methods, achieving higher F1 scores and PPV, underscoring the importance of domain knowledge in text segmentation. Including discharge summaries further enhanced performance metrics. Our population-based analysis revealed better performance for yearly trends compared to monthly granularity, suggesting the framework's utility for long-term surveillance despite dataset imbalance. Error analysis identified contextual misinterpretation, terminology confusion, and preprocessing limitations as key challenges for future improvement.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusions<\/jats:title>\n                  <jats:p>Our proposed method demonstrates that LLMs can effectively detect PEAE from narrative EMRs with high sensitivity and specificity. While these models serve as effective screening tools to exclude non-PEAE cases, their lower PPV indicates they cannot be relied upon solely for definitive PEAE identification. Further chart review remains necessary for confirmation. Future work should focus on improving contextual understanding, medical terminology interpretation, and exploring advanced prompting techniques to enhance precision in adverse event detection from EMRs.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocaf048","type":"journal-article","created":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T15:42:17Z","timestamp":1742398937000},"page":"876-884","source":"Crossref","is-referenced-by-count":5,"title":["Utilizing large language models for detecting hospital-acquired conditions: an empirical study on pulmonary embolism"],"prefix":"10.1093","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-0810-3011","authenticated-orcid":false,"given":"Cheligeer","family":"Cheligeer","sequence":"first","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Provincial Research Data Services, Alberta Health Services , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0006-0033","authenticated-orcid":false,"given":"Danielle A","family":"Southern","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5148-1399","authenticated-orcid":false,"given":"Jun","family":"Yan","sequence":"additional","affiliation":[{"name":"Concordia Institute for Information Systems Engineering, Concordia University , Montreal H3G 2W1,","place":["Canada"]}]},{"given":"Guosong","family":"Wu","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Department of Community Health Sciences, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6398-1756","authenticated-orcid":false,"given":"Jie","family":"Pan","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Department of Community Health Sciences, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]}]},{"given":"Seungwon","family":"Lee","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Provincial Research Data Services, Alberta Health Services , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5127-4333","authenticated-orcid":false,"given":"Elliot A","family":"Martin","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Provincial Research Data Services, Alberta Health Services , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-9410-7675","authenticated-orcid":false,"given":"Hamed","family":"Jafarpour","sequence":"additional","affiliation":[{"name":"Concordia Institute for Information Systems Engineering, Concordia University , Montreal H3G 2W1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4569-8014","authenticated-orcid":false,"given":"Cathy A","family":"Eastwood","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Department of Community Health Sciences, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6678-271X","authenticated-orcid":false,"given":"Yong","family":"Zeng","sequence":"additional","affiliation":[{"name":"Concordia Institute for Information Systems Engineering, Concordia University , Montreal H3G 2W1,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7848-7256","authenticated-orcid":false,"given":"Hude","family":"Quan","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Department of Community Health Sciences, Cumming School of Medicine, University of Calgary , Calgary T2N 4N1,","place":["Canada"]},{"name":"Libin Cardiovascular Institute, University of Calgary , Calgary T2N 4N1,","place":["Canada"]}]}],"member":"286","published-online":{"date-parts":[[2025,3,19]]},"reference":[{"key":"2025042203522718000_ocaf048-B1","doi-asserted-by":"publisher","first-page":"1007","DOI":"10.1093\/jamia\/ocv180","article-title":"Extracting information from the text of electronic medical records to improve case detection: a systematic review","volume":"23","author":"Ford","year":"2016","journal-title":"J Am Med Inform Assoc"},{"key":"2025042203522718000_ocaf048-B2","doi-asserted-by":"publisher","first-page":"e756-767","DOI":"10.1542\/peds.2013-0794","article-title":"Electronic health record identification of nephrotoxin exposure and associated acute kidney injury","volume":"132","author":"Goldstein","year":"2013","journal-title":"Pediatrics"},{"key":"2025042203522718000_ocaf048-B3","doi-asserted-by":"publisher","first-page":"e48995","DOI":"10.2196\/48995","article-title":"BERT-based neural network for inpatient fall detection from electronic medical records: retrospective cohort study","volume":"12","author":"Cheligeer","year":"2024","journal-title":"JMIR Med Inform"},{"key":"2025042203522718000_ocaf048-B4","doi-asserted-by":"publisher","first-page":"1172","DOI":"10.1109\/Tr.2023.3330733","article-title":"A machine-learning-based approach for identifying diagnostic errors in electronic medical records","volume":"73","author":"Zhao","year":"2024","journal-title":"IEEE Trans Rel"},{"key":"2025042203522718000_ocaf048-B5","doi-asserted-by":"publisher","first-page":"1423","DOI":"10.1161\/Hypertensionaha.109.139279","article-title":"Validation of a case definition to define hypertension using administrative data","volume":"54","author":"Quan","year":"2009","journal-title":"Hypertension"},{"key":"2025042203522718000_ocaf048-B6","doi-asserted-by":"publisher","first-page":"610","DOI":"10.1016\/j.cardfail.2020.04.003","article-title":"Enhancing ICD-code-based case definition for heart failure using electronic medical record data","volume":"26","author":"Xu","year":"2020","journal-title":"J Card Fail"},{"key":"2025042203522718000_ocaf048-B7","doi-asserted-by":"publisher","first-page":"e0198847","DOI":"10.1371\/journal.pone.0198847","article-title":"Comparing the validity of different ICD coding abstraction strategies for sepsis case identification in German claims data","volume":"13","author":"Fleischmann-Struzek","year":"2018","journal-title":"PLoS One"},{"key":"2025042203522718000_ocaf048-B8","doi-asserted-by":"publisher","first-page":"e009487","DOI":"10.1136\/bmjopen-2015-009487","article-title":"Validation and optimisation of an ICD-10-coded case definition for sepsis using administrative health data","volume":"5","author":"Jolley","year":"2015","journal-title":"BMJ Open"},{"key":"2025042203522718000_ocaf048-B9","doi-asserted-by":"publisher","first-page":"1567","DOI":"10.1093\/jamia\/ocy094","article-title":"Identification of validated case definitions for medical conditions used in primary care electronic medical record databases: a systematic review","volume":"25","author":"McBrien","year":"2018","journal-title":"J Am Med Inform Assoc"},{"key":"2025042203522718000_ocaf048-B10","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1111\/j.1528-1167.2009.02201.x","article-title":"How accurate is ICD coding for epilepsy?","volume":"51","author":"Jette","year":"2010","journal-title":"Epilepsia"},{"key":"2025042203522718000_ocaf048-B11","doi-asserted-by":"publisher","first-page":"e12239","DOI":"10.2196\/12239","article-title":"Natural language processing of clinical notes on chronic diseases: systematic review","volume":"7","author":"Sheikhalishahi","year":"2019","journal-title":"JMIR Med Inform"},{"key":"2025042203522718000_ocaf048-B12","doi-asserted-by":"publisher","first-page":"364","DOI":"10.1093\/jamia\/ocy173","article-title":"Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review","volume":"26","author":"Koleck","year":"2019","journal-title":"J Am Med Inform Assoc"},{"key":"2025042203522718000_ocaf048-B13","doi-asserted-by":"publisher","first-page":"e705","DOI":"10.1002\/osp4.705","article-title":"Automated extraction of weight, height, and obesity in electronic medical records are highly valid","volume":"10","author":"Sandhu","year":"2024","journal-title":"Obes Sci Pract"},{"key":"2025042203522718000_ocaf048-B14","doi-asserted-by":"publisher","first-page":"1037","DOI":"10.1056\/NEJMra072753","article-title":"Acute pulmonary embolism","volume":"358","author":"Tapson","year":"2008","journal-title":"N Engl J Med"},{"key":"2025042203522718000_ocaf048-B15","author":"Thrombosis Canada"},{"key":"2025042203522718000_ocaf048-B16","doi-asserted-by":"publisher","first-page":"448","DOI":"10.1197\/jamia.M1794","article-title":"Automated detection of adverse events using natural language processing of discharge summaries","volume":"12","author":"Melton","year":"2005","journal-title":"J Am Med Inform Assoc"},{"key":"2025042203522718000_ocaf048-B17","author":"Llama Team, AI @ Meta"},{"key":"2025042203522718000_ocaf048-B18","author":"Jiang"},{"key":"2025042203522718000_ocaf048-B19","author":"Mesnard"},{"key":"2025042203522718000_ocaf048-B20","author":"Abdin"},{"key":"2025042203522718000_ocaf048-B21","doi-asserted-by":"publisher","DOI":"10.1136\/bmjoq-2023-002722","article-title":"Achieving high inter-rater reliability in establishing data labels: a retrospective chart review study","volume":"13","author":"Wu","year":"2024","journal-title":"BMJ Open Qual"},{"key":"2025042203522718000_ocaf048-B22","doi-asserted-by":"publisher","first-page":"e0275250","DOI":"10.1371\/journal.pone.0275250","article-title":"Developing EMR-based algorithms to Identify hospital adverse events for health system performance evaluation and improvement: study protocol","volume":"17","author":"Wu","year":"2022","journal-title":"PLoS One"},{"key":"2025042203522718000_ocaf048-B23","doi-asserted-by":"publisher","first-page":"421","DOI":"10.1017\/s1481803500010484","article-title":"Errors, near misses and adverse events in the emergency department: what can patients tell us?","volume":"10","author":"Friedman","year":"2008","journal-title":"CJEM"},{"key":"2025042203522718000_ocaf048-B24","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.4065\/76.11.1102","article-title":"Incidence of venous thromboembolism in hospitalized patients vs community residents","volume":"76","author":"Heit","year":"2001","journal-title":"Mayo Clin Proc"},{"key":"2025042203522718000_ocaf048-B25","doi-asserted-by":"publisher","first-page":"792","DOI":"10.1111\/bjh.16010","article-title":"The National VTE Exemplar Centres Network response to implementation of updated NICE guidance: venous thromboembolism in over 16s: reducing the risk of hospital-acquired deep vein thrombosis or pulmonary embolism (NG89)","volume":"186","author":"Gee","year":"2019","journal-title":"Br J Haematol"},{"key":"2025042203522718000_ocaf048-B26","doi-asserted-by":"publisher","first-page":"2683555241297566","DOI":"10.1177\/02683555241297566","article-title":"Prevalence and risk factors of hospital acquired venous thromboembolism","author":"Li","year":"2024","journal-title":"Phlebology"},{"key":"2025042203522718000_ocaf048-B27","first-page":"15339","author":"Levy","year":"2024"},{"key":"2025042203522718000_ocaf048-B28","first-page":"3982","author":"Reimers"},{"key":"2025042203522718000_ocaf048-B29","author":"Devlin"},{"key":"2025042203522718000_ocaf048-B30","author":"Luo"},{"key":"2025042203522718000_ocaf048-B31","first-page":"72","author":"Alsentzer"},{"key":"2025042203522718000_ocaf048-B32","author":"Wang"},{"key":"2025042203522718000_ocaf048-B33","author":"Li"},{"key":"2025042203522718000_ocaf048-B34","first-page":"16857","article-title":"Mpnet: masked and permuted pre-training for language understanding","volume":"33","author":"Song","year":"2020","journal-title":"Adv Neural Inform Process Syst"},{"key":"2025042203522718000_ocaf048-B35","author":"Li"},{"key":"2025042203522718000_ocaf048-B36","doi-asserted-by":"publisher","first-page":"430","DOI":"10.1111\/1742-6723.12285","article-title":"Review article: components of a good quality discharge summary: a systematic review","volume":"26","author":"Wimsett","year":"2014","journal-title":"Emerg Med Australas"},{"key":"2025042203522718000_ocaf048-B37","first-page":"188","author":"Kind"},{"key":"2025042203522718000_ocaf048-B38","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1016\/S1553-7250(16)30107-6","article-title":"Design and hospital wide implementation of a standardized discharge summary in an electronic health record","volume":"42","author":"Dean","year":"2016","journal-title":"Jt Comm J Qual Patient Saf"},{"key":"2025042203522718000_ocaf048-B39","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1186\/s12913-021-06345-z","article-title":"What makes a \u201csuccessful\u201d or \u201cunsuccessful\u201d discharge letter? Hospital clinician and General Practitioner assessments of the quality of discharge letters","volume":"21","author":"Weetman","year":"2021","journal-title":"BMC Health Serv Res"},{"key":"2025042203522718000_ocaf048-B40","author":"Python Software Foundation"},{"key":"2025042203522718000_ocaf048-B41","author":"Paszke"},{"key":"2025042203522718000_ocaf048-B42","author":"Wolf"},{"key":"2025042203522718000_ocaf048-B43","first-page":"24824","author":"Wei"},{"key":"2025042203522718000_ocaf048-B44","first-page":"1998","author":"Agrawal"},{"key":"2025042203522718000_ocaf048-B45","first-page":"1877","author":"Brown"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/5\/876\/62461093\/ocaf048.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/32\/5\/876\/62461093\/ocaf048.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,22]],"date-time":"2025-04-22T07:52:35Z","timestamp":1745308355000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/32\/5\/876\/8086871"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,19]]},"references-count":45,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,3,19]]},"published-print":{"date-parts":[[2025,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaf048","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,5]]},"published":{"date-parts":[[2025,3,19]]}}}