{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,12]],"date-time":"2026-01-12T23:08:00Z","timestamp":1768259280107,"version":"3.49.0"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2022,2,11]],"date-time":"2022-02-11T00:00:00Z","timestamp":1644537600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"crossref","award":["UL1TR000371"],"award-info":[{"award-number":["UL1TR000371"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"crossref","award":["U01TR002393"],"award-info":[{"award-number":["U01TR002393"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100004917","name":"Cancer Prevention and Research Institute of Texas","doi-asserted-by":"crossref","award":["RP170668"],"award-info":[{"award-number":["RP170668"]}],"id":[{"id":"10.13039\/100004917","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Reynolds and Reynolds Professorship in Clinical Informatics"},{"DOI":"10.13039\/100000070","name":"National Institute of Biomedical Imaging and Bioengineering (NIBIB)","doi-asserted-by":"crossref","award":["R21EB029575"],"award-info":[{"award-number":["R21EB029575"]}],"id":[{"id":"10.13039\/100000070","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,4,13]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objectives<\/jats:title><jats:p>Scanned documents (SDs), while common in electronic 
health records and potentially rich in clinically relevant information, rarely fit well with clinician workflow. Here, we identify scanned imaging reports requiring follow-up with high recall and practically useful precision.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and methods<\/jats:title><jats:p>We focused on identifying imaging findings for 3 common causes of malpractice claims: (1) potentially malignant breast (mammography) and (2) lung (chest computed tomography [CT]) lesions and (3) long-bone fracture (X-ray) reports. We train our ClinicalBERT-based pipeline on existing typed\/dictated reports classified manually or using ICD-10 codes, evaluate using a test set of manually classified SDs, and compare against string-matching (baseline approach).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>A total of 393 mammograms, 305 chest CT, and 683 bone X-ray reports were manually reviewed. The string-matching approach had an F1 of 0.667. For mammograms, chest CTs, and bone X-rays, respectively: models trained on manually classified training data and optimized for F1 reached an F1 of 0.900, 0.905, and 0.817, while separate models optimized for recall achieved a recall of 1.000 with precisions of 0.727, 0.518, and 0.275. 
Models trained on ICD-10-labelled data and optimized for F1 achieved F1 scores of 0.647, 0.830, and 0.643, while those optimized for recall achieved a recall of 1.0 with precisions of 0.407, 0.683, and 0.358.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>Our pipeline can identify abnormal reports with potentially useful performance and so decrease the manual effort required to screen for abnormal findings that require follow-up.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>It is possible to automatically identify clinically significant abnormalities in SDs with high recall and practically useful precision in a generalizable and minimally laborious way.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocac007","type":"journal-article","created":{"date-parts":[[2022,1,13]],"date-time":"2022-01-13T12:09:08Z","timestamp":1642075748000},"page":"831-840","source":"Crossref","is-referenced-by-count":8,"title":["Closing the loop: automatically identifying abnormal imaging results in scanned documents"],"prefix":"10.1093","volume":"29","author":[{"given":"Akshat","family":"Kumar","sequence":"first","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"},{"name":"McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Heath","family":"Goodrum","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5931-9476","authenticated-orcid":false,"given":"Ashley","family":"Kim","sequence":"additional","affiliation":[{"name":"McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Carly","family":"Stender","sequence":"additional","affiliation":[{"name":"McGovern Medical School, The 
University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Kirk","family":"Roberts","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]},{"given":"Elmer V","family":"Bernstam","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA"},{"name":"Division of General Internal Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,2,11]]},"reference":[{"key":"2022041311582330300_ocac007-B1","doi-asserted-by":"crossref","first-page":"349","DOI":"10.12788\/jhm.3128","article-title":"Follow-up of incidental high-risk pulmonary nodules on computed tomography pulmonary angiography at care transitions","volume":"14","author":"Kwan","year":"2019","journal-title":"J Hosp Med"},{"issue":"2","key":"2022041311582330300_ocac007-B2","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1016\/j.jacr.2017.10.014","article-title":"Adherence to radiology recommendations in a clinical CT lung screening program","volume":"15","author":"Alshora","year":"2018","journal-title":"J Am Coll Radiol"},{"issue":"9","key":"2022041311582330300_ocac007-B3","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1089\/jpm.2012.0472","article-title":"Multiple locations of advance care planning documentation in an electronic health record: are they easy to find?","volume":"16","author":"Wilson","year":"2013","journal-title":"J Palliat Med"},{"key":"2022041311582330300_ocac007-B4","author":"Hanscom"},{"key":"2022041311582330300_ocac007-B5","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1186\/s12911-016-0306-3","article-title":"Temporal bone radiology report classification using open source machine learning and natural langue 
processing libraries","volume":"16","author":"Masino","year":"2016","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"2022041311582330300_ocac007-B6","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1007\/s10278-017-0013-3","article-title":"Using natural language processing of free-text radiology reports to identify type 1 modic endplate changes","volume":"31","author":"Huhdanpaa","year":"2018","journal-title":"J Digit Imaging"},{"key":"2022041311582330300_ocac007-B7","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1186\/1471-2105-15-266","article-title":"Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings","volume":"15","author":"Pham","year":"2014","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"2022041311582330300_ocac007-B8","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.jbi.2012.12.005","article-title":"A text processing pipeline to extract recommendations from radiology reports","volume":"46","author":"Yetisgen-Yildiz","year":"2013","journal-title":"J Biomed Inform"},{"issue":"2","key":"2022041311582330300_ocac007-B9","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1111\/acem.12859","article-title":"Automated outcome classification of computed tomography imaging reports for pediatric traumatic brain injury","volume":"23","author":"Yadav","year":"2016","journal-title":"Acad Emerg Med"},{"key":"2022041311582330300_ocac007-B10","doi-asserted-by":"publisher","author":"Reback","year":"2021","DOI":"10.5281\/zenodo.4681666"},{"key":"2022041311582330300_ocac007-B11","first-page":"108","article-title":"API design for machine learning software: experiences from the Scikit-Learn project","author":"Buitinck","year":"2013"},{"issue":"7825","key":"2022041311582330300_ocac007-B12","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with 
NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"2022041311582330300_ocac007-B13","volume-title":"Advances in Neural Information Processing Systems","author":"Paszke","year":"2019"},{"key":"2022041311582330300_ocac007-B14","first-page":"38","author":"Wolf","year":"2020"},{"key":"2022041311582330300_ocac007-B15","volume-title":"Python 3 Reference Manual","author":"Van Rossum","year":"2009"},{"key":"2022041311582330300_ocac007-B16","doi-asserted-by":"crossref","first-page":"104302","DOI":"10.1016\/j.ijmedinf.2020.104302","article-title":"Automatic classification of scanned electronic health record documents","volume":"144","author":"Goodrum","year":"2020","journal-title":"Int J Med Inf"},{"issue":"2","key":"2022041311582330300_ocac007-B17","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1016\/j.jacr.2015.07.013","article-title":"Radiology malpractice claims in the United States from 2008 to 2012: characteristics and implications","volume":"13","author":"Harvey","year":"2016","journal-title":"J Am Coll Radiol"},{"issue":"3","key":"2022041311582330300_ocac007-B18","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1111\/cts.12614","article-title":"Access to routinely collected clinical data for research: a process implemented at an academic medical center","volume":"12","author":"Guerrero","year":"2019","journal-title":"Clin Transl Sci"},{"key":"2022041311582330300_ocac007-B19","doi-asserted-by":"publisher","first-page":"72","DOI":"10.18653\/v1\/W19-1909","author":"Alsentzer","year":"2019"},{"key":"2022041311582330300_ocac007-B20","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1145\/2939672.2939778","author":"Ribeiro","year":"2016"},{"key":"2022041311582330300_ocac007-B21"}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/5\/831\/43372392\/ocac007.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/29\/5\/831\/43372392\/ocac007.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,15]],"date-time":"2023-11-15T20:18:07Z","timestamp":1700079487000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/29\/5\/831\/6526658"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,11]]},"references-count":21,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2022,2,11]]},"published-print":{"date-parts":[[2022,4,13]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocac007","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5,1]]},"published":{"date-parts":[[2022,2,11]]}}}