{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,14]],"date-time":"2026-01-14T19:34:08Z","timestamp":1768419248632,"version":"3.49.0"},"reference-count":71,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2024,10,1]],"date-time":"2024-10-01T00:00:00Z","timestamp":1727740800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000862","name":"Doris Duke Charitable Foundation","doi-asserted-by":"publisher","award":["2020143"],"award-info":[{"award-number":["2020143"]}],"id":[{"id":"10.13039\/100000862","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000025","name":"National Institute of Mental Health","doi-asserted-by":"publisher","award":["K23MH128613"],"award-info":[{"award-number":["K23MH128613"]}],"id":[{"id":"10.13039\/100000025","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018784","name":"Foundation of Hope for Research and Treatment of Mental Illness","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100018784","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000050","name":"National Heart, Lung, and Blood Institute","doi-asserted-by":"publisher","award":["F31HL156464-03"],"award-info":[{"award-number":["F31HL156464-03"]}],"id":[{"id":"10.13039\/100000050","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["UL1TR002489"],"award-info":[{"award-number":["UL1TR002489"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Health Informatics J"],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:p> Objective: We analyzed a natural language processing (NLP) 
toolkit\u2019s ability to classify unstructured EHR data by psychiatric diagnosis. Expertise can be a barrier to using NLP. We employed an NLP toolkit (CLARK) created to support studies led by investigators with a range of informatics knowledge. Methods: The EHRs of 652 patients were manually reviewed to establish Depression and Substance Use Disorder (SUD) labeled datasets, which were split into training and evaluation datasets. We used CLARK to train depression and SUD classification models using training datasets; model performance was analyzed against evaluation datasets. Results: The depression model accurately classified 69% of records (sensitivity = 0.68, specificity = 0.70, F1 = 0.68). The SUD model accurately classified 84% of records (sensitivity = 0.56, specificity = 0.92, F1 = 0.57). Conclusion: The depression model performed a more balanced job, while the SUD model\u2019s high specificity was paired with a low sensitivity. NLP applications may be especially helpful when combined with a confidence threshold for manual review. 
<\/jats:p>","DOI":"10.1177\/14604582241296411","type":"journal-article","created":{"date-parts":[[2024,10,28]],"date-time":"2024-10-28T16:05:26Z","timestamp":1730131526000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":3,"title":["Using a natural language processing toolkit to classify electronic health records by psychiatric diagnosis"],"prefix":"10.1177","volume":"30","author":[{"given":"Alissa","family":"Hutto","sequence":"first","affiliation":[{"name":"Department of Psychiatry, University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3360-249X","authenticated-orcid":false,"given":"Tarek M","family":"Zikry","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC, USA"}]},{"given":"Buck","family":"Bohac","sequence":"additional","affiliation":[{"name":"North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA"}]},{"given":"Terra","family":"Rose","sequence":"additional","affiliation":[{"name":"Department of Psychiatry, University of North Carolina School of Medicine, Chapel Hill, NC, USA"},{"name":"Department of Health Sciences, University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]},{"given":"Jasmine","family":"Staebler","sequence":"additional","affiliation":[{"name":"Department of Health Sciences, University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]},{"given":"Janet","family":"Slay","sequence":"additional","affiliation":[{"name":"Department of Health Sciences, University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]},{"given":"C Ray","family":"Cheever","sequence":"additional","affiliation":[{"name":"University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]},{"given":"Michael 
R","family":"Kosorok","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC, USA"},{"name":"Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC, USA"}]},{"given":"Rebekah P.","family":"Nash","sequence":"additional","affiliation":[{"name":"Department of Psychiatry, University of North Carolina School of Medicine, Chapel Hill, NC, USA"}]}],"member":"179","published-online":{"date-parts":[[2024,10,28]]},"reference":[{"key":"bibr1-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-018-0300-7"},{"key":"bibr2-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1176\/appi.ajp.2020.20030250"},{"key":"bibr3-14604582241296411","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyt.2022.946387"},{"key":"bibr4-14604582241296411","doi-asserted-by":"publisher","DOI":"10.2196\/15708"},{"key":"bibr5-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2017.10.005"},{"key":"bibr6-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1192\/bjp.2021.188"},{"key":"bibr7-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-clinpsy-032816-045037"},{"key":"bibr8-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1002\/ajmg.b.32548"},{"key":"bibr9-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.bpsc.2021.02.001"},{"key":"bibr10-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1007\/s11920-019-1094-0"},{"key":"bibr11-14604582241296411","doi-asserted-by":"publisher","DOI":"10.3390\/ijms21030969"},{"key":"bibr12-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1017\/S0033291721003871"},{"key":"bibr13-14604582241296411","doi-asserted-by":"publisher","DOI":"10.3390\/ijerph18105072"},{"key":"bibr14-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1136\/jamia.2009.001560"},{"key":"bibr15-14604582241296411","unstructured":"G
orrell G, Song X, Roberts A. Bio-YODIE: a named entity linking system for biomedical text."},{"key":"bibr16-14604582241296411","first-page":"17","author":"Aronson AR","year":"2001","journal-title":"Proc AMIA Symp"},{"key":"bibr17-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.artmed.2021.102083"},{"key":"bibr18-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1007\/s12021-013-9178-1"},{"key":"bibr19-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-015-0871-y"},{"key":"bibr20-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1197\/jamia.M1552"},{"key":"bibr21-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2016-012012"},{"key":"bibr22-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1136\/amiajnl-2014-002733"},{"key":"bibr23-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1017\/S0033291711000997"},{"key":"bibr24-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-018-25773-2"},{"key":"bibr25-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijmedinf.2019.103973"},{"key":"bibr26-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1001\/jamapsychiatry.2022.4634"},{"key":"bibr27-14604582241296411","unstructured":"Minor LB. Stanford medicine 2020 health trends report, the rise of the data-driven physician, 2020."},{"key":"bibr28-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1186\/s12909-022-03896-5"},{"key":"bibr29-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2017.04.017"},{"key":"bibr30-14604582241296411","doi-asserted-by":"publisher","DOI":"10.2196\/16042"},{"key":"bibr31-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpainsymman.2021.10.014"},{"key":"bibr32-14604582241296411","unstructured":"Wolf T, Debut L, Sanh V, et al. 
HuggingFace\u2019s transformers: state-of-the-art Natural Language processing."},{"key":"bibr33-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1097\/TP.0000000000000824"},{"key":"bibr34-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1111\/j.1600-6143.2011.03496.x"},{"key":"bibr35-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1176\/appi.psy.42.4.337"},{"key":"bibr36-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1097\/PSY.0b013e3181dbbb7d"},{"key":"bibr37-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1177\/1534765609333782"},{"key":"bibr38-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1097\/TP.0b013e3181817dd7"},{"key":"bibr39-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0165517"},{"key":"bibr40-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1177\/152692480001000408"},{"key":"bibr41-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1177\/008124630803800307"},{"key":"bibr42-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0175161"},{"key":"bibr43-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2369-15-191"},{"key":"bibr44-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1176\/appi.books.9780890425596"},{"key":"bibr45-14604582241296411","unstructured":"Sanh V, Debut L, Chaumond J, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and Lighter."},{"key":"bibr46-14604582241296411","unstructured":"Sun C, Qiu X, Xu Y, et al. How to fine-tune BERT for text classification."},{"key":"bibr47-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"bibr48-14604582241296411","first-page":"2825","volume":"12","author":"Pedregosa F","year":"2011","journal-title":"JMLR"},{"key":"bibr49-14604582241296411","unstructured":"Microsoft corporation. Microsoft Excel, 2018. 
https:\/\/office.microsoft.com\/excel (accessed 16 January 2023)."},{"key":"bibr50-14604582241296411","unstructured":"R Core Team. R: a language and environment for statistical computing."},{"key":"bibr51-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-12-77"},{"key":"bibr52-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bti623"},{"key":"bibr53-14604582241296411","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v028.i05"},{"key":"bibr54-14604582241296411","volume-title":"Practical machine learning with R : define, build, and evaluate machine learning models for real-world applications","author":"Jeyaraman B","year":"2019"},{"key":"bibr55-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2019.103208"},{"key":"bibr56-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2008.08.010"},{"key":"bibr57-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.alcohol.2019.09.008"},{"key":"bibr58-14604582241296411","first-page":"2121","volume":"2015","author":"Wang Y","year":"2015","journal-title":"AMIA Annu Symp 
Proc"},{"key":"bibr59-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1111\/add.15730"},{"key":"bibr60-14604582241296411","doi-asserted-by":"publisher","DOI":"10.5498\/wjp.v4.i4.133"},{"key":"bibr61-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1377\/hlthaff.2021.01423"},{"key":"bibr62-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1136\/bmjqs-2018-008370"},{"key":"bibr63-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.drugalcdep.2007.03.007"},{"key":"bibr64-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1136\/bmj.h1885"},{"key":"bibr65-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1016\/j.cmpb.2023.107925"},{"key":"bibr66-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1002\/wps.20998"},{"key":"bibr67-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-022-00589-7"},{"key":"bibr68-14604582241296411","volume-title":"Machine Learning","author":"Zhou ZH","year":"2016","edition":"1"},{"key":"bibr69-14604582241296411","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-7138-7"},{"key":"bibr70-14604582241296411","first-page":"61","volume":"10","author":"Platt J","year":"1999","journal-title":"Advances in large margin classifiers"},{"key":"bibr71-14604582241296411","doi-asserted-by":"publisher","DOI":"10.3390\/info14070420"}],"container-title":["Health Informatics 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582241296411","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14604582241296411","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582241296411","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T23:10:29Z","timestamp":1741043429000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14604582241296411"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10]]},"references-count":71,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["10.1177\/14604582241296411"],"URL":"https:\/\/doi.org\/10.1177\/14604582241296411","relation":{},"ISSN":["1460-4582","1741-2811"],"issn-type":[{"value":"1460-4582","type":"print"},{"value":"1741-2811","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10]]},"article-number":"14604582241296411"}}