{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T11:08:08Z","timestamp":1740136088659,"version":"3.37.3"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2021,9,29]],"date-time":"2021-09-29T00:00:00Z","timestamp":1632873600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","award":["R15LM013209"],"award-info":[{"award-number":["R15LM013209"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Center for Advancing Translational Sciences of National Institutes of Health","award":["UL1TR002319"],"award-info":[{"award-number":["UL1TR002319"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Neural network deidentification studies have focused on individual datasets. These studies assume the availability of a sufficient amount of human-annotated data to train models that can generalize to corresponding test data. In real-world situations, however, researchers often have limited or no in-house training data. Existing systems and external data can help jump-start deidentification on in-house data; however, the most efficient way of utilizing existing systems and external data is unclear. 
This article investigates the transferability of a state-of-the-art neural clinical deidentification system, NeuroNER, across a variety of datasets, when it is modified architecturally for domain generalization and when it is trained strategically for domain transfer.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We conducted a comparative study of the transferability of NeuroNER using 4 clinical note corpora with multiple note types from 2 institutions. We modified NeuroNER architecturally to integrate 2 types of domain generalization approaches. We evaluated each architecture using 3 training strategies. We measured transferability from external sources; transferability across note types; the contribution of external source data when in-domain training data are available; and transferability across institutions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results and Conclusions<\/jats:title>\n                  <jats:p>Transferability from a single external source gave inconsistent results. Using additional external sources consistently yielded an F1-score of approximately 80%. Fine-tuning emerged as a dominant transfer strategy, with or without domain generalization. We also found that external sources were useful even in cases where in-domain training data were available. 
Transferability across institutions differed by note type and annotation label but resulted in improved performance.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocab207","type":"journal-article","created":{"date-parts":[[2021,9,10]],"date-time":"2021-09-10T19:09:34Z","timestamp":1631300974000},"page":"2661-2669","source":"Crossref","is-referenced-by-count":3,"title":["Transferability of neural network clinical deidentification systems"],"prefix":"10.1093","volume":"28","author":[{"given":"Kahyun","family":"Lee","sequence":"first","affiliation":[{"name":"Department of Information Science and Technology, George Mason University, Fairfax, Virginia, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3598-8747","authenticated-orcid":false,"given":"Nicholas J","family":"Dobbins","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA"}]},{"given":"Bridget","family":"McInnes","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA"}]},{"given":"Meliha","family":"Yetisgen","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA"}]},{"given":"\u00d6zlem","family":"Uzuner","sequence":"additional","affiliation":[{"name":"Department of Information Science and Technology, George Mason University, Fairfax, Virginia, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,9,29]]},"reference":[{"issue":"3","key":"2021120106310831400_ocab207-B1","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1055\/s-0038-1634080","article-title":"Assessing the difficulty and time cost of de-identification in clinical narratives","volume":"45","author":"Dorr","year":"2006","journal-title":"Methods Inf 
Med"},{"key":"2021120106310831400_ocab207-B2","first-page":"333","article-title":"Replacing personally-identifying information in medical records, the Scrub system","author":"Sweeney","year":"1996","journal-title":"AMIA Annu Symp Proc"},{"key":"2021120106310831400_ocab207-B3","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/1471-2288-10-70","article-title":"Automatic de-identification of textual documents in the electronic health record: a review of recent research","volume":"10","author":"Meystre","year":"2010","journal-title":"BMC Med Res Methodol"},{"key":"2021120106310831400_ocab207-B4","first-page":"254","article-title":"HIDE: an integrated system for health information DE-identification","author":"Gardner","year":"2008","journal-title":"Proc IEEE Symp Comput Med Syst"},{"issue":"1","key":"2021120106310831400_ocab207-B5","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.artmed.2007.10.001","article-title":"A de-identifier for medical discharge summaries","volume":"42","author":"Uzuner","year":"2008","journal-title":"Artif Intell Med"},{"key":"2021120106310831400_ocab207-B6","doi-asserted-by":"crossref","first-page":"S60","DOI":"10.1016\/j.jbi.2015.09.004","article-title":"Hidden Markov model using Dirichlet process for de-identification","volume":"58 (Suppl","author":"Chen","year":"2015","journal-title":"J Biomed Inform"},{"issue":"5","key":"2021120106310831400_ocab207-B7","doi-asserted-by":"crossref","first-page":"564","DOI":"10.1197\/jamia.M2435","article-title":"Rapidly retargetable approaches to de-identification in medical records","volume":"14","author":"Wellner","year":"2007","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"2021120106310831400_ocab207-B8","doi-asserted-by":"crossref","first-page":"596","DOI":"10.1093\/jamia\/ocw156","article-title":"De-identification of patient notes with recurrent neural networks","volume":"24","author":"Dernoncourt","year":"2017","journal-title":"J Am Med Inform 
Assoc"},{"year":"2018","author":"Khin","key":"2021120106310831400_ocab207-B9"},{"year":"2018","author":"Peters","key":"2021120106310831400_ocab207-B10"},{"issue":"1","key":"2021120106310831400_ocab207-B11","doi-asserted-by":"crossref","first-page":"S34","DOI":"10.1016\/j.jbi.2017.05.023","article-title":"De-identification of clinical notes via recurrent neural network and conditional random field","volume":"75","author":"Liu","year":"2017","journal-title":"J Biomed Inform"},{"key":"2021120106310831400_ocab207-B12","doi-asserted-by":"crossref","first-page":"S11","DOI":"10.1016\/j.jbi.2015.06.007","article-title":"Automated systems for the de-identification of longitudinal clinical narratives: overview of 2014 i2b2\/UTHealth shared task Track 1","volume":"58 Suppl (2015","author":"Stubbs","year":"2015","journal-title":"J Biomed Inform"},{"issue":"2017","key":"2021120106310831400_ocab207-B13","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1016\/j.jbi.2017.06.011","article-title":"De-identification of psychiatric intake records: overview of CEGS N-GRID shared tasks Track 1","volume":"75","author":"Stubbs","year":"2017","journal-title":"J Biomed Inform"},{"issue":"2020","key":"2021120106310831400_ocab207-B14","first-page":"3","article-title":"Comparing rule-based, feature-based and deep neural methods for de-identification of Dutch medical records","volume":"2551","author":"Trienes","year":"2020","journal-title":"CEUR Workshop Proc"},{"issue":"2017","key":"2021120106310831400_ocab207-B15","article-title":"A hybrid approach to automatic de-identification of psychiatric notes","volume":"75","author":"Lee","year":"2017","journal-title":"J Biomed Inform"},{"key":"2021120106310831400_ocab207-B16","article-title":"Frustratingly easy domain adaptation","author":"Daum\u00e9","year":"2007","journal-title":"ACL 2007\u2014Proceedings of the 45th Annual Meeting of the Association for Computational 
Linguistics"},{"key":"2021120106310831400_ocab207-B17","first-page":"1070","article-title":"Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation","volume":"2017","author":"Lee","year":"2018","journal-title":"AMIA Annu Symp Proc"},{"key":"2021120106310831400_ocab207-B18","first-page":"4470","article-title":"Transfer learning for named-entity recognition with neural networks","author":"Lee","year":"2017","journal-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)"},{"year":"2020","author":"Piratla","key":"2021120106310831400_ocab207-B19"},{"issue":"1","key":"2021120106310831400_ocab207-B20","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1186\/s12911-017-0556-8","article-title":"Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach","volume":"17","author":"Weng","year":"2017","journal-title":"BMC Med Inform Decis Mak"},{"issue":"2","key":"2021120106310831400_ocab207-B21","doi-asserted-by":"crossref","first-page":"560","DOI":"10.4338\/ACI-2016-12-RA-0211","article-title":"The effects of natural language processing on cross-institutional portability of influenza case detection for disease surveillance","volume":"8","author":"Ferraro","year":"2017","journal-title":"Appl Clin Inform"},{"issue":"5","key":"2021120106310831400_ocab207-B22","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1197\/jamia.M2444","article-title":"
Evaluating the state of the art in automatic de-identification","volume":"14","author":"Uzuner","year":"2007","journal-title":"J Am Med Inform Assoc"},{"key":"2021120106310831400_ocab207-B23","doi-asserted-by":"crossref","first-page":"S6","DOI":"10.1016\/j.jbi.2015.09.018","article-title":"Creation of a new longitudinal corpus of clinical narratives","volume":"58 (Suppl","author":"Kumar","year":"2015","journal-title":"J Biomed Inform"},{"key":"2021120106310831400_ocab207-B24","doi-asserted-by":"crossref","first-page":"1532","DOI":"10.3115\/v1\/D14-1162","article-title":"Glove: global vectors for word representation","author":"Pennington","year":"2014","journal-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)"},{"key":"2021120106310831400_ocab207-B25","first-page":"3490","article-title":"Learning to generalize: meta-learning for domain generalization","author":"Li","year":"2018","journal-title":"32nd AAAI Conf Artif Intell AAAI 2018"},{"key":"2021120106310831400_ocab207-B26","first-page":"10","article-title":"
Domain generalization via invariant feature representation","author":"Muandet","year":"2013","journal-title":"30th Int Conf Mach Learn ICML 2013"},{"key":"2021120106310831400_ocab207-B27","article-title":"Generalizing across domains via cross-gradient training","author":"Shankar","year":"2018","journal-title":"Int Conf Learn Represent 2018"},{"key":"2021120106310831400_ocab207-B28","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018","journal-title":"MLM"},{"volume-title":"Computer-Intensive Methods for Testing Hypotheses","year":"1989","author":"Noreen","key":"2021120106310831400_ocab207-B29"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/28\/12\/2661\/41325408\/ocab207.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/28\/12\/2661\/41325408\/ocab207.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,1]],"date-time":"2021-12-01T10:05:10Z","timestamp":1638353110000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/12\/2661\/6377891"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,29]]},"references-count":29,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,9,29]]},"published-print":{"date-parts":[[2021,11,25]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocab207","relation":{},"ISSN":["1527-974X"],"issn-type":[{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2021,12,1]]},"published":{"date-parts":[[2021,9,29]]}}}