{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T17:10:25Z","timestamp":1773162625228,"version":"3.50.1"},"reference-count":51,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2017,12,1]],"date-time":"2017-12-01T00:00:00Z","timestamp":1512086400000},"content-version":"vor","delay-in-days":1,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R01GM102282"],"award-info":[{"award-number":["R01GM102282"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000070","name":"National Institute of Biomedical Imaging and Bioengineering","doi-asserted-by":"publisher","award":["R01EB19403"],"award-info":[{"award-number":["R01EB19403"]}],"id":[{"id":"10.13039\/100000070","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000050","name":"National Heart, Lung, and Blood Institute","doi-asserted-by":"publisher","award":["R01HL126667"],"award-info":[{"award-number":["R01HL126667"]}],"id":[{"id":"10.13039\/100000050","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000071","name":"National Institute of Child Health and Human Development","doi-asserted-by":"publisher","award":["R21Al116839-01"],"award-info":[{"award-number":["R21Al116839-01"]}],"id":[{"id":"10.13039\/100000071","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objective<\/jats:title><jats:p>To assess clinical documentation 
variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and Methods<\/jats:title><jats:p>Birth cohorts from Mayo Clinic and Sanford Children\u2019s Hospital (SCH) were used in this study (n\u2009=\u2009298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>There exist notable lexical variations (word-level similarity\u2009=\u20090.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity\u2009=\u20090.944, asthma-related concept similarity\u2009=\u20090.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. 
As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocx138","type":"journal-article","created":{"date-parts":[[2017,10,25]],"date-time":"2017-10-25T19:18:44Z","timestamp":1508959124000},"page":"353-359","source":"Crossref","is-referenced-by-count":60,"title":["Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions"],"prefix":"10.1093","volume":"25","author":[{"given":"Sunghwan","family":"Sohn","sequence":"first","affiliation":[{"name":"Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Yanshan","family":"Wang","sequence":"additional","affiliation":[{"name":"Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Chung-Il","family":"Wi","sequence":"additional","affiliation":[{"name":"Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Elizabeth A","family":"Krusemark","sequence":"additional","affiliation":[{"name":"Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Euijung","family":"Ryu","sequence":"additional","affiliation":[{"name":"Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Mir H","family":"Ali","sequence":"additional","affiliation":[{"name":"Department of Pediatrics, Sanford Children\u2019s Hospital, Sioux Falls, SD, USA"}]},{"given":"Young J","family":"Juhn","sequence":"additional","affiliation":[{"name":"Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN, USA"}]},{"given":"Hongfang","family":"Liu","sequence":"additional","affiliation":[{"name":"Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2017,11,30]]},"reference":[{"issue":"4","key":"2020110612371576700_ocx138-B1","doi-asserted-by":"crossref","first-page":"430","DOI":"10.1164\/rccm.201610-2006OC","article-title":"Application of a natural language processing algorithm to asthma ascertainment: an automated chart review","volume":"196","author":"Wi","year":"2017","journal-title":"Am J Respir Crit Care Med."},{"issue":"5","key":"2020110612371576700_ocx138-B2","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1016\/j.anai.2013.07.022","article-title":"Automated chart review for asthma cohort identification using natural language processing: an exploratory study","volume":"111","author":"Wu","year":"2013","journal-title":"Ann Allergy Asthma Immunol."},{"issue":"8","key":"2020110612371576700_ocx138-B3","doi-asserted-by":"crossref","first-page":"848","DOI":"10.1001\/jama.2011.1204","article-title":"Automated identification of postoperative complications within an electronic medical record using natural language processing","volume":"306","author":"Murff","year":"2011","journal-title":"JAMA."},{"issue":"4","key":"2020110612371576700_ocx138-B4","doi-asserted-by":"crossref","first-page":"448","DOI":"10.1197\/jamia.M1794","article-title":"Automated detection of adverse events using natural language processing of discharge summaries","volume":"12","author":"Melton","year":"2005","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612371576700_ocx138-B5","doi-asserted-by":"crossref","first-page":"858","DOI":"10.1136\/amiajnl-2013-002190","article-title":"MedXN: an open source medication extraction and normalization tool for clinical text","volume":"21","author":"Sohn","year":"2014","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B6","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1136\/amiajnl-2011-000351","article-title":"Drug side effect extraction from clinical narratives of psychiatry 
and psychology patients","volume":"18","author":"Sohn","year":"2011","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B7","first-page":"619","article-title":"Mayo clinic smoking status classification system: extensions and improvements","volume":"2009","author":"Sohn","year":"2009","journal-title":"AMIA Annu Symp."},{"key":"2020110612371576700_ocx138-B8","first-page":"249","article-title":"Identifying abdominal aortic aneurysm cases and controls using natural language processing of radiology reports","volume":"2013","author":"Sohn","year":"2013","journal-title":"AMIA Jt Summits Transl Sci Proc."},{"key":"2020110612371576700_ocx138-B9","first-page":"43","article-title":"A hybrid approach to sentiment sentence classification in suicide notes","volume":"5","author":"Sohn","year":"2012","journal-title":"Biomed Inform Insights."},{"issue":"5","key":"2020110612371576700_ocx138-B10","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1016\/j.jbi.2009.08.007","article-title":"What can natural language processing do for clinical decision support?","volume":"42","author":"Demner-Fushman","year":"2009","journal-title":"J Biomed Inform."},{"key":"2020110612371576700_ocx138-B11","first-page":"12","article-title":"Combining decision support methodologies to diagnose pneumonia","author":"Aronsky","year":"2001","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612371576700_ocx138-B12","doi-asserted-by":"crossref","first-page":"568","DOI":"10.1136\/jamia.2010.004366","article-title":"Leveraging informatics for genetic studies: use of the electronic medical record to enable a genome-wide association study of peripheral arterial disease","volume":"17","author":"Kullo","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"9","key":"2020110612371576700_ocx138-B13","doi-asserted-by":"crossref","first-page":"e13011","DOI":"10.1371\/journal.pone.0013011","article-title":"A genome-wide association study of red blood cell 
traits using the electronic medical record","volume":"5","author":"Kullo","year":"2010","journal-title":"PLoS One."},{"issue":"5","key":"2020110612371576700_ocx138-B14","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1197\/jamia.M1552","article-title":"Automated encoding of clinical documents based on natural language processing","volume":"11","author":"Friedman","year":"2004","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612371576700_ocx138-B15","doi-asserted-by":"crossref","first-page":"516","DOI":"10.1197\/jamia.M2077","article-title":"Automating the assignment of diagnosis codes to patient encounters using example-based and machine learning techniques","volume":"13","author":"Pakhomov","year":"2006","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612371576700_ocx138-B16","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1136\/jamia.2009.001560","article-title":"Mayo clinical text analysis and knowledge extraction system (cTAKES): architecture, component evaluation and applications","volume":"17","author":"Savova","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612371576700_ocx138-B17","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1136\/amiajnl-2011-000093","article-title":"The Yale cTAKES extensions for document classification: architecture and application","volume":"18","author":"Garla","year":"2011","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B18","first-page":"1639","article-title":"Using medical text extraction, reasoning and mapping system (MTERMS) to process medication information in outpatient clinical notes","volume":"2011","author":"Zhou","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"issue":"1","key":"2020110612371576700_ocx138-B19","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/1472-6947-6-30","article-title":"Extracting principal diagnosis, co-morbidity and smoking status for asthma 
research: evaluation of a natural language processing system","volume":"6","author":"Zeng","year":"2006","journal-title":"BMC Med Inform Dec Mak."},{"issue":"2","key":"2020110612371576700_ocx138-B20","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1136\/jamia.1994.95236146","article-title":"A general natural-language text processor for clinical radiology","volume":"1","author":"Friedman","year":"1994","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B21","first-page":"149","article-title":"An information extraction framework for cohort identification using electronic health records","volume":"2013","author":"Liu","year":"2013","journal-title":"AMIA Jt Summits Transl Sci Proc."},{"issue":"5","key":"2020110612371576700_ocx138-B22","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1136\/amiajnl-2011-000155","article-title":"Using machine learning for concept extraction on clinical documents from multiple data sources","volume":"18","author":"Torii","year":"2011","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B23","first-page":"382","article-title":"Part-of-speech tagging for clinical text: wall or bridge between institutions?","volume":"2011","author":"Fan","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612371576700_ocx138-B24","first-page":"38","article-title":"Feasibility of pooling annotated corpora for clinical concept extraction","volume":"2012","author":"Wagholikar","year":"2012","journal-title":"AMIA Jt Summits Transl Sci Proc."},{"key":"2020110612371576700_ocx138-B25","first-page":"270","article-title":"A broad-coverage natural language processing system","author":"Friedman","year":"2000","journal-title":"Proc AMIA Symp."},{"key":"2020110612371576700_ocx138-B26","first-page":"742","article-title":"The sublanguage of cross-coverage","author":"Stetson","year":"2002","journal-title":"Proc AMIA 
Symp."},{"issue":"4","key":"2020110612371576700_ocx138-B27","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1016\/S1532-0464(03)00012-1","article-title":"Two biomedical sublanguages: a description based on the theories of Zellig Harris","volume":"35","author":"Friedman","year":"2002","journal-title":"J Biomed Inform."},{"key":"2020110612371576700_ocx138-B28","volume-title":"A Grammar of English on Mathematical Principles","author":"Harris","year":"1982"},{"key":"2020110612371576700_ocx138-B29","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198242246.001.0001","volume-title":"A Theory of Language and Information: A Mathematical Approach","author":"Harris","year":"1991"},{"issue":"e1","key":"2020110612371576700_ocx138-B30","doi-asserted-by":"crossref","first-page":"e79","DOI":"10.1093\/jamia\/ocw109","article-title":"A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD)","volume":"24","author":"Wu","year":"2017","journal-title":"J Am Med Inform Assoc."},{"issue":"1","key":"2020110612371576700_ocx138-B31","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1197\/jamia.M2927","article-title":"Methods for building sense inventories of abbreviations in clinical notes","volume":"16","author":"Xu","year":"2009","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B32","first-page":"1099","article-title":"Document clustering of clinical narratives: a systematic study of clinical sublanguages","volume":"2011","author":"Patterson","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612371576700_ocx138-B33","first-page":"577","article-title":"A study of transportability of an existing smoking status detection module across institutions","volume":"2012","author":"Liu","year":"2012","journal-title":"AMIA Annu Symp 
Proc."},{"issue":"e1","key":"2020110612371576700_ocx138-B34","doi-asserted-by":"crossref","first-page":"e162","DOI":"10.1136\/amiajnl-2011-000583","article-title":"Portability of an algorithm to identify rheumatoid arthritis in electronic health records","volume":"19","author":"Carroll","year":"2012","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612371576700_ocx138-B35","first-page":"604","article-title":"Identification of patients with family history of pancreatic cancer: investigation of an NLP system portability","volume":"216","author":"Mehrabi","year":"2014","journal-title":"Stud Health Technol Inform."},{"key":"2020110612371576700_ocx138-B36","first-page":"568","article-title":"Towards a semantic lexicon for clinical natural language processing","volume":"2012","author":"Liu","year":"2012","journal-title":"AMIA Annu Symp Proc."},{"issue":"4","key":"2020110612371576700_ocx138-B37","doi-asserted-by":"crossref","first-page":"888","DOI":"10.1164\/ajrccm\/146.4.888","article-title":"A community-based study of the epidemiology of asthma: incidence rates, 1964\u20131983","volume":"146","author":"Yunginger","year":"1992","journal-title":"Am Rev Respir Dis."},{"issue":"4","key":"2020110612371576700_ocx138-B38","first-page":"35","article-title":"Modern information retrieval: a brief overview","volume":"24","author":"Singhal","year":"2001","journal-title":"IEEE Data Eng Bull."},{"key":"2020110612371576700_ocx138-B39","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J Machine Learn Res."},{"issue":"7","key":"2020110612371576700_ocx138-B40","doi-asserted-by":"crossref","first-page":"1736","DOI":"10.1002\/asi.23444","article-title":"Indexing by latent Dirichlet allocation and an ensemble model","volume":"67","author":"Wang","year":"2016","journal-title":"J Assoc Inform Sci 
Technol."},{"issue":"8","key":"2020110612371576700_ocx138-B41","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1002\/ppul.20644","article-title":"Prevalence of asthma-like symptoms in young children","volume":"42","author":"Bisgaard","year":"2007","journal-title":"Pediatric Pulmonol."},{"issue":"11","key":"2020110612371576700_ocx138-B42","doi-asserted-by":"crossref","first-page":"1529","DOI":"10.1111\/j.1398-9995.2008.01749.x","article-title":"Timeliness of diagnosis of asthma in children and its predictors","volume":"63","author":"Molis","year":"2008","journal-title":"Allergy."},{"issue":"1","key":"2020110612371576700_ocx138-B43","doi-asserted-by":"crossref","first-page":"79","DOI":"10.4104\/pcrj.2010.00076","article-title":"Characterisation of children\u2019s asthma status by ICD-9 code and criteria-based medical record review","volume":"20","author":"Juhn","year":"2011","journal-title":"Prim Care Respir J."},{"issue":"4","key":"2020110612371576700_ocx138-B44","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1016\/S0091-6749(97)70072-1","article-title":"Attained adult height after childhood asthma: effect of glucocorticoid therapy","volume":"99","author":"Silverstein","year":"1997","journal-title":"J Allergy Clin Immunol."},{"issue":"1","key":"2020110612371576700_ocx138-B45","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/S0091-6749(99)70525-7","article-title":"Allergic rhinitis in Rochester, Minnesota residents with asthma: frequency and impact on health care charges","volume":"103","author":"Yawn","year":"1999","journal-title":"J Allergy Clin Immunol."},{"issue":"2","key":"2020110612371576700_ocx138-B46","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1378\/chest.111.2.303","article-title":"Incidence and outcomes of asthma in the elderly: a population-based study in Rochester, 
Minnesota","volume":"111","author":"Bauer","year":"1997","journal-title":"Chest."},{"issue":"15","key":"2020110612371576700_ocx138-B47","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1001\/jama.1993.03500150059027","article-title":"Accuracy of the death certificate in a population-based study of asthmatic patients","volume":"269","author":"Hunt","year":"1993","journal-title":"JAMA."},{"issue":"4","key":"2020110612371576700_ocx138-B48","doi-asserted-by":"crossref","first-page":"838","DOI":"10.1016\/j.jaci.2009.12.998","article-title":"The influence of neighborhood environment on the incidence of childhood asthma: a propensity score approach","volume":"125","author":"Juhn","year":"2010","journal-title":"J Allergy Clin Immunol."},{"issue":"4","key":"2020110612371576700_ocx138-B49","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1016\/S1081-1206(10)60937-4","article-title":"Childhood asthma and measles vaccine response","volume":"97","author":"Juhn","year":"2006","journal-title":"Ann Allergy Asthma Immunol."},{"issue":"11","key":"2020110612371576700_ocx138-B50","doi-asserted-by":"crossref","first-page":"e112774","DOI":"10.1371\/journal.pone.0112774","article-title":"Negation\u2019s not solved: generalizability versus optimizability in clinical natural language processing","volume":"9","author":"Wu","year":"2014","journal-title":"PLoS One."},{"key":"2020110612371576700_ocx138-B51","first-page":"1","article-title":"Trends in asthma prevalence, health care use, and mortality in the United States, 2001-2010","volume":"94","author":"Akinbami","year":"2012","journal-title":"NCHS Data Brief."}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/25\/3\/353\/34150205\/ocx138.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/25\/3\/353\/34150205\/ocx138.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,28]],"date-time":"2024-06-28T01:10:44Z","timestamp":1719537044000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/25\/3\/353\/4677327"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,30]]},"references-count":51,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2017,11,30]]},"published-print":{"date-parts":[[2018,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocx138","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,3]]},"published":{"date-parts":[[2017,11,30]]}}}