{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T04:08:27Z","timestamp":1770350907982,"version":"3.49.0"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Genome-wide association studies (GWASs) are effective for describing genetic complexities of common diseases. Phenome-wide association studies (PheWASs) offer an alternative and complementary approach to GWAS using data embedded in the electronic health record (EHR) to define the phenome. International Classification of Disease version 9 (ICD9) codes are used frequently to define the phenome, but using ICD9 codes alone misses other clinically relevant information from the EHR that can be used for PheWAS analyses and discovery.<\/jats:p>\n               <jats:p>Results: As an alternative to ICD9 coding, a text-based phenome was defined by 23\u2009384 clinically relevant terms extracted from Marshfield Clinic\u2019s EHR. Five single nucleotide polymorphisms (SNPs) with known phenotypic associations were genotyped in 4235 individuals and associated across the text-based phenome. All five SNPs genotyped were associated with expected terms (P\u2009&lt;\u20090.02), most at or near the top of their respective PheWAS ranking. 
Raw association results indicate that text data performed equivalently to ICD9 coding and demonstrate the utility of information beyond ICD9 coding for application in PheWAS.<\/jats:p>\n               <jats:p>Contact: hebbring.scott@mcrf.mfldclin.edu<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv076","type":"journal-article","created":{"date-parts":[[2015,2,6]],"date-time":"2015-02-06T04:29:59Z","timestamp":1423196999000},"page":"1981-1987","source":"Crossref","is-referenced-by-count":36,"title":["Application of clinical text data for phenome-wide association studies (PheWASs)"],"prefix":"10.1093","volume":"31","author":[{"given":"Scott J.","family":"Hebbring","sequence":"first","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Majid","family":"Rastegar-Mojarad","sequence":"additional","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhan","family":"Ye","sequence":"additional","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John","family":"Mayer","sequence":"additional","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, 
Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Crystal","family":"Jacobson","sequence":"additional","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Simon","family":"Lin","sequence":"additional","affiliation":[{"name":"1 Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and 2Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2015,2,4]]},"reference":[{"key":"2023020115204664500_btv076-B1","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1186\/1471-2105-12-420","article-title":"BioNOT: a searchable database of biomedical negated sentences","volume":"12","author":"Agarwal","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023020115204664500_btv076-B2","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023020115204664500_btv076-B3","doi-asserted-by":"crossref","first-page":"2375","DOI":"10.1093\/bioinformatics\/btu197","article-title":"R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment","volume":"30","author":"Carroll","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020115204664500_btv076-B4","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide 
scan to discover gene-disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020115204664500_btv076-B5","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1016\/j.ajhg.2011.09.008","article-title":"Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies","volume":"89","author":"Denny","year":"2011","journal-title":"Am. J. Hum. Genet."},{"key":"2023020115204664500_btv076-B6","doi-asserted-by":"crossref","first-page":"1102","DOI":"10.1038\/nbt.2749","article-title":"Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data","volume":"31","author":"Denny","year":"2013","journal-title":"Nat. Biotechnol."},{"key":"2023020115204664500_btv076-B7","doi-asserted-by":"crossref","first-page":"1873","DOI":"10.1167\/iovs.09-4000","article-title":"Inverse association of female hormone replacement therapy with age-related macular degeneration and interactions with ARMS2 polymorphisms","volume":"51","author":"Edwards","year":"2010","journal-title":"Invest. Ophthalmol. Vis. Sci."},{"key":"2023020115204664500_btv076-B8","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1001\/archopht.126.4.519","article-title":"Menopausal and reproductive factors and risk of age-related macular degeneration","volume":"126","author":"Feskanich","year":"2008","journal-title":"Arch. 
Ophthalmol."},{"key":"2023020115204664500_btv076-B9","doi-asserted-by":"crossref","first-page":"E215","DOI":"10.1161\/01.CIR.101.23.e215","article-title":"PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals","volume":"101","author":"Goldberger","year":"2000","journal-title":"Circulation"},{"key":"2023020115204664500_btv076-B10","doi-asserted-by":"crossref","first-page":"1696","DOI":"10.1056\/NEJMp0806284","article-title":"Common genetic variation and human traits","volume":"360","author":"Goldstein","year":"2009","journal-title":"N. Engl. J. Med."},{"key":"2023020115204664500_btv076-B11","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1111\/imm.12195","article-title":"The challenges, advantages and future of phenome-wide association studies","volume":"141","author":"Hebbring","year":"2014","journal-title":"Immunology"},{"key":"2023020115204664500_btv076-B12","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1038\/gene.2013.2","article-title":"A PheWAS approach in studying HLA-DRB1*1501","volume":"14","author":"Hebbring","year":"2013","journal-title":"Genes Immun."},{"key":"2023020115204664500_btv076-B13","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1186\/1472-6963-10-99","article-title":"Do coder characteristics influence validity of ICD-10 hospital discharge data?","volume":"10","author":"Hennessy","year":"2010","journal-title":"BMC Health Serv. Res."},{"key":"2023020115204664500_btv076-B14","volume-title":"A Catalog of Published Genome-Wide Association Studies","author":"Hindorff","year":"2012"},{"key":"2023020115204664500_btv076-B15","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1136\/amiajnl-2013-001612","article-title":"Mining clinical text for signals of adverse drug-drug interactions","volume":"21","author":"Iyer","year":"2014","journal-title":"J. Am. Med. Inform. 
Assoc."},{"key":"2023020115204664500_btv076-B16","doi-asserted-by":"crossref","first-page":"e89324","DOI":"10.1371\/journal.pone.0089324","article-title":"Automated detection of off-label drug use","volume":"9","author":"Jung","year":"2014","journal-title":"PLoS One"},{"key":"2023020115204664500_btv076-B17","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/978-3-642-38457-8_7","article-title":"Unsupervised extraction of diagnosis codes from EMRs using knowledge-based and extractive text summarization techniques","volume-title":"Advances in Artificial Intelligence: Lecture Notes in Computer Science, Volume 7884","author":"Kavuluru","year":"2013"},{"key":"2023020115204664500_btv076-B18","first-page":"652","article-title":"Banner: an executable survey of advances in biomedical named entity recognition","volume":"13","author":"Leaman","year":"2008","journal-title":"Pac. Symp. Biocomp."},{"key":"2023020115204664500_btv076-B19","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1007\/s10072-006-0721-9","article-title":"Inter-coder agreement for ICD-9-CM coding of stroke","volume":"27","author":"Leone","year":"2006","journal-title":"Neurol. Sci."},{"key":"2023020115204664500_btv076-B20","first-page":"40","article-title":"The unified medical language system (UMLS) of the national library of medicine","volume":"61","author":"Lindberg","year":"1990","journal-title":"J. Am. Med. Rec. Assoc."},{"key":"2023020115204664500_btv076-B21","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1055\/s-0038-1634945","article-title":"The unified medical language system","volume":"32","author":"Lindberg","year":"1993","journal-title":"Methods Inf. Med."},{"key":"2023020115204664500_btv076-B22","first-page":"47","article-title":"Using temporal patterns in medical records to discern adverse drug events from indications","volume":"2012","author":"Liu","year":"2012","journal-title":"AMIA Summits Transl. Sci. 
Proc."},{"key":"2023020115204664500_btv076-B23","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"key":"2023020115204664500_btv076-B24","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1136\/amiajnl-2014-002694","article-title":"N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit","volume":"21","author":"Marafino","year":"2014","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"2023020115204664500_btv076-B25","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1038\/nrg2344","article-title":"Genome-wide association studies for complex traits: consensus, uncertainty and challenges","volume":"9","author":"McCarthy","year":"2008","journal-title":"Nat. Rev. Genet."},{"key":"2023020115204664500_btv076-B26","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1517\/17410541.2.1.49","article-title":"Marshfield Clinic personalized medicine research project (PMRP): design, methods and recruitment for a large population-based biobank","volume":"2","author":"McCarty","year":"2005","journal-title":"Per. Med."},{"key":"2023020115204664500_btv076-B27","doi-asserted-by":"crossref","first-page":"3026","DOI":"10.1002\/ajmg.a.32559","article-title":"Community consultation and communication for a population-based DNA biobank: the Marshfield Clinic personalized medicine research project","volume":"146A","author":"McCarty","year":"2008","journal-title":"Am. J. Med. Genet. A"},{"key":"2023020115204664500_btv076-B28","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/1472-6947-8-32","article-title":"Automated de-identification of free-text medical records","volume":"8","author":"Neamatullah","year":"2008","journal-title":"BMC Med. Inform. Decis. 
Mak."},{"key":"2023020115204664500_btv076-B29","doi-asserted-by":"crossref","first-page":"37","DOI":"10.31887\/DCNS.2010.12.1\/aneed","article-title":"Whole genome association studies in complex diseases: where do we stand?","volume":"12","author":"Need","year":"2010","journal-title":"Dialogues Clin. Neurosci."},{"key":"2023020115204664500_btv076-B30","doi-asserted-by":"crossref","first-page":"e1003405","DOI":"10.1371\/journal.pcbi.1003405","article-title":"Phenome-wide association studies on a quantitative trait: application to TPMT enzyme activity and thiopurine therapy in pharmacogenomics","volume":"9","author":"Neuraz","year":"2013","journal-title":"PLoS Comput. Biol."},{"key":"2023020115204664500_btv076-B31","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/978-1-59745-547-3_9","article-title":"Mining biomedical data using MetaMap transfer (MMtx) and the unified medical language system (UMLS)","volume":"408","author":"Osborne","year":"2007","journal-title":"Methods Mol. Biol."},{"key":"2023020115204664500_btv076-B32","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1002\/gepi.20589","article-title":"The use of phenome-wide association studies (PheWAS) for exploration of novel genotype-phenotype relationships and pleiotropy discovery","volume":"35","author":"Pendergrass","year":"2011","journal-title":"Genet. 
Epidemiol."},{"key":"2023020115204664500_btv076-B33","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1186\/1756-0381-5-5","article-title":"Visually integrating and exploring high throughput phenome-wide association study (PheWAS) results using PheWAS-View","volume":"5","author":"Pendergrass","year":"2012","journal-title":"BioData Min."},{"key":"2023020115204664500_btv076-B34","doi-asserted-by":"crossref","first-page":"e1003087","DOI":"10.1371\/journal.pgen.1003087","article-title":"Phenome-wide association study (PheWAS) for detection of pleiotropy within the population architecture using genomics and epidemiology (PAGE) network","volume":"9","author":"Pendergrass","year":"2013","journal-title":"PLoS Genet."},{"key":"2023020115204664500_btv076-B40","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1038\/nbt.3183","article-title":"Opportunities for drug repositioning from phenome-wide association studies","volume":"33","author":"Rastegar-Mojarad","year":"2015","journal-title":"Nature Biotechnology"},{"key":"2023020115204664500_btv076-B35","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1161\/CIRCULATIONAHA.112.000604","article-title":"Genome- and phenome-wide analyses of cardiac conduction identifies markers of arrhythmia risk","volume":"127","author":"Ritchie","year":"2013","journal-title":"Circulation"},{"key":"2023020115204664500_btv076-B36","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1007\/s00439-013-1355-7","article-title":"A genome- and phenome-wide association study to identify genetic variants influencing platelet count and volume and their pleiotropic effects","volume":"133","author":"Shameer","year":"2014","journal-title":"Hum. 
Genet."},{"key":"2023020115204664500_btv076-B37","doi-asserted-by":"crossref","first-page":"e19586","DOI":"10.1371\/journal.pone.0019586","article-title":"Knowledge-driven multi-locus analysis reveals gene-gene interactions influencing HDL cholesterol level in two independent EMR-linked biobanks","volume":"6","author":"Turner","year":"2011","journal-title":"PLoS One"},{"key":"2023020115204664500_btv076-B38","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/j.ajhg.2011.11.029","article-title":"Five years of GWAS discovery","volume":"90","author":"Visscher","year":"2012","journal-title":"Am. J. Hum. Genet."},{"key":"2023020115204664500_btv076-B39","article-title":"Phenome-wide association studies (PheWASs) for functional variants","volume":"2014","author":"Ye","year":"2014","journal-title":"Eur. J. Hum. Genet."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/12\/1981\/49014029\/bioinformatics_31_12_1981.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/12\/1981\/49014029\/bioinformatics_31_12_1981.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:03:41Z","timestamp":1675296221000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/12\/1981\/214153"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,2,4]]},"references-count":40,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv076","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,6,15]]},"pu
blished":{"date-parts":[[2015,2,4]]}}}