{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T15:02:13Z","timestamp":1769180533617,"version":"3.49.0"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2018,4,18]],"date-time":"2018-04-18T00:00:00Z","timestamp":1524009600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Vanderbilt Faculty Research Scholars Fund"},{"name":"JDM"},{"DOI":"10.13039\/100000968","name":"American Heart Association","doi-asserted-by":"publisher","award":["16FTF30130005"],"award-info":[{"award-number":["16FTF30130005"]}],"id":[{"id":"10.13039\/100000968","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Burroughs-Wellcome Innovation in Regulatory Science Award","award":["1015006"],"award-info":[{"award-number":["1015006"]}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"NCATS","doi-asserted-by":"publisher","award":["KL2 TR 000446"],"award-info":[{"award-number":["KL2 TR 
000446"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"NLM","doi-asserted-by":"publisher","award":["R01-LM0010685"],"award-info":[{"award-number":["R01-LM0010685"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"NIGMS","doi-asserted-by":"publisher","award":["R01-GM124109"],"award-info":[{"award-number":["R01-GM124109"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Vanderbilt University Medical Center\u2019s SD","award":["1S10RR025141-01"],"award-info":[{"award-number":["1S10RR025141-01"]}]},{"DOI":"10.13039\/100016220","name":"CTSA","doi-asserted-by":"crossref","award":["UL1TR000445"],"award-info":[{"award-number":["UL1TR000445"]}],"id":[{"id":"10.13039\/100016220","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Phenome-wide association studies (PheWAS) have been used to discover many genotype-phenotype relationships and have the potential to identify therapeutic and adverse drug outcomes using longitudinal data within electronic health records (EHRs). 
However, the statistical methods for PheWAS applied to longitudinal EHR medication data have not been established.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this study, we developed methods to address two challenges faced with reuse of EHR for this purpose: confounding by indication, and low exposure and event rates. We used Monte Carlo simulation to assess propensity score (PS) methods, focusing on two of the most commonly used methods, PS matching and PS adjustment, to address confounding by indication. We also compared two logistic regression approaches (the default of Wald versus Firth\u2019s penalized maximum likelihood, PML) to address complete separation due to sparse data with low exposure and event rates. PS adjustment resulted in greater power than PS matching, while controlling Type I error at 0.05. The PML method provided reasonable P-values, even in cases with complete separation, with well controlled Type I error rates. Using PS adjustment and the PML method, we identify novel latent drug effects in pediatric patients exposed to two common antibiotic drugs, ampicillin and gentamicin.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>R packages PheWAS and EHR are available at https:\/\/github.com\/PheWAS\/PheWAS and at CRAN (https:\/\/www.r-project.org\/), respectively. 
The R script for data processing and the main analysis is available at https:\/\/github.com\/choileena\/EHR.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty306","type":"journal-article","created":{"date-parts":[[2018,4,17]],"date-time":"2018-04-17T06:38:49Z","timestamp":1523947129000},"page":"2988-2996","source":"Crossref","is-referenced-by-count":16,"title":["Evaluating statistical approaches to leverage large clinical datasets for uncovering therapeutic and adverse medication effects"],"prefix":"10.1093","volume":"34","author":[{"given":"Leena","family":"Choi","sequence":"first","affiliation":[{"name":"Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Robert J","family":"Carroll","sequence":"additional","affiliation":[{"name":"Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Cole","family":"Beck","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Jonathan D","family":"Mosley","sequence":"additional","affiliation":[{"name":"Medicine, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Dan M","family":"Roden","sequence":"additional","affiliation":[{"name":"Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA"},{"name":"Medicine, Vanderbilt University Medical Center, Nashville, TN, USA"},{"name":"Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Joshua C","family":"Denny","sequence":"additional","affiliation":[{"name":"Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA"},{"name":"Medicine, Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Sara L","family":"Van 
Driest","sequence":"additional","affiliation":[{"name":"Medicine, Vanderbilt University Medical Center, Nashville, TN, USA"},{"name":"Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,4,18]]},"reference":[{"key":"2023061313384546100_bty306-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/biomet\/71.1.1","article-title":"On the existence of maximum likelihood estimates in logistic regression models","volume":"71","author":"Albert","year":"1984","journal-title":"Biometrika"},{"key":"2023061313384546100_bty306-B2","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1016\/j.jclinepi.2014.08.011","article-title":"Reporting of covariate selection and balance assessment in propensity score analysis is suboptimal: a systematic review","volume":"68","author":"Ali","year":"2015","journal-title":"J. Clin. Epidemiol"},{"key":"2023061313384546100_bty306-B3","doi-asserted-by":"crossref","first-page":"150","DOI":"10.1002\/pst.433","article-title":"Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies","volume":"10","author":"Austin","year":"2011","journal-title":"Pharm. Stat"},{"key":"2023061313384546100_bty306-B4","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.1093\/jamia\/ocv046","article-title":"Birth month affects lifetime disease risk: a phenome-wide method","volume":"22","author":"Boland","year":"2015","journal-title":"J. Am. Med. Inform. 
Assoc"},{"key":"2023061313384546100_bty306-B5","doi-asserted-by":"crossref","first-page":"2375","DOI":"10.1093\/bioinformatics\/btu197","article-title":"R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment","volume":"30","author":"Carroll","year":"2014","journal-title":"Bioinformatics"},{"key":"2023061313384546100_bty306-B6","author":"Choi","year":"2011"},{"key":"2023061313384546100_bty306-B7","author":"Choi","year":"2017"},{"key":"2023061313384546100_bty306-B8","doi-asserted-by":"crossref","first-page":"e0121263.","DOI":"10.1371\/journal.pone.0121263","article-title":"Elucidating the foundations of statistical inference with 2 x 2 tables","volume":"10","author":"Choi","year":"2015","journal-title":"PLoS ONE"},{"key":"2023061313384546100_bty306-B9","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"key":"2023061313384546100_bty306-B10","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1016\/j.ajhg.2011.09.008","article-title":"Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies","volume":"89","author":"Denny","year":"2011","journal-title":"Am. J. Human Genet"},{"key":"2023061313384546100_bty306-B11","doi-asserted-by":"crossref","first-page":"1102","DOI":"10.1038\/nbt.2749","article-title":"Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data","volume":"31","author":"Denny","year":"2013","journal-title":"Nat. 
Biotechnol"},{"key":"2023061313384546100_bty306-B12","author":"Dupont","year":"2016"},{"key":"2023061313384546100_bty306-B13","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/biomet\/80.1.27","article-title":"Bias reduction of maximum likelihood estimates","volume":"80","author":"Firth","year":"1993","journal-title":"Biometrika"},{"key":"2023061313384546100_bty306-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Software"},{"key":"2023061313384546100_bty306-B15","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.yebeh.2015.08.014","article-title":"Comparative effectiveness of generic versus brand-name antiepileptic medications","volume":"52","author":"Gagne","year":"2015","journal-title":"Epilepsy Behav"},{"key":"2023061313384546100_bty306-B16","doi-asserted-by":"crossref","first-page":"630.","DOI":"10.1001\/jamapsychiatry.2016.0432","article-title":"Self-harm, unintentional injury, and suicide in bipolar disorder during maintenance mood stabilizer treatment","volume":"73","author":"Hayes","year":"2016","journal-title":"JAMA Psychiatry"},{"key":"2023061313384546100_bty306-B17","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1111\/imm.12195","article-title":"The challenges, advantages and future of phenome-wide association studies","volume":"141","author":"Hebbring","year":"2014","journal-title":"Immunology"},{"key":"2023061313384546100_bty306-B18","author":"Heinze","year":"2016"},{"key":"2023061313384546100_bty306-B19","doi-asserted-by":"crossref","first-page":"2409","DOI":"10.1002\/sim.1047","article-title":"A solution to the problem of separation in logistic regression","volume":"21","author":"Heinze","year":"2002","journal-title":"Stat. 
Med"},{"key":"2023061313384546100_bty306-B20","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1038\/mp.2015.126","article-title":"Phenome-wide analysis of genome-wide polygenic scores","volume":"21","author":"Krapohl","year":"2016","journal-title":"Mol. Psychiatry"},{"key":"2023061313384546100_bty306-B21","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1002\/art.37801","article-title":"Associations of autoantibodies, autoimmune risk alleles, and clinical diagnoses from the electronic medical records in rheumatoid arthritis cases and non-rheumatoid arthritis controls","volume":"65","author":"Liao","year":"2013","journal-title":"Arthr. Rheumatism"},{"key":"2023061313384546100_bty306-B22","doi-asserted-by":"crossref","first-page":"e1003405.","DOI":"10.1371\/journal.pcbi.1003405","article-title":"Phenome-wide association studies on a quantitative trait: application to TPMT enzyme activity and thiopurine therapy in pharmacogenomics","volume":"9","author":"Neuraz","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023061313384546100_bty306-B23","author":"R Core Team","year":"2017"},{"key":"2023061313384546100_bty306-B24","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1038\/nbt.3183","article-title":"Opportunities for drug repositioning from phenome-wide association studies","volume":"33","author":"Rastegar-Mojarad","year":"2015","journal-title":"Nat. 
Biotechnol"},{"key":"2023061313384546100_bty306-B25","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1161\/CIRCULATIONAHA.112.000604","article-title":"Genome- and phenome-wide analyses of cardiac conduction identifies markers of arrhythmia risk","volume":"127","author":"Ritchie","year":"2013","journal-title":"Circulation"},{"key":"2023061313384546100_bty306-B26","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1038\/clpt.2008.89","article-title":"Development of a large-scale de-identified DNA biobank to enable personalized medicine","volume":"84","author":"Roden","year":"2008","journal-title":"Clin. Pharmacol. Ther"},{"key":"2023061313384546100_bty306-B27","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1080\/01621459.1987.10478441","article-title":"Model-based direct adjustment","volume":"82","author":"Rosenbaum","year":"1987","journal-title":"J. Am. Stat. Assoc"},{"key":"2023061313384546100_bty306-B28","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1093\/biomet\/70.1.41","article-title":"The central role of the propensity score in observational studies for causal effects","volume":"70","author":"Rosenbaum","year":"1983","journal-title":"Biometrika"},{"key":"2023061313384546100_bty306-B30","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1080\/00031305.1985.10479383","article-title":"Constructing a control group using multivariate matched sampling methods that incorporate the propensity score","volume":"39","author":"Rosenbaum","year":"2012","journal-title":"Am. Stat"},{"key":"2023061313384546100_bty306-B29","doi-asserted-by":"crossref","first-page":"516.","DOI":"10.1080\/01621459.1984.10478078","article-title":"Reducing bias in observational studies using subclassification on the propensity score","volume":"79","author":"Rosenbaum","year":"2012","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023061313384546100_bty306-B31","author":"Rothman","year":"2015"},{"key":"2023061313384546100_bty306-B32","doi-asserted-by":"crossref","first-page":"e76","DOI":"10.1038\/psp.2013.52","article-title":"Medication-wide association studies","volume":"2","author":"Ryan","year":"2013","journal-title":"CPT Pharm. Syst. Pharmacol"},{"key":"2023061313384546100_bty306-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v042.i07","article-title":"Multivariate and propensity score matching software with automated balance optimization: the Matching package for R","volume":"42","author":"Sekhon","year":"2011","journal-title":"J. Stat. Software"},{"key":"2023061313384546100_bty306-B34","doi-asserted-by":"crossref","first-page":"1176","DOI":"10.1002\/pds.1836","article-title":"Data mining on electronic health record databases for signal detection in pharmacovigilance: which events to monitor?","volume":"18","author":"Trifir\u00f2","year":"2009","journal-title":"Pharmacoepidemiol. Drug Saf"},{"key":"2023061313384546100_bty306-B35","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1197\/jamia.M3378","article-title":"MedEx: a medication information extraction system for clinical narratives","volume":"17","author":"Xu","year":"2010","journal-title":"J. Am. Med. Inform. Assoc"},{"key":"2023061313384546100_bty306-B36","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1093\/jamia\/ocu018","article-title":"Personal health record use for children and health care utilization: propensity score-matched cohort analysis","volume":"22","author":"Zhou","year":"2015","journal-title":"J. Am. Med. Inform. Assoc"},{"key":"2023061313384546100_bty306-B37","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B (Stat. 
Methodol.)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/17\/2988\/50581991\/bioinformatics_34_17_2988.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/17\/2988\/50581991\/bioinformatics_34_17_2988.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,6]],"date-time":"2024-07-06T02:30:58Z","timestamp":1720233058000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/17\/2988\/4975417"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,4,18]]},"references-count":37,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2018,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty306","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,9,1]]},"published":{"date-parts":[[2018,4,18]]}}}