{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T15:39:13Z","timestamp":1766158753335,"version":"3.37.3"},"reference-count":59,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,7,23]],"date-time":"2021-07-23T00:00:00Z","timestamp":1626998400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,23]],"date-time":"2021-07-23T00:00:00Z","timestamp":1626998400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Labeling clinical data from electronic health records (EHR) in health systems requires extensive knowledge of human expert, and painstaking review by clinicians. Furthermore, existing phenotyping algorithms are not uniformly applied across large datasets and can suffer from inconsistencies in case definitions across different algorithms. We describe here quantitative disease risk scores based on almost unsupervised methods that require minimal input from clinicians, can be applied to large datasets, and alleviate some of the main weaknesses of existing phenotyping algorithms. We show applications to phenotypic data on approximately 100,000 individuals in eMERGE, and focus on several complex diseases, including Chronic Kidney Disease, Coronary Artery Disease, Type 2 Diabetes, Heart Failure, and a few others. 
We demonstrate that relative to existing approaches, the proposed methods have higher prediction accuracy, can better identify phenotypic features relevant to the disease under consideration, can perform better at clinical risk stratification, and can identify undiagnosed cases based on phenotypic features available in the EHR. Using genetic data from the eMERGE-seq panel that includes sequencing data for 109 genes on 21,363 individuals from multiple ethnicities, we also show how the new quantitative disease risk scores help improve the power of genetic association studies relative to the standard use of disease phenotypes. The results demonstrate the effectiveness of quantitative disease risk scores derived from rich phenotypic EHR databases to provide a more meaningful characterization of clinical risk for diseases of interest beyond the prevalent binary (case-control) classification.<\/jats:p>","DOI":"10.1038\/s41746-021-00488-3","type":"journal-article","created":{"date-parts":[[2021,7,23]],"date-time":"2021-07-23T10:03:49Z","timestamp":1627034629000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Quantitative disease risk scores from EHR with applications to clinical risk stratification and genetic 
studies"],"prefix":"10.1038","volume":"4","author":[{"given":"Danqing","family":"Xu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9127-0054","authenticated-orcid":false,"given":"Chen","family":"Wang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6651-2725","authenticated-orcid":false,"given":"Atlas","family":"Khan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7040-5204","authenticated-orcid":false,"given":"Ning","family":"Shang","sequence":"additional","affiliation":[]},{"given":"Zihuai","family":"He","sequence":"additional","affiliation":[]},{"given":"Adam","family":"Gordon","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6524-3471","authenticated-orcid":false,"given":"Iftikhar J.","family":"Kullo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1905-8806","authenticated-orcid":false,"given":"Shawn","family":"Murphy","sequence":"additional","affiliation":[]},{"given":"Yizhao","family":"Ni","sequence":"additional","affiliation":[]},{"given":"Wei-Qi","family":"Wei","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2801-233X","authenticated-orcid":false,"given":"Ali","family":"Gharavi","sequence":"additional","affiliation":[]},{"given":"Krzysztof","family":"Kiryluk","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9624-0214","authenticated-orcid":false,"given":"Chunhua","family":"Weng","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9001-2026","authenticated-orcid":false,"given":"Iuliana","family":"Ionita-Laza","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,7,23]]},"reference":[{"key":"488_CR1","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1038\/gim.2013.72","volume":"15","author":"O 
Gottesman","year":"2013","unstructured":"Gottesman, O. et al. The electronic medical records and genomics (eMERGE) network: past, present, and future. Genet. Med. 15, 761 (2013).","journal-title":"Genet. Med."},{"key":"488_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1755-8794-4-13","volume":"4","author":"CA McCarty","year":"2011","unstructured":"McCarty, C. A. et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med. Genomics 4, 1\u201311 (2011).","journal-title":"BMC Med. Genomics"},{"key":"488_CR3","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1111\/j.1752-8062.2010.00175.x","volume":"3","author":"J Pulley","year":"2010","unstructured":"Pulley, J., Clayton, E., Bernard, G. R., Roden, D. M. & Masys, D. R. Principles of human subjects protections applied in an opt-out, de-identified biobank. Clin. Transl. Sci. 3, 42\u201348 (2010).","journal-title":"Clin. Transl. Sci."},{"key":"488_CR4","doi-asserted-by":"publisher","first-page":"906","DOI":"10.1038\/gim.2015.187","volume":"18","author":"DJ Carey","year":"2016","unstructured":"Carey, D. J. et al. The Geisinger MyCode community health initiative: an electronic health record\u2013linked biobank for precision medicine research. Genet. Med. 18, 906 (2016).","journal-title":"Genet. Med."},{"key":"488_CR5","unstructured":"Murphy, S. N., Mendis, M. E., Berkowitz, D. A., Kohane, I. & Chueh, H. C. Integration of clinical and genetic data in the i2b2 architecture. In AMIA Annual Symposium Proceedings, Vol. 2006, 1040 (American Medical Informatics Association, 2006)."},{"key":"488_CR6","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41586-018-0579-z","volume":"562","author":"C Bycroft","year":"2018","unstructured":"Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. 
Nature 562, 203\u2013209 (2018).","journal-title":"Nature"},{"key":"488_CR7","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1016\/j.cell.2019.02.039","volume":"177","author":"NS Abul-Husn","year":"2019","unstructured":"Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177, 58\u201369 (2019).","journal-title":"Cell"},{"key":"488_CR8","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1038\/nrg2999","volume":"12","author":"IS Kohane","year":"2011","unstructured":"Kohane, I. S. Using electronic health records to drive discovery in disease genomics. Nat. Rev. Genet. 12, 417\u2013428 (2011).","journal-title":"Nat. Rev. Genet."},{"key":"488_CR9","doi-asserted-by":"publisher","first-page":"R14","DOI":"10.1093\/hmg\/ddy081","volume":"27","author":"BN Wolford","year":"2018","unstructured":"Wolford, B. N., Willer, C. J. & Surakka, I. Electronic health records: the next wave of complex disease genetics. Hum. Mol. Genet. 27, R14\u2013R21 (2018).","journal-title":"Hum. Mol. Genet."},{"key":"488_CR10","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","volume":"23","author":"JC Kirby","year":"2016","unstructured":"Kirby, J. C. et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J. Am. Med. Inform. Assoc. 23, 1046\u20131052 (2016).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"488_CR11","doi-asserted-by":"publisher","first-page":"e319","DOI":"10.1136\/amiajnl-2013-001952","volume":"20","author":"R RL","year":"2013","unstructured":"RL, R. et al. A comparison of phenotype definitions for diabetes mellitus. J. Am. Med. Inform. Assoc. 20, e319\u2013e326 (2013).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"488_CR12","doi-asserted-by":"publisher","first-page":"872","DOI":"10.1038\/nrg2670","volume":"10","author":"R Plomin","year":"2009","unstructured":"Plomin, R., Haworth, C. M. & Davis, O. S. 
Common disorders are quantitative traits. Nat. Rev. Genet. 10, 872\u2013878 (2009).","journal-title":"Nat. Rev. Genet."},{"key":"488_CR13","doi-asserted-by":"publisher","first-page":"1369","DOI":"10.1007\/s00439-014-1466-9","volume":"133","author":"JA Sinnott","year":"2014","unstructured":"Sinnott, J. A. et al. Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records. Hum. Genet. 133, 1369\u20131382 (2014).","journal-title":"Hum. Genet."},{"key":"488_CR14","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1126\/science.aal4043","volume":"359","author":"L Bastarache","year":"2018","unstructured":"Bastarache, L. et al. Phenotype risk scores identify patients with unrecognized Mendelian disease patterns. Science 359, 1233\u20131239 (2018).","journal-title":"Science"},{"key":"488_CR15","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1093\/jamia\/ocx111","volume":"25","author":"S Yu","year":"2018","unstructured":"Yu, S. et al. Enabling phenotypic big data with phenorm. J. Am. Med. Inform. Assoc. 25, 54\u201360 (2018).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"488_CR16","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.1038\/nbt.2749","volume":"31","author":"JC Denny","year":"2013","unstructured":"Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31, 1102 (2013).","journal-title":"Nat. Biotechnol."},{"key":"488_CR17","doi-asserted-by":"publisher","first-page":"588","DOI":"10.1016\/j.ajhg.2019.07.018","volume":"105","author":"eMERGE Consortium.","year":"2019","unstructured":"eMERGE Consortium. Harmonizing clinical sequencing and interpretation for the eMERGE III network. Am. J. Hum. Genet. 105, 588\u2013605 (2019).","journal-title":"Am. J. Hum. 
Genet."},{"key":"488_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41746-021-00428-1","volume":"4","author":"N Shang","year":"2021","unstructured":"Shang, N. et al. Medical records-based chronic kidney disease phenotype for clinical care and \u201cbig data\u201d observational and genetic studies. npj Digit. Med. 4, 1\u201313 (2021).","journal-title":"npj Digit. Med."},{"key":"488_CR19","doi-asserted-by":"publisher","first-page":"1219","DOI":"10.1038\/s41588-018-0183-z","volume":"50","author":"AV Khera","year":"2018","unstructured":"Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219\u20131224 (2018).","journal-title":"Nat. Genet."},{"key":"488_CR20","unstructured":"Denny, J. & Basford, M. Type 2 Diabetes - Demonstration Project https:\/\/phekb.org\/phenotype\/73 (2012)."},{"key":"488_CR21","unstructured":"Bielinski, S. J. Heart Failure (HF) with Differentiation between Preserved and Reduced Ejection Fraction https:\/\/phekb.org\/phenotype\/147 (2013)."},{"key":"488_CR22","unstructured":"Carlson, C. Dementia https:\/\/phekb.org\/phenotype\/10 (2012)."},{"key":"488_CR23","unstructured":"CHOP Phenotyping group, CHOP. Gastroesophageal Reflux Disease (GERD) Phenotype Algorithm https:\/\/phekb.org\/phenotype\/224 (2014)."},{"key":"488_CR24","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41586-018-0579-z","volume":"562","author":"C Bycroft","year":"2018","unstructured":"Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203\u2013209 (2018).","journal-title":"Nature"},{"key":"488_CR25","first-page":"351","volume":"26","author":"S Wager","year":"2013","unstructured":"Wager, S., Wang, S. & Liang, P. Dropout training as adaptive regularization. Adv. Neural Inf. Process. Syst. 26, 351\u2013359 (2013).","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"488_CR26","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1038\/gim.2013.73","volume":"15","author":"RC Green","year":"2013","unstructured":"Green, R. C. et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet. Med. 15, 565\u2013574 (2013).","journal-title":"Genet. Med."},{"key":"488_CR27","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1016\/j.ajhg.2012.06.007","volume":"91","author":"S Lee","year":"2012","unstructured":"Lee, S. et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am. J.Hum. Genet. 91, 224\u2013237 (2012).","journal-title":"Am. J.Hum. Genet."},{"key":"488_CR28","doi-asserted-by":"publisher","first-page":"340","DOI":"10.1016\/j.ajhg.2017.07.011","volume":"101","author":"Z He","year":"2017","unstructured":"He, Z., Xu, B., Lee, S. & Ionita-Laza, I. Unified sequence-based association tests allowing for multiple functional annotations and meta-analysis of noncoding variation in metabochip data. Am. J. Hum. Genet. 101, 340\u2013352 (2017).","journal-title":"Am. J. Hum. Genet."},{"key":"488_CR29","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1080\/01621459.2018.1554485","volume":"115","author":"Y Liu","year":"2020","unstructured":"Liu, Y. & Xie, J. Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. J. Am. Stat. Assoc. 115, 393\u2013402 (2020).","journal-title":"J. Am. Stat. Assoc."},{"key":"488_CR30","doi-asserted-by":"publisher","first-page":"434","DOI":"10.1038\/s41586-020-2308-7","volume":"581","author":"KJ Karczewski","year":"2020","unstructured":"Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. 
Nature 581, 434\u2013443 (2020).","journal-title":"Nature"},{"key":"488_CR31","doi-asserted-by":"publisher","first-page":"e164","DOI":"10.1093\/nar\/gkq603","volume":"38","author":"K Wang","year":"2010","unstructured":"Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164\u2013e164 (2010).","journal-title":"Nucleic Acids Res."},{"key":"488_CR32","doi-asserted-by":"publisher","first-page":"D733","DOI":"10.1093\/nar\/gkv1189","volume":"44","author":"NA O\u2019Leary","year":"2016","unstructured":"O\u2019Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733\u2013D745 (2016).","journal-title":"Nucleic Acids Res."},{"key":"488_CR33","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1038\/nmeth0410-248","volume":"7","author":"IA Adzhubei","year":"2010","unstructured":"Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248\u2013249 (2010).","journal-title":"Nat. Methods"},{"key":"488_CR34","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1161\/CIRCRESAHA.117.312086","volume":"122","author":"P van der Harst","year":"2018","unstructured":"van der Harst, P. & Verweij, N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 122, 433\u2013443 (2018).","journal-title":"Circ. Res."},{"key":"488_CR35","doi-asserted-by":"publisher","first-page":"1385","DOI":"10.1038\/ng.3913","volume":"49","author":"CP Nelson","year":"2017","unstructured":"Nelson, C. P. et al. Association analyses based on false discovery rate implicate new loci for coronary artery disease. Nat. Genet. 49, 1385 (2017).","journal-title":"Nat. 
Genet."},{"key":"488_CR36","doi-asserted-by":"publisher","first-page":"1514","DOI":"10.1038\/s41588-018-0222-9","volume":"50","author":"D Klarin","year":"2018","unstructured":"Klarin, D. et al. Genetics of blood lipids among ~300,000 multi-ethnic participants of the million veteran program. Nat. Genet. 50, 1514\u20131523 (2018).","journal-title":"Nat. Genet."},{"key":"488_CR37","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1038\/s41586-019-1310-4","volume":"570","author":"GL Wojcik","year":"2019","unstructured":"Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514\u2013518 (2019).","journal-title":"Nature"},{"key":"488_CR38","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1016\/j.ajhg.2014.01.009","volume":"94","author":"GM Peloso","year":"2014","unstructured":"Peloso, G. M. et al. Association of low-frequency and rare coding-sequence variants with blood lipids and coronary heart disease in 56,000 whites and blacks. Am. J. Hum. Genet. 94, 223\u2013232 (2014).","journal-title":"Am. J. Hum. Genet."},{"key":"488_CR39","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1016\/j.ajhg.2014.01.010","volume":"94","author":"LA Lange","year":"2014","unstructured":"Lange, L. A. et al. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol. Am. J. Hum. Genet. 94, 233\u2013245 (2014).","journal-title":"Am. J. Hum. Genet."},{"key":"488_CR40","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1038\/nature13917","volume":"518","author":"R Do","year":"2015","unstructured":"Do, R. et al. Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction. Nature 518, 102\u2013106 (2015).","journal-title":"Nature"},{"key":"488_CR41","doi-asserted-by":"publisher","first-page":"F433","DOI":"10.1152\/ajprenal.00375.2015","volume":"310","author":"P Wahl","year":"2016","unstructured":"Wahl, P., Ducasa, G. 
M. & Fornoni, A. Systemic and renal lipids in kidney disease development and progression. Am. J. Physiol.-Renal Physiol. 310, F433\u2013F445 (2016).","journal-title":"Am. J. Physiol.-Renal Physiol."},{"key":"488_CR42","doi-asserted-by":"publisher","first-page":"1198","DOI":"10.1161\/CIRCRESAHA.118.314177","volume":"124","author":"SM Cheedipudi","year":"2019","unstructured":"Cheedipudi, S. M. et al. Genomic reorganization of lamin-associated domains in cardiac myocytes is associated with differential gene expression and DNA methylation in human dilated cardiomyopathy. Circ. Res. 124, 1198\u20131213 (2019).","journal-title":"Circ. Res."},{"key":"488_CR43","first-page":"e001603","volume":"10","author":"S Nishiuchi","year":"2017","unstructured":"Nishiuchi, S. et al. Gene-based risk stratification for cardiac disorders in LMNA mutation carriers. Circulation: Cardiovas. Genet. 10, e001603 (2017).","journal-title":"Circulation: Cardiovas. Genet."},{"key":"488_CR44","doi-asserted-by":"publisher","first-page":"458","DOI":"10.7326\/M18-2768","volume":"171","author":"G Peretto","year":"2019","unstructured":"Peretto, G. et al. Cardiac and neuromuscular features of patients with LMNA-related cardiomyopathy. Ann. Intern. Med. 171, 458\u2013463 (2019).","journal-title":"Ann. Intern. Med."},{"key":"488_CR45","doi-asserted-by":"publisher","first-page":"596","DOI":"10.1161\/CIRCRESAHA.116.308586","volume":"119","author":"T Matsuda","year":"2016","unstructured":"Matsuda, T. et al. NF2 activates Hippo signaling and promotes ischemia\/reperfusion injury in the heart. Circ. Res. 119, 596\u2013606 (2016).","journal-title":"Circ. Res."},{"key":"488_CR46","doi-asserted-by":"publisher","first-page":"2839","DOI":"10.1093\/ndt\/gfr795","volume":"27","author":"O-N Goek","year":"2012","unstructured":"Goek, O.-N. et al. Association of apolipoprotein A1 and B with kidney function and chronic kidney disease in two multiethnic population samples. Nephrol. Dial. Transplant. 
27, 2839\u20132847 (2012).","journal-title":"Nephrol. Dial. Transplant."},{"key":"488_CR47","doi-asserted-by":"publisher","first-page":"552","DOI":"10.1038\/ajh.2009.41","volume":"22","author":"N Franceschini","year":"2009","unstructured":"Franceschini, N. et al. The association of cell cycle checkpoint 2 variants and kidney function: findings of the family blood pressure program and the atherosclerosis risk in communities study. Am.J. Hypertens. 22, 552\u2013558 (2009).","journal-title":"Am.J. Hypertens."},{"key":"488_CR48","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1161\/CIRCRESAHA.117.312086","volume":"122","author":"P van der Harst","year":"2018","unstructured":"van der Harst, P. & Verweij, N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 122, 433\u2013443 (2018).","journal-title":"Circ. Res."},{"key":"488_CR49","doi-asserted-by":"publisher","first-page":"634","DOI":"10.1038\/s41588-020-0621-6","volume":"52","author":"W Zhou","year":"2020","unstructured":"Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634\u2013639 (2020).","journal-title":"Nat. Genet."},{"key":"488_CR50","doi-asserted-by":"publisher","first-page":"1235","DOI":"10.1093\/jamia\/ocaa079","volume":"27","author":"Y Ahuja","year":"2020","unstructured":"Ahuja, Y. et al. sureLDA: a multi-disease automated phenotyping method for the electronic health record. J. Am. Med. Inform. Assoc. 27, 1235\u20131243 (2020).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"488_CR51","doi-asserted-by":"publisher","first-page":"662","DOI":"10.1016\/j.ajhg.2014.03.016","volume":"94","author":"H Aschard","year":"2014","unstructured":"Aschard, H. et al. Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. Am. J. Hum. Genet. 
94, 662\u2013676 (2014).","journal-title":"Am. J. Hum. Genet."},{"key":"488_CR52","doi-asserted-by":"crossref","unstructured":"Liu, Z. & Lin, X. A geometric perspective on the power of principal component association tests in multiple phenotype studies. J. Am. Stat. Assoc. 114, 1\u201332 (2019).","DOI":"10.1080\/01621459.2018.1513363"},{"key":"488_CR53","doi-asserted-by":"crossref","unstructured":"Johnstone, I. M. On the distribution of the largest eigenvalue in principal components analysis. Ann. Stat. 29, 295\u2013327 (2001).","DOI":"10.1214\/aos\/1009210544"},{"key":"488_CR54","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1038\/44565","volume":"401","author":"DD Lee","year":"1999","unstructured":"Lee, D. D. & Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788\u2013791 (1999).","journal-title":"Nature"},{"key":"488_CR55","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.1038\/nbt.2749","volume":"31","author":"JC Denny","year":"2013","unstructured":"Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31, 1102 (2013).","journal-title":"Nat. Biotechnol."},{"key":"488_CR56","doi-asserted-by":"publisher","first-page":"576","DOI":"10.1016\/j.ajhg.2015.09.001","volume":"97","author":"BJ Vilhj\u00e1lmsson","year":"2015","unstructured":"Vilhj\u00e1lmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576\u2013592 (2015).","journal-title":"Am. J. Hum. Genet."},{"key":"488_CR57","doi-asserted-by":"publisher","first-page":"723","DOI":"10.1038\/s41581-018-0067-6","volume":"14","author":"L Liu","year":"2018","unstructured":"Liu, L. & Kiryluk, K. Genome-wide polygenic risk predictors for kidney disease. Nat. Rev. Nephrol. 14, 723\u2013724 (2018).","journal-title":"Nat. Rev. 
Nephrol."},{"key":"488_CR58","doi-asserted-by":"publisher","first-page":"587","DOI":"10.1016\/j.cell.2019.03.028","volume":"177","author":"AVA Khera","year":"2019","unstructured":"Khera, A. V. A. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587\u2013596 (2019).","journal-title":"Cell"},{"key":"488_CR59","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1038\/ng.806","volume":"43","author":"MA DePristo","year":"2011","unstructured":"DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491 (2011).","journal-title":"Nat. Genet."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00488-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00488-3","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00488-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,3]],"date-time":"2022-12-03T18:56:56Z","timestamp":1670093816000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00488-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,23]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["488"],"URL":"https:\/\/doi.org\/10.1038\/s41746-021-00488-3","relation":{},"ISSN":["2398-6352"],"issn-type":[{"type":"electronic","value":"2398-6352"}],"subject":[],"published":{"date-parts":[[2021,7,23]]},"assertion":[{"value":"9 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 May 
2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"116"}}