{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T20:02:48Z","timestamp":1773345768834,"version":"3.50.1"},"reference-count":141,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2018,9,14]],"date-time":"2018-09-14T00:00:00Z","timestamp":1536883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100002661","name":"Fonds De La Recherche Scientifique - FNRS","doi-asserted-by":"publisher","award":["2.4609.11"],"award-info":[{"award-number":["2.4609.11"]}],"id":[{"id":"10.13039\/501100002661","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002661","name":"Fonds De La Recherche Scientifique - FNRS","doi-asserted-by":"publisher","award":["T.0180.13"],"award-info":[{"award-number":["T.0180.13"]}],"id":[{"id":"10.13039\/501100002661","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Walloon Excellence in Lifesciences and BIOtechnology"},{"DOI":"10.13039\/501100001659","name":"German Research Foundation","doi-asserted-by":"publisher","award":["DFG, FOR488"],"award-info":[{"award-number":["DFG, FOR488"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"German Research Foundation","doi-asserted-by":"publisher","award":["KO2250\/5-1"],"award-info":[{"award-number":["KO2250\/5-1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"German Research Foundation","doi-asserted-by":"publisher","award":["KFO303"],"award-info":[{"award-number":["KFO303"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"German Research 
Foundation","doi-asserted-by":"publisher","award":["KO2250\/7-1"],"award-info":[{"award-number":["KO2250\/7-1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"Federal Ministry of Education and Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100010564","name":"German Center for Lung Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100010564","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100010447","name":"German Centre for Cardiovascular Research","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100010447","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,11,27]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Principal components (PCs) are widely used in statistics and refer to a relatively small number of uncorrelated variables derived from an initial pool of variables, while explaining as much of the total variance as possible. Also in statistical genetics, principal component analysis (PCA) is a popular technique. To achieve optimal results, a thorough understanding about the different implementations of PCA is required and their impact on study results, compared to alternative approaches. In this review, we focus on the possibilities, limitations and role of PCs in ancestry prediction, genome-wide association studies, rare variants analyses, imputation strategies, meta-analysis and epistasis detection. 
We also describe several variations of classic PCA that deserve increased attention in statistical genetics applications.<\/jats:p>","DOI":"10.1093\/bib\/bby081","type":"journal-article","created":{"date-parts":[[2018,8,31]],"date-time":"2018-08-31T11:26:25Z","timestamp":1535714785000},"page":"2200-2216","source":"Crossref","is-referenced-by-count":45,"title":["Principals about principal components in statistical genetics"],"prefix":"10.1093","volume":"20","author":[{"given":"Fentaw","family":"Abegaz","sequence":"first","affiliation":[{"name":"University of Li\u00e8ge (Belgium)"}]},{"given":"Kridsadakorn","family":"Chaichoompu","sequence":"additional","affiliation":[{"name":"Max Planck Institute of Psychiatry (Germany)"}]},{"given":"Emmanuelle","family":"G\u00e9nin","sequence":"additional","affiliation":[{"name":"Inserm in Brest (France)"}]},{"given":"David W","family":"Fardo","sequence":"additional","affiliation":[{"name":"University of Kentucky (USA)"}]},{"given":"Inke R","family":"K\u00f6nig","sequence":"additional","affiliation":[{"name":"Universit\u00e4t zu L\u00fcbeck, Germany"}]},{"given":"Jestinah M","family":"Mahachie John","sequence":"additional","affiliation":[{"name":"University of Liege, Belgium"}]},{"given":"Kristel","family":"Van Steen","sequence":"additional","affiliation":[{"name":"University of Li\u00e8ge working on Systems Genetics"}]}],"member":"286","published-online":{"date-parts":[[2018,9,14]]},"reference":[{"key":"2020011102353110400_ref1","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1002\/wics.101","article-title":"Principal component analysis","volume":"2","author":"Abdi","year":"2010","journal-title":"Wiley Interdiscip Rev Comput Stat"},{"key":"2020011102353110400_ref2","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1037\/h0070888","article-title":"Analysis of a complex of statistical variables into principal components","volume":"24","author":"Hotelling","year":"1933","journal-title":"J Educ 
Psychol"},{"key":"2020011102353110400_ref3","doi-asserted-by":"crossref","first-page":"300","DOI":"10.2307\/2348005","article-title":"A note on the use of principal components in regression","volume":"3","author":"Jolliffe","year":"1982","journal-title":"Appl Stat"},{"key":"2020011102353110400_ref4","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/978-1-4757-1904-8_7","article-title":"Principal component analysis and factor analysis","author":"Jolliffe","year":"1986","journal-title":"Princ Compon Anal"},{"key":"2020011102353110400_ref5","doi-asserted-by":"crossref","first-page":"289","DOI":"10.2307\/1267793","article-title":"Collinearity and optimal restrictions on regression parameters for estimating responses","volume":"23","author":"Park","year":"1981","journal-title":"Technometrics"},{"key":"2020011102353110400_ref6","first-page":"467","article-title":"Quantitative monitoring of gene expression patterns with a complementary DNA microarray","author":"Schena","year":"1995"},{"key":"2020011102353110400_ref7","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1016\/j.gpb.2015.01.003","article-title":"Web resources for model organism studies","volume":"13","author":"Tang","year":"2015","journal-title":"Genom Proteom Bioinform"},{"key":"2020011102353110400_ref8","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.margen.2016.04.012","article-title":"Next-generation biology: sequencing and data analysis approaches for non-model organisms","volume":"30","author":"Fonseca","year":"2016","journal-title":"Mar Genomics"},{"key":"2020011102353110400_ref9","doi-asserted-by":"crossref","first-page":"646","DOI":"10.1038\/ng.139","article-title":"Interpreting principal component analyses of spatial population genetic variation","volume":"40","author":"Novembre","year":"2008","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref10","doi-asserted-by":"crossref","first-page":"904","DOI":"10.1038\/ng1847","article-title":"Principal components 
analysis corrects for stratification in genome-wide association studies","volume":"38","author":"Price","year":"2006","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref11","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1126\/science.8430313","article-title":"Demic expansions and human evolution","volume":"259","author":"Cavalli-Sforza","year":"1993","journal-title":"Science"},{"key":"2020011102353110400_ref12","doi-asserted-by":"crossref","first-page":"1241","DOI":"10.1016\/j.cub.2008.07.049","article-title":"Correlation between genetic and geographic structure in Europe","volume":"18","author":"Lao","year":"2008","journal-title":"Curr Biol"},{"key":"2020011102353110400_ref13","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/j.ajhg.2015.11.022","article-title":"Model-free estimation of recent genetic relatedness","volume":"98","author":"Conomos","year":"2016","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref14","doi-asserted-by":"crossref","first-page":"742","DOI":"10.1111\/j.1469-1809.2011.00681.x","article-title":"A novel method to detect gene\u2013gene interactions in structured populations: MDR-SP","volume":"75","author":"Niu","year":"2011","journal-title":"Ann Hum Genet"},{"key":"2020011102353110400_ref15","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1038\/ng0508-491","article-title":"Principal component analysis of genetic data","volume":"40","author":"Reich","year":"2008","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref16","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1146\/annurev-genom-082410-101510","article-title":"Population identification using genetic data","volume":"13","author":"Lawson","year":"2012","journal-title":"Annu Rev Genomics Hum Genet"},{"key":"2020011102353110400_ref17","doi-asserted-by":"crossref","first-page":"1164","DOI":"10.1093\/bioinformatics\/btm069","article-title":"pcaMethods\u2014a bioconductor package providing PCA methods for incomplete 
data","volume":"23","author":"Stacklies","year":"2007","journal-title":"Bioinformatics"},{"key":"2020011102353110400_ref18","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1186\/1471-2105-14-132","article-title":"Robust methods for population stratification in genome wide association studies","volume":"14","author":"Liu","year":"2013","journal-title":"BMC Bioinform"},{"key":"2020011102353110400_ref19","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1080\/10618600.2014.891461","article-title":"Integrating data transformation in principal components analysis","volume":"24","author":"Maadooliat","year":"2015","journal-title":"J Comput Graph Stat"},{"key":"2020011102353110400_ref20","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"LIII. On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"1901","journal-title":"Lond Edinb Dublin Philos Mag J Sci"},{"key":"2020011102353110400_ref21","volume-title":"Principal Component Analysis","author":"Jolliffe","year":"2002"},{"key":"2020011102353110400_ref22","doi-asserted-by":"crossref","first-page":"7719","DOI":"10.1073\/pnas.94.15.7719","article-title":"Genes, peoples, and languages","volume":"94","author":"Cavalli-Sforza","year":"1997","journal-title":"Proc Natl Acad Sci"},{"key":"2020011102353110400_ref23","first-page":"505","article-title":"Point: population stratification: a problem for case-control studies of candidate-gene associations?","volume":"11","author":"Thomas","year":"2002","journal-title":"Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol"},{"key":"2020011102353110400_ref24","first-page":"513","article-title":"Counterpoint: bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer","volume":"11","author":"Wacholder","year":"2002","journal-title":"Cancer Epidemiol 
Prev Biomark"},{"key":"2020011102353110400_ref25","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1086\/432962","article-title":"Recent developments in genomewide association scans: a workshop summary and review","volume":"77","author":"Thomas","year":"2005","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref26","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1093\/biomet\/70.1.41","article-title":"The central role of the propensity score in observational studies for causal effects","volume":"70","author":"Rosenbaum","year":"1983","journal-title":"Biometrika"},{"key":"2020011102353110400_ref27","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1002\/gepi.20558","article-title":"Propensity score-based nonparametric test revealing genetic variants underlying bipolar disorder","volume":"35","author":"Jiang","year":"2011","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref28","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1002\/gepi.20419","article-title":"A propensity score approach to correction for bias due to population stratification using genetic and non-genetic factors","volume":"33","author":"Zhao","year":"2009","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref29","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1086\/302959","article-title":"Association mapping in structured populations","volume":"67","author":"Pritchard","year":"2000","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref30","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1111\/j.0006-341X.1999.00997.x","article-title":"Genomic control for association studies","volume":"55","author":"Devlin","year":"1999","journal-title":"Biometrics"},{"key":"2020011102353110400_ref31","doi-asserted-by":"crossref","first-page":"1486","DOI":"10.1002\/jrs.2646","article-title":"Raman spectroscopic discrimination of pigments and tempera paint model samples by principal component analysis on 
first-derivative spectra","volume":"41","author":"Navas","year":"2010","journal-title":"J Raman Spectrosc"},{"key":"2020011102353110400_ref32","doi-asserted-by":"crossref","first-page":"e190","DOI":"10.1371\/journal.pgen.0020190","article-title":"Population structure and eigenanalysis","volume":"2","author":"Patterson","year":"2006","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref33","doi-asserted-by":"crossref","first-page":"e160","DOI":"10.1371\/journal.pgen.0030160","article-title":"PCA-Correlated SNPs for structure identification in worldwide human populations","volume":"3","author":"Paschou","year":"2007","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref34","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1038\/ejhg.2008.210","article-title":"Investigation of the fine structure of European populations with applications to disease association studies","volume":"16","author":"Heath","year":"2008","journal-title":"Eur J Hum Genet EJHG"},{"key":"2020011102353110400_ref35","doi-asserted-by":"crossref","first-page":"e1002824","DOI":"10.1371\/journal.pgen.1002824","article-title":"Genome-wide association analysis in asthma subjects identifies SPATS2L as a novel bronchodilator response gene","volume":"8","author":"Himes","year":"2012","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref36","doi-asserted-by":"crossref","first-page":"1286","DOI":"10.1164\/rccm.201111-2061OC","article-title":"Genome-wide association identifies the T gene as a novel asthma pharmacogenetic locus","volume":"185","author":"Tantisira","year":"2012","journal-title":"Am J Respir Crit Care Med"},{"key":"2020011102353110400_ref37","doi-asserted-by":"crossref","first-page":"e12510","DOI":"10.1371\/journal.pone.0012510","article-title":"Theoretical formulation of principal components analysis to detect and correct for population stratification","volume":"5","author":"Ma","year":"2010","journal-title":"PLoS 
One"},{"key":"2020011102353110400_ref38","doi-asserted-by":"crossref","first-page":"e1001117","DOI":"10.1371\/journal.pgen.1001117","article-title":"Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis","volume":"6","author":"Engelhardt","year":"2010","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref39","doi-asserted-by":"crossref","first-page":"1421","DOI":"10.1534\/genetics.114.171314","article-title":"A novel and fast approach for population structure inference using kernel-PCA and optimization","volume":"198","author":"Popescu","year":"2014","journal-title":"Genetics"},{"key":"2020011102353110400_ref40","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nature22969","article-title":"Fine-mapping inflammatory bowel disease loci to single-variant resolution","volume":"547","author":"Huang","year":"2017","journal-title":"Nature"},{"key":"2020011102353110400_ref41","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1016\/j.ajhg.2016.02.012","article-title":"Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models","volume":"98","author":"Chen","year":"2016","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref42","doi-asserted-by":"crossref","first-page":"1709","DOI":"10.1534\/genetics.107.080101","article-title":"Efficient control of population structure in model organism association mapping","volume":"178","author":"Kang","year":"2008","journal-title":"Genetics"},{"key":"2020011102353110400_ref43","doi-asserted-by":"crossref","first-page":"1961","DOI":"10.1093\/genetics\/155.4.1961","article-title":"Estimating quantitative genetic parameters using sibships reconstructed from marker data","volume":"155","author":"Thomas","year":"2000","journal-title":"Genetics"},{"key":"2020011102353110400_ref44","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1093\/genetics\/160.3.1203","article-title":"An 
estimator for pairwise relatedness using molecular markers","volume":"160","author":"Wang","year":"2002","journal-title":"Genetics"},{"key":"2020011102353110400_ref45","doi-asserted-by":"crossref","first-page":"e51","DOI":"10.1371\/journal.pgen.0030051","article-title":"Generalized analysis of molecular variance","volume":"3","author":"Nievergelt","year":"2007","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref46","doi-asserted-by":"crossref","first-page":"e4","DOI":"10.1371\/journal.pgen.0030004","article-title":"An arabidopsis example of association mapping in structured samples","volume":"3","author":"Zhao","year":"2007","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref47","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1038\/ng.548","article-title":"Variance component model to account for sample structure in genome-wide association studies","volume":"42","author":"Kang","year":"2010","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref48","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1038\/ng.2310","article-title":"Genome-wide efficient mixed model analysis for association studies","volume":"44","author":"Zhou","year":"2012","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref49","doi-asserted-by":"crossref","first-page":"833","DOI":"10.1038\/nmeth.1681","article-title":"FaST linear mixed models for genome-wide association studies","volume":"8","author":"Lippert","year":"2011","journal-title":"Nat Methods"},{"key":"2020011102353110400_ref50","doi-asserted-by":"crossref","first-page":"1294","DOI":"10.1093\/bioinformatics\/btm108","article-title":"GenABEL: an R library for genome-wide association analysis","volume":"23","author":"Aulchenko","year":"2007","journal-title":"Bioinformatics"},{"key":"2020011102353110400_ref51","doi-asserted-by":"crossref","first-page":"e75707","DOI":"10.1371\/journal.pone.0075707","article-title":"Correcting for population structure and kinship using the linear mixed model: 
theory and extensions","volume":"8","author":"Hoffman","year":"2013","journal-title":"PLoS One"},{"key":"2020011102353110400_ref52","doi-asserted-by":"crossref","first-page":"1526","DOI":"10.1093\/bioinformatics\/btt177","article-title":"A powerful and efficient set test for genetic markers that handles confounders","volume":"29","author":"Listgarten","year":"2013","journal-title":"Bioinformatics"},{"key":"2020011102353110400_ref53","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1038\/ng.2876","article-title":"Advantages and pitfalls in the application of mixed model association methods","volume":"46","author":"Yang","year":"2014","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref54","doi-asserted-by":"crossref","first-page":"1045","DOI":"10.1534\/genetics.114.164285","article-title":"Improving the power of GWAS and avoiding confounding from population stratification with PC-Select","volume":"197","author":"Tucker","year":"2014","journal-title":"Genetics"},{"key":"2020011102353110400_ref55","first-page":"73","article-title":"Recent advances in clustering: a brief survey","volume":"1","author":"Kotsiantis","year":"2004","journal-title":"WSEAS Trans Inf Sci Appl"},{"key":"2020011102353110400_ref56","doi-asserted-by":"crossref","first-page":"1579","DOI":"10.1214\/10-AOAS327","article-title":"Sparse logistic principal components analysis for binary data","volume":"4","author":"Lee","year":"2010","journal-title":"Ann Appl Stat"},{"key":"2020011102353110400_ref57","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1186\/1471-2156-11-108","article-title":"Clustering by genetic ancestry using genome-wide SNP data","volume":"11","author":"Solovieff","year":"2010","journal-title":"BMC Genet"},{"key":"2020011102353110400_ref58","first-page":"159","article-title":"Detecting stable clusters using principal component analysis","volume":"224","author":"Ben-Hur","year":"2003","journal-title":"Methods Mol Biol Clifton 
NJ"},{"key":"2020011102353110400_ref59","doi-asserted-by":"crossref","first-page":"e77720","DOI":"10.1371\/journal.pone.0077720","article-title":"Molecular reclassification of Crohn\u2019s disease: a cautionary note on population stratification","volume":"8","author":"Maus","year":"2013","journal-title":"PLoS One"},{"key":"2020011102353110400_ref60","doi-asserted-by":"crossref","DOI":"10.1002\/9781118391686","volume-title":"Methods of Multivariate Analysis","author":"Rencher","year":"2012"},{"key":"2020011102353110400_ref61","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-9650-3","volume-title":"An Introduction to Applied Multivariate Analysis with R","author":"Everitt","year":"2011"},{"key":"2020011102353110400_ref62","first-page":"160","article-title":"Discarding variables in a principal component analysis. I: artificial data","volume":"21","author":"Jolliffe","year":"1972","journal-title":"J R Stat Soc Ser C Appl Stat"},{"key":"2020011102353110400_ref63","doi-asserted-by":"crossref","first-page":"e93766","DOI":"10.1371\/journal.pone.0093766","article-title":"Fast principal component analysis of large-scale genome-wide data","volume":"9","author":"Abraham","year":"2014","journal-title":"PLoS One"},{"key":"2020011102353110400_ref64","doi-asserted-by":"crossref","first-page":"8140","DOI":"10.1038\/srep08140","article-title":"Highlighting nonlinear patterns in population genetics datasets","volume":"5","author":"Alanis-Lobato","year":"2015","journal-title":"Sci Rep"},{"key":"2020011102353110400_ref65","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1038\/nrg2813","article-title":"New approaches to population stratification in genome-wide association studies","volume":"11","author":"Price","year":"2010","journal-title":"Nat Rev Genet"},{"key":"2020011102353110400_ref66","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1016\/j.ajhg.2015.12.022","article-title":"Fast principal-component analysis reveals convergent evolution of ADH1B in 
Europe and East Asia","volume":"98","author":"Galinsky","year":"2016","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref67","doi-asserted-by":"crossref","first-page":"e1002822","DOI":"10.1371\/journal.pcbi.1002822","article-title":"Chapter 11: genome-wide association studies","volume":"8","author":"Bush","year":"2012","journal-title":"PLoS Comput Biol"},{"key":"2020011102353110400_ref68","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1159\/000288706","article-title":"Quantification of population structure using correlated SNPs by shrinkage principal components","volume":"70","author":"Zou","year":"2010","journal-title":"Hum Hered"},{"key":"2020011102353110400_ref69","doi-asserted-by":"crossref","first-page":"e1003993","DOI":"10.1371\/journal.pgen.1003993","article-title":"Quantifying missing heritability at known GWAS loci","volume":"9","author":"Gusev","year":"2013","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref70"},{"key":"2020011102353110400_ref71","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1159\/000101422","article-title":"An R package for analysis of whole-genome association studies","volume":"64","author":"Clayton","year":"2007","journal-title":"Hum Hered"},{"key":"2020011102353110400_ref72","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1038\/nature11582","article-title":"Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease","volume":"491","author":"Jostins","year":"2012","journal-title":"Nature"},{"key":"2020011102353110400_ref73","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/BF01441146","article-title":"A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and 
paternity","volume":"96","author":"Balding","year":"1995","journal-title":"Genetica"},{"key":"2020011102353110400_ref74","doi-asserted-by":"crossref","first-page":"418","DOI":"10.1111\/j.1469-1809.2010.00639.x","article-title":"A comparison of association methods correcting for population stratification in case\u2013control studies","volume":"75","author":"Wu","year":"2011","journal-title":"Ann Hum Genet"},{"key":"2020011102353110400_ref75","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1111\/j.1469-1809.1949.tb02451.x","article-title":"The genetical structure of populations","volume":"15","author":"Wright","year":"1951","journal-title":"Ann Eugen"},{"key":"2020011102353110400_ref76","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1207\/s15327906mbr0102_10","article-title":"The Scree Test for the number of factors","volume":"1","author":"Cattell","year":"1966","journal-title":"Multivar Behav Res"},{"key":"2020011102353110400_ref77","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1177\/001316446002000116","article-title":"The application of electronic computers to factor analysis","volume":"20","author":"Kaiser","year":"1960","journal-title":"Educ Psychol Meas"},{"key":"2020011102353110400_ref78","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1007\/BF02293557","article-title":"Determining the number of components from the matrix of partial correlations","volume":"41","author":"Velicer","year":"1976","journal-title":"Psychometrika"},{"key":"2020011102353110400_ref79","doi-asserted-by":"crossref","first-page":"2228","DOI":"10.1016\/j.csda.2007.07.015","article-title":"On the number of principal components: a test of dimensionality based on measurements of similarity between matrices","volume":"52","author":"Dray","year":"2008","journal-title":"Comput Stat Data Anal"},{"key":"2020011102353110400_ref80","doi-asserted-by":"crossref","first-page":"534","DOI":"10.1006\/nimg.1998.0425","article-title":"Generalizable patterns in 
neuroimaging: how many principal components?","volume":"9","author":"Hansen","year":"1999","journal-title":"NeuroImage"},{"key":"2020011102353110400_ref81","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1002\/gepi.20396","article-title":"Genetic background comparison using distance-based regression, with applications in population stratification evaluation and adjustment","volume":"33","author":"Li","year":"2009","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref82","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1186\/1471-2105-11-296","article-title":"Super-sparse principal component analyses for high-throughput genomic data","volume":"11","author":"Lee","year":"2010","journal-title":"BMC Bioinform"},{"key":"2020011102353110400_ref83","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1186\/1471-2156-12-64","article-title":"Choice of population structure informative principal components for adjustment in a case-control study","volume":"12","author":"Peloso","year":"2011","journal-title":"BMC Genet"},{"key":"2020011102353110400_ref84","doi-asserted-by":"crossref","first-page":"S108","DOI":"10.1371\/journal.pone.0002551","article-title":"Population substructure and control selection in genome-wide association studies","volume":"3","author":"Yu","year":"2008","journal-title":"PLoS One"},{"key":"2020011102353110400_ref85","doi-asserted-by":"crossref","first-page":"S108","DOI":"10.1186\/1753-6561-3-s7-s108","article-title":"Principal-component-based population structure adjustment in the North American Rheumatoid Arthritis Consortium data: impact of single-nucleotide polymorphism set and analysis method","volume":"3","author":"Peloso","year":"2009","journal-title":"BMC Proc"},{"key":"2020011102353110400_ref86","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1002\/gepi.20296","article-title":"Improved correction for population stratification in genome-wide association studies by identifying hidden population 
structures","volume":"32","author":"Li","year":"2008","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref87","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1198\/106186006X113430","article-title":"Sparse principal component analysis","volume":"15","author":"Zou","year":"2006","journal-title":"J Comput Graph Stat"},{"key":"2020011102353110400_ref88","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1080\/757584395","article-title":"Rotation of principal components: choice of normalization constraints","volume":"22","author":"Jolliffe","year":"1995","journal-title":"J Appl Stat"},{"key":"2020011102353110400_ref89","doi-asserted-by":"crossref","first-page":"e1000686","DOI":"10.1371\/journal.pgen.1000686","article-title":"A genealogical interpretation of principal components analysis","volume":"5","author":"McVean","year":"2009","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref90","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1080\/01621459.1985.10478181","article-title":"Projection-pursuit approach to robust dispersion matrices and principal components: primary theory and Monte Carlo","volume":"80","author":"Li","year":"1985","journal-title":"J Am Stat Assoc"},{"key":"2020011102353110400_ref91","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1016\/j.jmva.2004.08.002","article-title":"High breakdown estimators for principal components: the projection-pursuit approach revisited","volume":"95","author":"Croux","year":"2005","journal-title":"J Multivar Anal"},{"key":"2020011102353110400_ref92","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1016\/j.chemolab.2007.01.004","article-title":"Algorithms for Projection\u2013Pursuit robust principal component analysis","volume":"87","author":"Croux","year":"2007","journal-title":"Chemom Intell Lab Syst"},{"key":"2020011102353110400_ref93","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1016\/j.jmva.2007.06.007","article-title":"Sparse 
principal component analysis via regularized low rank matrix approximation","volume":"99","author":"Shen","year":"2008","journal-title":"J Multivar Anal"},{"key":"2020011102353110400_ref94","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1214\/09-AOAS281","article-title":"A spectral graph approach to discovering genetic ancestry","volume":"4","author":"Lee","year":"2010","journal-title":"Ann Appl Stat"},{"key":"2020011102353110400_ref95","doi-asserted-by":"crossref","first-page":"1373","DOI":"10.1162\/089976603321780317","article-title":"Laplacian Eigenmaps for dimensionality reduction and data representation","volume":"15","author":"Belkin","year":"2003","journal-title":"Neural Comput"},{"key":"2020011102353110400_ref96","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.ajhg.2008.08.005","article-title":"The population reference sample, POPRES: a resource for population, disease, and pharmacological genetics research","volume":"83","author":"Nelson","year":"2008","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref97","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1037\/1082-989X.12.3.336","article-title":"Nonlinear principal components analysis: introduction and application","volume":"12","author":"Linting","year":"2007","journal-title":"Psychol Methods"},{"key":"2020011102353110400_ref98","article-title":"Dimensionality reduction for binary data through the projection of natural parameters","author":"Landgraf","year":"2015","journal-title":"ArXiv151006112 Stat"},{"key":"2020011102353110400_ref99","first-page":"617","article-title":"A generalization of principal component analysis to the exponential family","author":"Collins","year":"2001","journal-title":"Proc 14th Int Conf Neural Inf Process Syst Nat Synth"},{"key":"2020011102353110400_ref100","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.csda.2004.07.010","article-title":"Principal component analysis of binary data by iterated singular value 
decomposition","volume":"50","author":"Leeuw","year":"2006","journal-title":"Comput Stat Data Anal"},{"key":"2020011102353110400_ref101","first-page":"546431","article-title":"A generalized linear model for principal component analysis of binary data","author":"Schein","year":"2003","journal-title":"Proc 9th Int Workshop Artif Intell Stat"},{"key":"2020011102353110400_ref102","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1016\/j.patcog.2016.05.024","article-title":"Sparse exponential family principal component analysis","volume":"60","author":"Lu","year":"2016","journal-title":"Pattern Recognit"},{"key":"2020011102353110400_ref103","article-title":"Principal component analysis of binary genomics data","author":"Song","year":"2017","journal-title":"Brief Bioinform"},{"key":"2020011102353110400_ref104","doi-asserted-by":"crossref","DOI":"10.1201\/b17077","volume-title":"Introduction to Multivariate Analysis: Linear and Nonlinear Modeling","author":"Konishi","year":"2014"},{"key":"2020011102353110400_ref105","volume-title":"Fourth Edition","author":"Theodoridis","year":"2008"},{"key":"2020011102353110400_ref106","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1111\/1467-9868.00196","article-title":"Probabilistic principal component analysis","volume":"61","author":"Tipping","year":"1999","journal-title":"J R Stat Soc Ser B Stat Methodol"},{"key":"2020011102353110400_ref107","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1002\/cem.759","article-title":"Bayesian principal component analysis","volume":"16","author":"Nounou","year":"2002","journal-title":"J Chemom"},{"key":"2020011102353110400_ref108","first-page":"1089","article-title":"Bayesian Exponential Family PCA","volume":"21","author":"Mohamed","year":"2009","journal-title":"Adv Neural Inf Process Syst"},{"key":"2020011102353110400_ref109","first-page":"18","article-title":"Classification and regression by RandomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R 
News"},{"key":"2020011102353110400_ref110","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1534\/g3.112.005363","article-title":"Imputation of unordered markers and the impact on genomic selection accuracy","volume":"3","author":"Rutkoski","year":"2013","journal-title":"G3 GenesGenomesGenetics"},{"key":"2020011102353110400_ref111","volume-title":"ArXiv150804409","author":"Wright","year":"2015"},{"key":"2020011102353110400_ref112","doi-asserted-by":"crossref","first-page":"891","DOI":"10.1534\/g3.114.010942","article-title":"Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment","volume":"4","author":"Fu","year":"2014","journal-title":"G3 GenesGenomesGenetics"},{"key":"2020011102353110400_ref113","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1016\/j.ajhg.2015.04.018","article-title":"Improved ancestry estimation for both genotyping and sequencing data using Projection Procrustes Analysis and Genotype Imputation","volume":"96","author":"Wang","year":"2015","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref114","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1038\/ng.1074","article-title":"Differential confounding of rare and common variants in spatially structured populations","volume":"44","author":"Mathieson","year":"2012","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref115","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1038\/ng.2924","article-title":"Ancestry estimation and control of population stratification for sequence-based association studies","volume":"46","author":"Wang","year":"2014","journal-title":"Nat Genet"},{"key":"2020011102353110400_ref116","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1534\/genetics.113.154740","article-title":"Quantifying population genetic differentiation from next-generation sequencing 
data","volume":"195","author":"Fumagalli","year":"2013","journal-title":"Genetics"},{"key":"2020011102353110400_ref117","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1002\/gepi.21896","article-title":"Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness","volume":"39","author":"Conomos","year":"2015","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref118","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1086\/519497","article-title":"Case-control association testing with related individuals: a more powerful quasi-likelihood score test","volume":"81","author":"Thornton","year":"2007","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref119","doi-asserted-by":"crossref","first-page":"668","DOI":"10.1002\/gepi.20418","article-title":"Case-control association testing in the presence of unknown relationships","volume":"33","author":"Choi","year":"2009","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref120","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1016\/j.ajhg.2010.01.001","article-title":"ROADTRIPS: case-control association testing with partially or completely unknown population and pedigree structure","volume":"86","author":"Thornton","year":"2010","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref121","doi-asserted-by":"crossref","first-page":"798","DOI":"10.1093\/bioinformatics\/btq025","article-title":"Correcting population stratification in genetic association studies using a phylogenetic approach","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics"},{"key":"2020011102353110400_ref122","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1016\/j.ajhg.2007.10.009","article-title":"A unified association analysis approach for family and unrelated samples correcting for stratification","volume":"82","author":"Zhu","year":"2008","journal-title":"Am J Hum 
Genet"},{"key":"2020011102353110400_ref123","doi-asserted-by":"crossref","DOI":"10.1002\/9783527633654","volume-title":"A Statistical Approach to Genetic Epidemiology: Concepts and Applications, with an e-Learning Platform","author":"Ziegler","year":"2010"},{"key":"2020011102353110400_ref124","doi-asserted-by":"crossref","first-page":"2190","DOI":"10.1093\/bioinformatics\/btq340","article-title":"METAL: fast and efficient meta-analysis of genomewide association scans","volume":"26","author":"Willer","year":"2010","journal-title":"Bioinform Oxf Engl"},{"key":"2020011102353110400_ref125","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1186\/1471-2105-11-288","article-title":"GWAMA: software for genome-wide association meta-analysis","volume":"11","author":"M\u00e4gi","year":"2010","journal-title":"BMC Bioinform"},{"key":"2020011102353110400_ref126","doi-asserted-by":"crossref","first-page":"e1002491","DOI":"10.1371\/journal.pgen.1002491","article-title":"A meta-analysis and genome-wide association study of platelet count and mean platelet volume in african americans","volume":"8","author":"Qayyum","year":"2012","journal-title":"PLoS Genet"},{"key":"2020011102353110400_ref127","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1007\/s00122-011-1697-2","article-title":"Genome-wide association mapping of agronomic and morphologic traits in highly structured populations of barley cultivars","volume":"124","author":"Wang","year":"2012","journal-title":"Theor Appl Genet"},{"key":"2020011102353110400_ref128","doi-asserted-by":"crossref","first-page":"S92","DOI":"10.1002\/gepi.20657","article-title":"Regression and data mining methods for analyses of multiple rare variants in the Genetic Analysis Workshop 17 Mini-Exome Data","volume":"35","author":"Bailey-Wilson","year":"2011","journal-title":"Genet Epidemiol"},{"key":"2020011102353110400_ref129","doi-asserted-by":"crossref","first-page":"3324","DOI":"10.1093\/hmg\/ddl408","article-title":"Over 
representation of rare variants in a specific ethnic group may confuse interpretation of association analyses","volume":"15","author":"Keen-Kim","year":"2006","journal-title":"Hum Mol Genet"},{"key":"2020011102353110400_ref130","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1101\/gr.4346306","article-title":"Logistic regression protects against population structure in genetic association studies","volume":"16","author":"Setakis","year":"2006","journal-title":"Genome Res"},{"key":"2020011102353110400_ref131","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1214\/09-STS307","article-title":"Population structure and cryptic relatedness in genetic association studies","volume":"24","author":"Astle","year":"2009","journal-title":"Stat Sci"},{"key":"2020011102353110400_ref132","doi-asserted-by":"crossref","first-page":"e28845","DOI":"10.1371\/journal.pone.0028845","article-title":"Accounting for population stratification in practice: a comparison of the main strategies dedicated to genome-wide association studies","volume":"6","author":"Bouaziz","year":"2011","journal-title":"PLoS One"},{"key":"2020011102353110400_ref133","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/hdy.2010.91","article-title":"Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses","volume":"106","author":"Sillanp\u00e4\u00e4","year":"2011","journal-title":"Heredity"},{"key":"2020011102353110400_ref134","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1002\/bimj.200410214","article-title":"Testing for association in the presence of population stratification: a simulation study comparing the S-TDT, STRAT and the GC","volume":"48","author":"Wawro","year":"2006","journal-title":"Biom J Biom Z"},{"key":"2020011102353110400_ref135","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1097\/EDE.0b013e3182137e03","article-title":"Population stratification bias: more 
widespread than previously thought","volume":"22","author":"Kraft","year":"2011","journal-title":"Epidemiol Camb Mass"},{"key":"2020011102353110400_ref136","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1016\/j.ajhg.2010.01.026","article-title":"Using principal components of genetic variation for robust and powerful detection of gene-gene interactions in case-control and case-only studies","volume":"86","author":"Bhattacharjee","year":"2010","journal-title":"Am J Hum Genet"},{"key":"2020011102353110400_ref137","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1093\/ije\/dys183","article-title":"Correction for population stratification in random forest analysis","volume":"41","author":"Zhao","year":"2012","journal-title":"Int J Epidemiol"},{"key":"2020011102353110400_ref138","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/bib\/bbr012","article-title":"Travelling the world of gene\u2013gene interactions","volume":"13","author":"Van Steen","year":"2012","journal-title":"Brief Bioinform"},
{"key":"2020011102353110400_ref139","first-page":"24","volume-title":"MB-MDR: Model-Based Multifactor Dimensionality Reduction for Detecting Interactions in High-Dimensional Genomic Data Tech. Rep","author":"Calle","year":"2008"},{"key":"2020011102353110400_ref140","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1111\/j.1469-1809.2010.00604.x","article-title":"Model-based multifactor dimensionality reduction for detecting epistasis in case-control data in the presence of noise","volume":"75","author":"Cattaert","year":"2011","journal-title":"Ann Hum Genet"},{"key":"2020011102353110400_ref141","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1093\/bib\/bbv038","article-title":"A roadmap to multifactor dimensionality reduction methods","volume":"17","author":"Gola","year":"2016","journal-title":"Brief Bioinform"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/20\/6\/2200\/31789463\/bby081.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/20\/6\/2200\/31789463\/bby081.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,31]],"date-time":"2022-08-31T01:08:55Z","timestamp":1661908135000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/20\/6\/2200\/5095727"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,9,14]]},"references-count":141,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2018,9,14]]},"published-print":{"date-parts":[[2019,11,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bby081","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,11]]},"published":{"date-parts":[[2018,9,14]]}}}