{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T16:25:31Z","timestamp":1764174331005},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2016,12,26]],"date-time":"2016-12-26T00:00:00Z","timestamp":1482710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"name":"CSoI fellowship during the course of this work","award":["CCF-0939370"],"award-info":[{"award-number":["CCF-0939370"]}]},{"name":"NIH","award":["HG008140"],"award-info":[{"award-number":["HG008140"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis (PCA) and multidimensional scaling, or using explicit spatial probabilistic models of allele frequency evolution. We develop a general probabilistic model and an associated inference algorithm that unify the model-based and data-driven approaches to visualizing and inferring population structure. Our spatial inference algorithm can also be effectively applied to the problem of population stratification in genome-wide association studies (GWAS), where hidden population structure can create fictitious associations when population ancestry is correlated with both the genotype and the trait.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our algorithm Geographic Ancestry Positioning (GAP) relates local genetic distances between samples to their spatial distances, and can be used for visually discerning population structure as well as accurately inferring the spatial origin of individuals on a two-dimensional continuum. On both simulated and several real datasets from diverse human populations, GAP exhibits substantially lower error in reconstructing spatial ancestry coordinates compared to PCA. We also develop an association test that uses the ancestry coordinates inferred by GAP to accurately account for ancestry-induced correlations in GWAS. Based on simulations and analysis of a dataset of 10 metabolic traits measured in a Northern Finland cohort, which is known to exhibit significant population structure, we find that our method has superior power to current approaches.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>Our software is available at https:\/\/github.com\/anand-bhaskar\/gap.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw720","type":"journal-article","created":{"date-parts":[[2016,11,11]],"date-time":"2016-11-11T12:05:57Z","timestamp":1478865957000},"page":"879-885","source":"Crossref","is-referenced-by-count":7,"title":["Novel probabilistic models of spatial genetic ancestry with applications to stratification correction in genome-wide association studies"],"prefix":"10.1093","volume":"33","author":[{"given":"Anand","family":"Bhaskar","sequence":"first","affiliation":[{"name":"Department of Genetics, Stanford University, Stanford, CA, USA"},{"name":"Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA"}]},{"given":"Adel","family":"Javanmard","sequence":"additional","affiliation":[{"name":"Marshall School of Business, University of Southern California, Los Angeles, CA, USA"}]},{"given":"Thomas A","family":"Courtade","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA"}]},{"given":"David","family":"Tse","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA"},{"name":"Department of Electrical Engineering, Stanford University, Stanford, CA, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,12,26]]},"reference":[{"key":"2023020204511772000_btw720-B1","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"1000 Genomes Project Consortium","year":"2010","journal-title":"Nature"},{"key":"2023020204511772000_btw720-B2","doi-asserted-by":"crossref","first-page":"905","DOI":"10.1089\/cmb.2015.0080","article-title":"A note on the relations between spatio-genetic models","volume":"22","author":"Baran","year":"2015","journal-title":"J. Comput. Biol"},{"key":"2023020204511772000_btw720-B3","doi-asserted-by":"crossref","first-page":"e1005703\u2013e1005703.","DOI":"10.1371\/journal.pgen.1005703","article-title":"A spatial framework for understanding population structure and admixture","volume":"12","author":"Bradburd","year":"2016","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B4","doi-asserted-by":"crossref","first-page":"868","DOI":"10.1038\/ng1607","article-title":"Demonstrating stratification in a European American population","volume":"37","author":"Campbell","year":"2005","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B5","volume-title":"The History and Geography of Human Genes","author":"Cavalli-Sforza","year":"1994"},{"key":"2023020204511772000_btw720-B6","doi-asserted-by":"crossref","first-page":"e1000500.","DOI":"10.1371\/journal.pgen.1000500","article-title":"The role of geography in human adaptation","volume":"5","author":"Coop","year":"2009","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B7","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1111\/j.0006-341X.1999.00997.x","article-title":"Genomic control for association studies","volume":"55","author":"Devlin","year":"1999","journal-title":"Biometrics"},{"key":"2023020204511772000_btw720-B8","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1093\/bioinformatics\/btv641","article-title":"Probabilistic models of genetic variation in structured populations applied to global human studies","volume":"32","author":"Hao","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020204511772000_btw720-B9","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1126\/science.1243518","article-title":"A genetic atlas of human admixture history","volume":"343","author":"Hellenthal","year":"2014","journal-title":"Science"},{"key":"2023020204511772000_btw720-B10","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1038\/nature06742","article-title":"Genotype, haplotype and copy-number variation in worldwide human populations","volume":"451","author":"Jakobsson","year":"2008","journal-title":"Nature"},{"key":"2023020204511772000_btw720-B11","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1093\/molbev\/mss259","article-title":"Anisotropic isolation by distance: the main orientations of human genetic differentiation","volume":"30","author":"Jay","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023020204511772000_btw720-B12","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1038\/ng.548","article-title":"Variance component model to account for sample structure in genome-wide association studies","volume":"42","author":"Kang","year":"2010","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B13","doi-asserted-by":"crossref","first-page":"1241","DOI":"10.1016\/j.cub.2008.07.049","article-title":"Correlation between genetic and geographic structure in Europe","volume":"18","author":"Lao","year":"2008","journal-title":"Curr. Biol"},{"key":"2023020204511772000_btw720-B14","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1038\/nature13673","article-title":"Ancient human genomes suggest three ancestral populations for present-day Europeans","volume":"513","author":"Lazaridis","year":"2014","journal-title":"Nature"},{"key":"2023020204511772000_btw720-B15","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1093\/genetics\/74.1.175","article-title":"Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms","volume":"74","author":"Lewontin","year":"1973","journal-title":"Genetics"},{"key":"2023020204511772000_btw720-B16","doi-asserted-by":"crossref","first-page":"e1000686.","DOI":"10.1371\/journal.pgen.1000686","article-title":"A genealogical interpretation of principal components analysis","volume":"5","author":"McVean","year":"2009","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B17","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.ajhg.2008.08.005","article-title":"The Population Reference Sample, POPRES: a resource for population, disease, and pharmacological genetics research","volume":"83","author":"Nelson","year":"2008","journal-title":"Am. J. Hum. Genet"},{"key":"2023020204511772000_btw720-B18","doi-asserted-by":"crossref","first-page":"646","DOI":"10.1038\/ng.139","article-title":"Interpreting principal component analyses of spatial population genetic variation","volume":"40","author":"Novembre","year":"2008","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B19","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1038\/nature07331","article-title":"Genes mirror geography within Europe","volume":"456","author":"Novembre","year":"2008","journal-title":"Nature"},{"key":"2023020204511772000_btw720-B20","doi-asserted-by":"crossref","first-page":"1672","DOI":"10.1371\/journal.pgen.0030160","article-title":"PCA-correlated SNPs for structure identification in worldwide human populations","volume":"3","author":"Paschou","year":"2007","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B21","doi-asserted-by":"crossref","first-page":"e190.","DOI":"10.1371\/journal.pgen.0020190","article-title":"Population structure and eigenanalysis","volume":"2","author":"Patterson","year":"2006","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B22","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1093\/biomet\/66.3.403","article-title":"Logistic disease incidence models and case\u2013control studies","volume":"66","author":"Prentice","year":"1979","journal-title":"Biometrika"},{"key":"2023020204511772000_btw720-B23","doi-asserted-by":"crossref","first-page":"904","DOI":"10.1038\/ng1847","article-title":"Principal components analysis corrects for stratification in genome-wide association studies","volume":"38","author":"Price","year":"2006","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B24","doi-asserted-by":"crossref","first-page":"e1000519.","DOI":"10.1371\/journal.pgen.1000519","article-title":"Sensitive detection of chromosomal segments of distinct ancestry in admixed populations","volume":"5","author":"Price","year":"2009","journal-title":"PLoS Genet"},{"key":"2023020204511772000_btw720-B25","doi-asserted-by":"crossref","first-page":"15942","DOI":"10.1073\/pnas.0507611102","article-title":"Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa","volume":"102","author":"Ramachandran","year":"2005","journal-title":"pnas"},{"key":"2023020204511772000_btw720-B26","doi-asserted-by":"crossref","first-page":"2915","DOI":"10.1093\/bioinformatics\/btu418","article-title":"Fast spatial ancestry via flexible allele frequency surfaces","volume":"30","author":"Ra\u00f1ola","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020204511772000_btw720-B27","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1038\/ng.271","article-title":"Genome-wide association analysis of metabolic traits in a birth cohort from a founder population","volume":"41","author":"Sabatti","year":"2009","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B28","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1093\/biomet\/66.3.605","article-title":"On optimal and data-based histograms","volume":"66","author":"Scott","year":"1979","journal-title":"Biometrika"},{"key":"2023020204511772000_btw720-B29","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1038\/ng.3244","article-title":"Testing for genetic associations in arbitrarily structured populations","volume":"47","author":"Song","year":"2015","journal-title":"Nat. Genet"},{"key":"2023020204511772000_btw720-B30","doi-asserted-by":"crossref","first-page":"2319","DOI":"10.1126\/science.290.5500.2319","article-title":"A global geometric framework for nonlinear dimensionality reduction","volume":"290","author":"Tenenbaum","year":"2000","journal-title":"Science"},{"key":"2023020204511772000_btw720-B31","doi-asserted-by":"crossref","first-page":"14847","DOI":"10.1073\/pnas.0403170101","article-title":"Assigning African elephant DNA to geographic region of origin: applications to the ivory trade","volume":"101","author":"Wasser","year":"2004","journal-title":"PNAS"},{"key":"2023020204511772000_btw720-B32","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/ng.2285","article-title":"A model-based approach for analysis of spatial structure in genetic data","volume":"44","author":"Yang","year":"2012","journal-title":"Nat. Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/6\/879\/49038209\/bioinformatics_33_6_879.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/6\/879\/49038209\/bioinformatics_33_6_879.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T04:54:06Z","timestamp":1675313646000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/6\/879\/2736312"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,12,26]]},"references-count":32,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2017,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw720","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,3,15]]},"published":{"date-parts":[[2016,12,26]]}}}