{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,29]],"date-time":"2025-01-29T05:49:06Z","timestamp":1738129746899,"version":"3.33.0"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Inferring population structures using genetic data sampled from a group of individuals is a challenging task. Many methods either consider a fixed population number or ignore the correlation between populations. As a result, they can lose sensitivity and specificity in detecting subtle stratifications. In addition, when a large number of genetic markers are used, many existing algorithms perform rather inefficiently.<\/jats:p><jats:p>Result: We propose a new Bayesian method to infer population structures using multiple unlinked single nucleotide polymorphisms (SNPs). Our approach explicitly considers the population correlation through a tree hierarchy, and treats the population number as a random variable. Using both simulated and real datasets of worldwide samples, we demonstrate that an incorporated tree can consistently improve the power in detecting subtle population stratifications. A tree-based model often involves a large number of unknown parameters, and the corresponding estimation procedure can be highly inefficient. We further implement a partition method to analytically integrate out all nuisance parameters in the tree. 
As a result, our method can analyze large SNP datasets with significantly improved convergence rate.<\/jats:p><jats:p>Availability: \u00a0http:\/\/www.stat.psu.edu\/~yuzhang\/tips.tar<\/jats:p><jats:p>Contact: \u00a0yuzhang@stat.psu.edu<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn070","type":"journal-article","created":{"date-parts":[[2008,2,23]],"date-time":"2008-02-23T01:34:37Z","timestamp":1203730477000},"page":"965-971","source":"Crossref","is-referenced-by-count":5,"title":["Tree-guided Bayesian inference of population structures"],"prefix":"10.1093","volume":"24","author":[{"given":"Yu","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Statistics, The Pennsylvania State University, State College, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2008,2,22]]},"reference":[{"key":"2023020209531958600_B1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/BF01441146","article-title":"A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity","volume":"96","author":"Balding","year":"1995","journal-title":"Genetica"},{"key":"2023020209531958600_B2","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1093\/genetics\/163.1.367","article-title":"Bayesian analysis of genetic differentiation between populations","volume":"163","author":"Corander","year":"2003","journal-title":"Genetics"},{"key":"2023020209531958600_B3","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1017\/S001667230100502X","article-title":"A Bayesian approach to the identification of panmictic populations and the assignment of individuals","volume":"78","author":"Dawson","year":"2001","journal-title":"Genet. 
Res"},{"key":"2023020209531958600_B4","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1111\/j.0006-341X.1999.00997.x","article-title":"Genomic control for association studies","volume":"55","author":"Devlin","year":"1999","journal-title":"Biometrics"},{"key":"2023020209531958600_B5","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1111\/j.2517-6161.1994.tb01985.x","article-title":"Estimation of finite mixture distributions through Bayesian sampling","volume":"56","author":"Diebolt","year":"1994","journal-title":"J. Roy. Stat. Soc. Ser. B"},{"key":"2023020209531958600_B6","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1177\/1359786806066041","article-title":"Using ancestry-informative markers to define populations and detect population stratification","volume":"20","author":"Enoch","year":"2006","journal-title":"J. Psychopharmacol"},{"key":"2023020209531958600_B7","doi-asserted-by":"crossref","first-page":"2611","DOI":"10.1111\/j.1365-294X.2005.02553.x","article-title":"Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study","volume":"14","author":"Evanno","year":"2005","journal-title":"Mol. Ecol"},{"key":"2023020209531958600_B8","doi-asserted-by":"crossref","first-page":"1567","DOI":"10.1093\/genetics\/164.4.1567","article-title":"Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies","volume":"164","author":"Falush","year":"2003","journal-title":"Genetics"},{"key":"2023020209531958600_B9","doi-asserted-by":"crossref","first-page":"27","DOI":"10.2307\/2412810","article-title":"The number of evolutionary trees","volume":"27","author":"Felsenstein","year":"1978","journal-title":"Syst. 
Zool"},{"key":"2023020209531958600_B10","doi-asserted-by":"crossref","first-page":"805","DOI":"10.1534\/genetics.106.059923","article-title":"Bayesian clustering using hidden Markov random fields in spatial population genetics","volume":"174","author":"Francois","year":"2006","journal-title":"Genetics"},{"key":"2023020209531958600_B11","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1038\/ng1333","article-title":"Assessing the impact of population stratification on genetic association studies","volume":"36","author":"Freedman","year":"2004","journal-title":"Nat. Genet"},{"key":"2023020209531958600_B12","doi-asserted-by":"crossref","first-page":"711","DOI":"10.1093\/biomet\/82.4.711","article-title":"Reversible jump Markov chain Monte Carlo computation and Bayesian model determination","volume":"82","author":"Green","year":"1995","journal-title":"Biometrika"},{"key":"2023020209531958600_B13","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1086\/521372","article-title":"A randomization test for controlling population stratification in whole-genome association studies","volume":"81","author":"Kimmel","year":"2007","journal-title":"Am J Hum Genet"},{"key":"2023020209531958600_B14","first-page":"98","article-title":"Case-control association tests correcting for population stratification","volume":"69","author":"Kohler","year":"2005","journal-title":"Am. J. Hum. 
Genet"},{"key":"2023020209531958600_B15","doi-asserted-by":"crossref","first-page":"2037","DOI":"10.1126\/science.8091226","article-title":"Genetic dissection of complex traits","volume":"265","author":"Lander","year":"1994","journal-title":"Science"},{"volume-title":"Monte Carlo Strategies in Scientific Computing","year":"2001","author":"Liu","key":"2023020209531958600_B16"},{"key":"2023020209531958600_B17","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1038\/ng1337","article-title":"The effects of human population structure on large genetic association studies","volume":"36","author":"Marchini","year":"2004","journal-title":"Nat. Genet"},{"key":"2023020209531958600_B18","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1139\/f05-224","article-title":"The Gibbs and split\u2013merge sampler for population mixture analysis from genetic data with incomplete baselines","volume":"63","author":"Pella","year":"2006","journal-title":"Can. J. Fish. Aquat. Sci"},{"key":"2023020209531958600_B19","doi-asserted-by":"crossref","first-page":"904","DOI":"10.1038\/ng1847","article-title":"Principal components analysis corrects for stratification in genome-wide association studies","volume":"38","author":"Price","year":"2006","journal-title":"Nat. 
Genet"},{"key":"2023020209531958600_B20","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1093\/genetics\/155.2.945","article-title":"Inference of population structure using multilocus genotype data","volume":"155","author":"Pritchard","year":"2000","journal-title":"Genetics"},{"key":"2023020209531958600_B21","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1086\/519795","article-title":"PLINK: a toolset for whole-genome association and population-based linkage analysis","volume":"81","author":"Purcell","year":"2007","journal-title":"Am J Hum Genet"},{"key":"2023020209531958600_B22","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1534\/genetics.105.055335","article-title":"A general population-genetic model for the production by population structure of spurious genotype\u2013phenotype associations in discrete, admixed or spatially distributed populations","volume":"173","author":"Rosenberg","year":"2006","journal-title":"Genetics"},{"key":"2023020209531958600_B23","doi-asserted-by":"crossref","first-page":"2981","DOI":"10.1126\/science.1078311","article-title":"Genetic structure of human populations","volume":"298","author":"Rosenberg","year":"2002","journal-title":"Science"},{"key":"2023020209531958600_B24","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1086\/318195","article-title":"Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model","volume":"68","author":"Satten","year":"2001","journal-title":"Am. J. Hum. 
Genet"},{"key":"2023020209531958600_B25","first-page":"503","article-title":"An efficient and accurate graph-based method to detect population substructure","author":"Sridhar","year":"2007"},{"volume-title":"African Exodus: The Origins of Modern Humanity","year":"1996","author":"Stringer","key":"2023020209531958600_B26"},{"key":"2023020209531958600_B27","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1038\/nature02168","article-title":"The International HapMap Project","volume":"426","author":"The International HapMap Consortium","year":"2003","journal-title":"Nature"},{"key":"2023020209531958600_B28","doi-asserted-by":"crossref","first-page":"1299","DOI":"10.1038\/nature04226","article-title":"A haplotype map of the human genome","volume":"437","author":"The International HapMap Consortium","year":"2005","journal-title":"Nature"},{"key":"2023020209531958600_B29","first-page":"513","article-title":"Population stratification: a problem for case-control studies of candidate-gene associations?","volume":"11","author":"Thomas","year":"2002","journal-title":"Cancer Epidemiol. 
Biomarkers Prev"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/7\/965\/49046427\/bioinformatics_24_7_965.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/7\/965\/49046427\/bioinformatics_24_7_965.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,28]],"date-time":"2025-01-28T18:18:09Z","timestamp":1738088289000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/7\/965\/297985"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,2,22]]},"references-count":29,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2008,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn070","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2008,4,1]]},"published":{"date-parts":[[2008,2,22]]}}}