{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T08:45:04Z","timestamp":1778661904363,"version":"3.51.4"},"reference-count":10,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2016,3,9]],"date-time":"2016-03-09T00:00:00Z","timestamp":1457481600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2016,3,9]],"date-time":"2016-03-09T00:00:00Z","timestamp":1457481600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U19CA148127"],"award-info":[{"award-number":["U19CA148127"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Identifying subpopulations within a study and inferring intercontinental ancestry of the samples are important steps in genome wide association studies. Two software packages are widely used in analysis of substructure: Structure and Eigenstrat. Structure assigns each individual to a population by using a Bayesian method with multiple tuning parameters. It requires considerable computational time when dealing with thousands of samples and lacks the ability to create scores that could be used as covariates. Eigenstrat uses a principal component analysis method to model all sources of sampling variation. 
However, it does not readily provide information directly relevant to ancestral origin; the eigenvectors generated by Eigenstrat are sample-specific and thus cannot be generalized to other individuals.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We developed FastPop, an efficient R package that fills the gap between Structure and Eigenstrat. It can (1) generate PCA scores that identify ancestral origins and can be used across multiple studies, and (2) infer ancestry information for data arising from two or more intercontinental origins. We demonstrate the use of FastPop using 2318 SNP markers selected from the genome based on high variability among European, Asian and West African (African) populations. We conducted an analysis of 505 HapMap samples with European, African or Asian ancestry along with 19661 additional samples of unknown ancestry. The results from FastPop are highly consistent with those obtained by Structure across the 19661 samples we studied. The correlations of the results between FastPop and Structure are 0.99, 0.97 and 0.99 for European, African and Asian ancestry scores, respectively. Compared with Structure, FastPop is more efficient: it finished ancestry inference for 19661 samples in 16\u00a0min, compared with the 21\u201324\u00a0h required by Structure. FastPop also provides scores based on SNP weights, so the scores of the reference population can be applied to other studies, provided the same set of markers is used. We also present an application of the method to four continental populations (European, Asian, African, and Native American).<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>We developed an algorithm that can infer ancestries on data involving two or more intercontinental origins. It is efficient for analyzing large datasets. 
Additionally the PCA derived scores can be applied to multiple data sets to ensure the same ancestry analysis is applied to all studies.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-016-0965-1","type":"journal-article","created":{"date-parts":[[2016,3,9]],"date-time":"2016-03-09T03:52:20Z","timestamp":1457495540000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":49,"title":["FastPop: a rapid principal component derived method to infer intercontinental ancestry using genetic data"],"prefix":"10.1186","volume":"17","author":[{"given":"Yafang","family":"Li","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinyoung","family":"Byun","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guoshuai","family":"Cai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangjun","family":"Xiao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Younghun","family":"Han","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olivier","family":"Cornelis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James E.","family":"Dinulos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joe","family":"Dennis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Douglas","family":"Easton","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ivan","family":"Gorlov","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael 
F.","family":"Seldin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christopher I.","family":"Amos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2016,3,9]]},"reference":[{"key":"965_CR1","doi-asserted-by":"publisher","first-page":"997","DOI":"10.1111\/j.0006-341X.1999.00997.x","volume":"55","author":"B Devlin","year":"1999","unstructured":"Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55:997\u20131004.","journal-title":"Biometrics"},{"key":"965_CR2","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1093\/genetics\/155.2.945","volume":"155","author":"JK Pritchard","year":"2000","unstructured":"Pritchard JK et al. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945\u201359.","journal-title":"Genetics"},{"key":"965_CR3","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1006\/tpbi.2001.1543","volume":"60","author":"JK Pritchard","year":"2001","unstructured":"Pritchard JK. Case\u2013control studies of association in structured or admixed populations. Theor Popul Biol. 2001;60:227\u201337.","journal-title":"Theor Popul Biol"},{"key":"965_CR4","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1126\/science.356262","volume":"201","author":"P Menozzi","year":"1978","unstructured":"Menozzi P et al. Synthetic maps of human gene frequencies in Europeans. Science. 1978;201:786\u201392.","journal-title":"Science"},{"key":"965_CR5","first-page":"2074","volume":"4","author":"N Patterson","year":"2006","unstructured":"Patterson N, Price A, Reich D. Population structure and eigenanalysis. PLoS Genet. 2006;4:2074\u201393.","journal-title":"PLoS Genet"},{"key":"965_CR6","doi-asserted-by":"publisher","first-page":"904","DOI":"10.1038\/ng1847","volume":"38","author":"AL Price","year":"2006","unstructured":"Price AL et al. 
Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904\u20139.","journal-title":"Nat Genet"},{"issue":"9","key":"965_CR7","doi-asserted-by":"publisher","first-page":"1679","DOI":"10.1101\/gr.2529604","volume":"14","author":"D Serre","year":"2004","unstructured":"Serre D, Paabo S. Evidence for gradients of human genetic diversity within and among continents. Genome Res. 2004;14(9):1679\u201385.","journal-title":"Genome Res"},{"issue":"6","key":"965_CR8","doi-asserted-by":"publisher","first-page":"926","DOI":"10.1016\/j.ajhg.2015.04.018","volume":"96","author":"C Wang","year":"2015","unstructured":"Wang C, Zhan X, Liang L, Abecasis GR, Lin X. Improved ancestry estimation for both genotyping and sequencing data using projection procrustes analysis and genotype imputation. Am J Hum Genet. 2015;96(6):926\u201337.","journal-title":"Am J Hum Genet"},{"issue":"4","key":"965_CR9","doi-asserted-by":"publisher","first-page":"e93766","DOI":"10.1371\/journal.pone.0093766","volume":"9","author":"G Abraham","year":"2014","unstructured":"Abraham G, Inouye M. Fast principal component analysis of large-scale genome-wide data. PLoS ONE. 2014;9(4):e93766. doi:10.1371\/journal.pone.0093766.","journal-title":"PLoS ONE"},{"key":"965_CR10","doi-asserted-by":"crossref","unstructured":"Galinsky KJ, Bhatia G, Loh P, Georgiev S, Mukherjee S, et al. Fast principal components analysis reveals independent evolution of ADH1B gene in Europe and East Asia. 2015. 
doi: http:\/\/dx.doi.org\/10.1101\/018143","DOI":"10.1101\/018143"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-0965-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-016-0965-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-0965-1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-0965-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T17:59:24Z","timestamp":1706810364000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-016-0965-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,3,9]]},"references-count":10,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2016,12]]}},"alternative-id":["965"],"URL":"https:\/\/doi.org\/10.1186\/s12859-016-0965-1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,3,9]]},"assertion":[{"value":"14 July 2015","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2016","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2016","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"122"}}