{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T23:17:39Z","timestamp":1773011859308,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2022,5,23]],"date-time":"2022-05-23T00:00:00Z","timestamp":1653264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001732","name":"Danish National Research Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001732","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Lundbeck Foundation Fellowship","award":["R335-2019-2339"],"award-info":[{"award-number":["R335-2019-2339"]}]},{"name":"Bjarni J. Vilhj\u00e1lmsson"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,6,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Measuring genetic diversity is an important problem because increasing genetic diversity is a key to making new genetic discoveries, while also being a major source of confounding to be aware of in genetics studies.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Using the UK Biobank data, a prospective cohort study with deep genetic and phenotypic data collected on almost 500\u00a0000 individuals from across the UK, we carefully define 21 distinct ancestry groups from all four corners of the world. These ancestry groups can serve as a global reference of worldwide populations, with a handful of applications. 
Here, we develop a method that uses allele frequencies and principal components derived from these ancestry groups to effectively measure ancestry proportions from allele frequencies of any genetic dataset.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>This method is implemented in function snp_ancestry_summary of R package bigsnpr.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac348","type":"journal-article","created":{"date-parts":[[2022,5,23]],"date-time":"2022-05-23T08:37:38Z","timestamp":1653295058000},"page":"3477-3480","source":"Crossref","is-referenced-by-count":37,"title":["Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3138-2078","authenticated-orcid":false,"given":"Florian","family":"Priv\u00e9","sequence":"first","affiliation":[{"name":"National Centre for Register-based Research, Aarhus University , Aarhus 8210, Denmark"}]}],"member":"286","published-online":{"date-parts":[[2022,5,23]]},"reference":[{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","year":"2015","journal-title":"Nature"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1016\/j.ajhg.2021.05.016","article-title":"Summix: a method for detecting and adjusting for population structure in genetic summary 
data","volume":"108","author":"Arriaga-MacKenzie","year":"2021","journal-title":"Am. J. Hum. Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1038\/s41586-020-2302-0","article-title":"A positively selected FBN1 missense variant reduces height in Peruvian individuals","volume":"582","author":"Asgari","year":"2020","journal-title":"Nature"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"208","DOI":"10.32614\/RJ-2021-048","article-title":"A unifying framework for parallel and distributed processing in R using futures","volume":"13","author":"Bengtsson","year":"2021","journal-title":"R J"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","DOI":"10.1126\/science.aay5012","article-title":"Insights into human genetic variation and population history from 929 diverse genomes","volume":"367","author":"Bergstr\u00f6m","year":"2020","journal-title":"Science"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1204","DOI":"10.1007\/s00125-019-4880-7","article-title":"Genome-wide association study of type 2 diabetes in Africa","volume":"62","author":"Chen","year":"2019","journal-title":"Diabetologia"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1038\/s41588-018-0064-5","article-title":"A large electronic-health-record-based genome-wide study of serum lipids","volume":"50","author":"Hoffmann","year":"2018","journal-title":"Nat. 
Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1038\/nature09298","article-title":"Integrating common and rare genetic variation in diverse human populations","volume":"467","year":"2010","journal-title":"Nature"},{"key":"2023041408095339100_","article-title":"FinnGen: unique genetic insights from combining isolated population and national health register data","author":"Kurki","year":"2022","journal-title":"medRxiv"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/ncomms10495","article-title":"New loci for body fat percentage reveal link between adiposity and cardiometabolic disease risk","volume":"7","author":"Lu","year":"2016","journal-title":"Nat. Commun"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"D896","DOI":"10.1093\/nar\/gkw1133","article-title":"The new NHGRI-EBI catalog of published genome-wide association studies (GWAS catalog)","volume":"45","author":"MacArthur","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1038\/nature18964","article-title":"The Simons Genome Diversity Project: 300 genomes from 142 diverse populations","volume":"538","author":"Mallick","year":"2016","journal-title":"Nature"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1038\/nature24284","article-title":"Association analysis identifies 65 new breast cancer risk loci","volume":"551","author":"Michailidou","year":"2017","journal-title":"Nature"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1121","DOI":"10.1038\/ng.3396","article-title":"A comprehensive 1000 Genomes\u2013based genome-wide association meta-analysis of coronary artery disease","volume":"47","author":"Nikpay","year":"2015","journal-title":"Nat. 
Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1449","DOI":"10.1038\/ng.3424","article-title":"Multi-ethnic genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis","volume":"47","author":"Paternoster","year":"2015","journal-title":"Nat. Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"2781","DOI":"10.1093\/bioinformatics\/bty185","article-title":"Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr","volume":"34","author":"Priv\u00e9","year":"2018","journal-title":"Bioinformatics"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"4449","DOI":"10.1093\/bioinformatics\/btaa520","article-title":"Efficient toolkit implementing best practices for principal component analysis of population genetic data","volume":"36","author":"Priv\u00e9","year":"2020","journal-title":"Bioinformatics"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.ajhg.2021.11.008","article-title":"Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort","volume":"109","author":"Priv\u00e9","year":"2022","journal-title":"Am. J. Hum. Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1038\/s41588-021-00931-x","article-title":"A cross-population atlas of genetic associations for 220 human phenotypes","volume":"53","author":"Sakaue","year":"2021","journal-title":"Nat. Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1038\/s41588-018-0142-8","article-title":"Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci","volume":"50","author":"Schumacher","year":"2018","journal-title":"Nat. 
Genet"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-016-1082-x","article-title":"Efficient analysis of large datasets and sex bias with ADMIXTURE","volume":"17","author":"Shringarpure","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-021-21381-3","article-title":"Whole genome sequencing in the Middle Eastern Qatari population identifies genetic associations with 45 clinically relevant traits","volume":"12","author":"Thareja","year":"2021","journal-title":"Nat. Commun"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1038\/s41586-021-03767-x","article-title":"Mapping the human genetic architecture of COVID-19","volume":"600","year":"2021","journal-title":"Nature"},{"key":"2023041408095339100_","article-title":"Genome-wide mega-analysis identifies 16 loci and highlights diverse biological mechanisms in the common epilepsies","volume":"9","year":"2018","journal-title":"Nat. Commun"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.1038\/s41588-019-0504-x","article-title":"Target genes, variants, tissues and transcriptional pathways influencing human serum urate levels","volume":"51","author":"Tin","year":"2019","journal-title":"Nat. 
Genet"},{"key":"2023041408095339100_","volume-title":"quadprog: Functions to Solve Quadratic Programming Problems","author":"Turlach","year":"2019"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"1686","DOI":"10.21105\/joss.01686","article-title":"Welcome to the tidyverse","volume":"4","author":"Wickham","year":"2019","journal-title":"J Open Source Softw"},{"key":"2023041408095339100_","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1038\/s41586-019-1310-4","article-title":"Genetic analyses of diverse populations improves discovery for complex traits","volume":"570","author":"Wojcik","year":"2019","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac348\/43882788\/btac348.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/13\/3477\/49884085\/btac348.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/13\/3477\/49884085\/btac348.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T21:07:32Z","timestamp":1700600852000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/13\/3477\/6590645"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,5,23]]},"references-count":28,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2022,6,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac348","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.10.27.466078",
"asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,7,1]]},"published":{"date-parts":[[2022,5,23]]}}}