{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:50Z","timestamp":1772138090775,"version":"3.50.1"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2018,10,29]],"date-time":"2018-10-29T00:00:00Z","timestamp":1540771200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010269","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["WT099772"],"award-info":[{"award-number":["WT099772"]}],"id":[{"id":"10.13039\/100010269","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000265","name":"MRC","doi-asserted-by":"publisher","award":["MC_UU_00002\/4"],"award-info":[{"award-number":["MC_UU_00002\/4"]}],"id":[{"id":"10.13039\/501100000265","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Methods for analysis of GWAS summary statistics have encouraged data sharing and democratized the analysis of different diseases. Ideal validation for such methods is application to simulated data, where some \u2018truth\u2019 is known. As GWAS increase in size, so does the computational complexity of such evaluations; standard practice repeatedly simulates and analyses genotype data for all individuals in an example study.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We have developed a novel method based on an alternative approach, directly simulating GWAS summary data, without individual data as an intermediate step. 
We mathematically derive the expected statistics for any set of causal variants and their effect sizes, conditional upon control haplotype frequencies (available from public reference datasets). Simulation of GWAS summary output can be conducted independently of sample size by simulating random variates about these expected values. Across a range of scenarios, our method produces very similar output to that from simulating individual genotypes with a substantial gain in speed even for modest sample sizes. Fast simulation of GWAS summary statistics will enable more complete and rapid evaluation of summary statistic methods as well as opening new potential avenues of research in fine mapping and gene set enrichment analysis.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Our method is available under a GPL license as an R package from http:\/\/github.com\/chr1swallace\/simGWAS.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty898","type":"journal-article","created":{"date-parts":[[2018,10,26]],"date-time":"2018-10-26T15:50:14Z","timestamp":1540569014000},"page":"1901-1906","source":"Crossref","is-referenced-by-count":29,"title":["simGWAS: a fast method for simulation of large scale case\u2013control GWAS summary statistics"],"prefix":"10.1093","volume":"35","author":[{"given":"Mary D","family":"Fortune","sequence":"first","affiliation":[{"name":"MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK"},{"name":"Department of Medicine, University of Cambridge, Addenbrooke\u2019s Hospital, Cambridge, 
UK"}]},{"given":"Chris","family":"Wallace","sequence":"additional","affiliation":[{"name":"MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK"},{"name":"Department of Medicine, University of Cambridge, Addenbrooke\u2019s Hospital, Cambridge, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,10,29]]},"reference":[{"key":"2023012713225544100_bty898-B1","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","year":"2015","journal-title":"Nature"},{"key":"2023012713225544100_bty898-B2","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1093\/bioinformatics\/btv546","article-title":"Approximately independent linkage disequilibrium blocks in human populations","volume":"32","author":"Berisa","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012713225544100_bty898-B3","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1038\/ng.3211","article-title":"LD Score regression distinguishes confounding from polygenicity in genome-wide association studies","volume":"47","author":"Bulik-Sullivan","year":"2015","journal-title":"Nat. Genet."},{"key":"2023012713225544100_bty898-B4","first-page":"3342","article-title":"VSEAMS: a pipeline for variant set enrichment analysis using summary GWAS data identifies IKZF3, BATF and ESRRA as key transcription factors in type 1 diabetes","volume":"30","author":"Burren","year":"2014","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023012713225544100_bty898-B5","first-page":"1593","article-title":"An atlas of genetic associations in UK Biobank","volume-title":"Nat. 
Genet.","author":"Canela-Xandri","year":"2018"},{"key":"2023012713225544100_bty898-B6","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1534\/genetics.115.176107","article-title":"Fine mapping causal variants with an approximate bayesian method using marginal test statistics","volume":"200","author":"Chen","year":"2015","journal-title":"Genetics"},{"key":"2023012713225544100_bty898-B7","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1002\/gepi.22092","article-title":"Predictive accuracy of combined genetic and environmental risk scores","volume":"42","author":"Dudbridge","year":"2018","journal-title":"Genet. Epidemiol."},{"key":"2023012713225544100_bty898-B8","doi-asserted-by":"crossref","first-page":"e41018","DOI":"10.1371\/journal.pone.0041018","article-title":"Comparison of methods for competitive tests of pathway analysis","volume":"7","author":"Evangelou","year":"2012","journal-title":"PLoS One"},{"key":"2023012713225544100_bty898-B9","doi-asserted-by":"crossref","first-page":"e1004383","DOI":"10.1371\/journal.pgen.1004383","article-title":"Bayesian test for colocalisation between pairs of genetic association studies using summary statistics","volume":"10","author":"Giambartolomei","year":"2014","journal-title":"PLoS Genet."},{"key":"2023012713225544100_bty898-B10","doi-asserted-by":"crossref","first-page":"e1004722","DOI":"10.1371\/journal.pgen.1004722","article-title":"Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies","volume":"10","author":"Kichaev","year":"2014","journal-title":"PLoS Genet."},{"key":"2023012713225544100_bty898-B11","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1093\/bioinformatics\/btm549","article-title":"GWAsimulator: a rapid whole-genome simulation program","volume":"24","author":"Li","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012713225544100_bty898-B12","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1038\/ng2088","article-title":"A new 
multipoint method for genome-wide association studies by imputation of genotypes","volume":"39","author":"Marchini","year":"2007","journal-title":"Nat. Genet."},{"key":"2023012713225544100_bty898-B13","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4899-3244-0","volume-title":"Generalized Linear Models. Generalized Linear Models","author":"McCullagh","year":"1983"},{"key":"2023012713225544100_bty898-B14","doi-asserted-by":"crossref","first-page":"2951","DOI":"10.1093\/bioinformatics\/bty197","article-title":"Phenotypesimulator: a comprehensive framework for simulating multi-trait, multi-locus genotype to phenotype relationships","volume":"34","author":"Meyer","year":"2018","journal-title":"Bioinformatics"},{"key":"2023012713225544100_bty898-B15","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1038\/nature24284","article-title":"Association analysis identifies 65 new breast cancer risk loci","volume":"551","author":"Michailidou","year":"2017","journal-title":"Nature"},{"key":"2023012713225544100_bty898-B16","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1002\/gepi.21953","article-title":"JAM: a Scalable Bayesian Framework for joint analysis of marginal SNP effects","volume":"40","author":"Newcombe","year":"2016","journal-title":"Genet. 
Epidemiol."},{"key":"2023012713225544100_bty898-B17","doi-asserted-by":"crossref","first-page":"e1000665","DOI":"10.1371\/journal.pgen.1000665","article-title":"Public access to genome-wide data: five views on balancing research with privacy and protection","volume":"5","year":"2009","journal-title":"PLoS Genet."},{"key":"2023012713225544100_bty898-B18","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.2307\/2533494","article-title":"From genotypes to genes: doubling the sample size","volume":"53","author":"Sasieni","year":"1997","journal-title":"Biometrics"},{"key":"2023012713225544100_bty898-B19","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/s12859-017-2004-2","article-title":"Simulating autosomal genotypes with realistic linkage disequilibrium and a spiked-in genetic effect","volume":"19","author":"Shi","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023012713225544100_bty898-B20","doi-asserted-by":"crossref","first-page":"2304","DOI":"10.1093\/bioinformatics\/btr341","article-title":"HAPGEN2: simulation of multiple disease SNPs","volume":"27","author":"Su","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012713225544100_bty898-B21","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.ajhg.2015.05.016","article-title":"Disentangling the effects of colocalizing genomic annotations to functionally prioritize non-coding variants within complex-trait loci","volume":"97","author":"Trynka","year":"2015","journal-title":"Am. J. Human Genet."},{"key":"2023012713225544100_bty898-B22","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.ajhg.2017.06.005","article-title":"10 Years of GWAS discovery: biology, function, and translation","volume":"101","author":"Visscher","year":"2017","journal-title":"Am. J. Hum. 
Genet."},{"key":"2023012713225544100_bty898-B23","doi-asserted-by":"crossref","first-page":"e1005272","DOI":"10.1371\/journal.pgen.1005272","article-title":"Dissection of a complex disease susceptibility region using a Bayesian Stochastic Search Approach to fine mapping","volume":"11","author":"Wallace","year":"2015","journal-title":"PLoS Genet."},{"key":"2023012713225544100_bty898-B24","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1038\/ng.3768","article-title":"Genome-wide association analysis identifies novel blood pressure loci and offers biological insights into cardiovascular risk","volume":"49","author":"Warren","year":"2017","journal-title":"Nat. Genet."},{"key":"2023012713225544100_bty898-B25","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1038\/ng.3538","article-title":"Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets","volume":"48","author":"Zhu","year":"2016","journal-title":"Nat. Genet."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/11\/1901\/48934839\/bioinformatics_35_11_1901.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/11\/1901\/48934839\/bioinformatics_35_11_1901.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T08:31:50Z","timestamp":1693989110000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/11\/1901\/5146346"}},"subtitle":[],"editor":[{"given":"Oliver","family":"Stegle","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,10,29]]},"references-count":25,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2019,6,1]]}},"URL":"https:\/\/doi.org\
/10.1093\/bioinformatics\/bty898","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/313023","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,6,1]]},"published":{"date-parts":[[2018,10,29]]}}}