{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,28]],"date-time":"2023-10-28T15:30:26Z","timestamp":1698507026576},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"17","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Summary: For many genome-wide association (GWA) studies individually genotyping one million or more SNPs provides a marginal increase in coverage at a substantial cost. Much of the information gained is redundant due to the correlation structure inherent in the human genome. Pooling-based GWA studies could benefit significantly by utilizing this redundancy to reduce noise, improve the accuracy of the observations and increase genomic coverage. We introduce a measure of correlation between individual genotyping and pooling, under the same framework that r2 provides a measure of linkage disequilibrium (LD) between pairs of SNPs. We then report a new non-haplotype multimarker multi-loci method that leverages the correlation structure between SNPs in the human genome to increase the efficacy of pooling-based GWA studies. We first give a theoretical framework and derivation of our multimarker method. Next, we evaluate simulations using this multimarker approach in comparison to single marker analysis. Finally, we experimentally evaluate our method using different pools of HapMap individuals on the Illumina 450S Duo, Illumina 550K and Affymetrix 5.0 platforms for a combined total of 1 333 631 SNPs. Our results show that use of multimarker analysis reduces noise specific to pooling-based studies, allows for efficient integration of multiple microarray platforms and provides more accurate measures of significance than single marker analysis. Additionally, this approach can be extended to allow for imputing the association significance for SNPs not directly observed using neighboring SNPs in LD. This multimarker method can now be used to cost-effectively complete pooling-based GWA studies with multiple platforms across over one million SNPs and to impute neighboring SNPs weighted for the loss of information due to pooling.<\/jats:p>\n               <jats:p>Contact: \u00a0dcraig@tgen.org<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn333","type":"journal-article","created":{"date-parts":[[2008,7,11]],"date-time":"2008-07-11T00:25:11Z","timestamp":1215735911000},"page":"1896-1902","source":"Crossref","is-referenced-by-count":16,"title":["Multimarker analysis and imputation of multiple platform pooling-based genome-wide association studies"],"prefix":"10.1093","volume":"24","author":[{"given":"Nils","family":"Homer","sequence":"first","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"},{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"Waibhav D.","family":"Tembe","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"Szabolcs","family":"Szelinger","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"Margot","family":"Redman","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"Dietrich A.","family":"Stephan","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"John V.","family":"Pearson","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"Stanley F.","family":"Nelson","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]},{"given":"David","family":"Craig","sequence":"additional","affiliation":[{"name":"1 Translational Genomics Research Institute (TGen), Phoenix, AZ 85004 and 2Department of Computer Science, University of California, Los Angeles CA 90095-7088, USA"}]}],"member":"286","published-online":{"date-parts":[[2008,7,10]]},"reference":[{"key":"2023020211092463900_B1","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1046\/j.1469-1809.2002.00125.x","article-title":"Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design","volume":"66","author":"Barratt","year":"2002","journal-title":"Ann. Hum. Genet"},{"key":"2023020211092463900_B2","doi-asserted-by":"crossref","DOI":"10.1038\/ng.163","article-title":"Common sequence variants on 20q11.22 confer melanoma susceptibility","author":"Brown","year":"2008","journal-title":"Nat. Genet."},{"key":"2023020211092463900_B3","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1186\/1471-2164-6-138","article-title":"Identification of disease causing loci using an array-based genotyping approach on pooled DNA","volume":"6","author":"Craig","year":"2005","journal-title":"BMC Genomics"},{"key":"2023020211092463900_B4","doi-asserted-by":"crossref","first-page":"690","DOI":"10.1002\/gepi.20180","article-title":"Imputation methods to improve inference in SNP association studies","volume":"30","author":"Dai","year":"2006","journal-title":"Genet. Epidemiol."},{"key":"2023020211092463900_B5","article-title":"A potential locus for end-stage renal disease in type 2 diabetes identified by a pooling-based genome-wide association study","volume-title":"Diabetes.","author":"Hanson","year":"2007"},{"key":"2023020211092463900_B6","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1186\/1479-7364-1-6-421","article-title":"Application of pooled genotyping to scan candidate regions for association with HDL cholesterol levels","volume":"1","author":"Hinds","year":"2004","journal-title":"Hum. Genomics"},{"key":"2023020211092463900_B7","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1007\/s004390000397","article-title":"Cheap, accurate and rapid allele frequency estimation of single nucleotide polymorphisms by primer extension and DHPLC in DNA pools","volume":"107","author":"Hoogendoorn","year":"2000","journal-title":"Hum. Genet."},{"key":"2023020211092463900_B8","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1093\/bioinformatics\/btl536","article-title":"SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays","volume":"23","author":"Hua","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020211092463900_B9","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1093\/biostatistics\/kxl028","article-title":"Bayesian method for gene detection and mapping, using a case and control design and DNA pooling","volume":"8","author":"Johnson","year":"2007","journal-title":"Biostatistics"},{"key":"2023020211092463900_B10","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btm435","article-title":"HAPLOPOOL: improving haplotype frequency estimation through DNA pools and phylogenetic modeling","author":"Kirkpatrick","year":"2007","journal-title":"Bioinformatics."},{"key":"2023020211092463900_B11","doi-asserted-by":"crossref","first-page":"3841","DOI":"10.1002\/sim.1996","article-title":"Application of DNA pooling to large studies of disease","volume":"23","author":"Law","year":"2004","journal-title":"Stat. Med."},{"key":"2023020211092463900_B12","doi-asserted-by":"crossref","first-page":"e74","DOI":"10.1093\/nar\/gnf070","article-title":"SNP genotyping on pooled DNAs: comparison of genotyping technologies and a semi automated method for data storage and analysis","volume":"30","author":"Le Hellard","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023020211092463900_B13","doi-asserted-by":"crossref","first-page":"501","DOI":"10.1038\/sj.ejhg.5201768","article-title":"Most pooling variation in array-based DNA pooling is attributable to array error rather than pool construction error","volume":"15","author":"Macgregor","year":"2007","journal-title":"Eur. J. Hum. Genet."},{"key":"2023020211092463900_B14","doi-asserted-by":"crossref","first-page":"e55","DOI":"10.1093\/nar\/gkl136","article-title":"Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates","volume":"34","author":"Macgregor","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023020211092463900_B15","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1038\/ng2088","article-title":"A new multipoint method for genome-wide association studies by imputation of genotypes","volume":"39","author":"Marchini,J.","year":"2007","journal-title":"Nat. Genet."},{"key":"2023020211092463900_B16","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1016\/j.schres.2005.01.006","article-title":"Investigation of the apolipoprotein-L (APOL) gene family and schizophrenia using a novel DNA pooling strategy for public database SNPs","volume":"76","author":"McGhee","year":"2005","journal-title":"Schizophr. Res."},{"key":"2023020211092463900_B17","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1186\/1471-2164-6-52","article-title":"Genotyping DNA pools on microarrays: tackling the QTL problem of large samples and large numbers of SNPs","volume":"6","author":"Meaburn","year":"2005","journal-title":"BMC Genomics"},{"key":"2023020211092463900_B18","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1086\/513320","article-title":"Identification of a novel risk locus for progressive supranuclear palsy by a pooled genomewide scan of 500,288 single-nucleotide polymorphisms","volume":"80","author":"Melquist","year":"2007","journal-title":"Am. J. Hum. Genet."},{"key":"2023020211092463900_B19","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1126\/science.1129837","article-title":"Common Kibra alleles are associated with human memory performance","volume":"314","author":"Papassotiropoulos","year":"2006","journal-title":"Science"},{"key":"2023020211092463900_B20","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1086\/510686","article-title":"Identification of the genetic basis for complex disorders by use of pooling-based genomewide single-nucleotide-polymorphism association studies","volume":"80","author":"Pearson","year":"2007","journal-title":"Am. J. Hum. Genet."},{"key":"2023020211092463900_B21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1086\/321275","article-title":"Linkage disequilibrium in humans: models and data","volume":"69","author":"Pritchard","year":"2001","journal-title":"Am. J. Hum. Genet."},{"key":"2023020211092463900_B22","doi-asserted-by":"crossref","first-page":"e114","DOI":"10.1371\/journal.pgen.0030114","article-title":"Imputation-based analysis of association studies: candidate regions and quantitative traits","volume":"3","author":"Servin","year":"2007","journal-title":"PLoS Genet."},{"key":"2023020211092463900_B23","doi-asserted-by":"crossref","first-page":"862","DOI":"10.1038\/nrg930","article-title":"DNA Pooling: a tool for large-scale association studies","volume":"3","author":"Sham","year":"2002","journal-title":"Nat. Rev. Genet."},{"key":"2023020211092463900_B24","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/sj.gene.6364359","article-title":"Genomic DNA pooling for whole-genome association scans in complex disease: empirical demonstration of efficacy in rheumatoid arthritis","volume":"8","author":"Steer","year":"2007","journal-title":"Genes Immun."},{"key":"2023020211092463900_B25","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","year":"2007","journal-title":"Nature"},{"key":"2023020211092463900_B26","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1002\/gepi.10195","article-title":"On the use of DNA pooling to estimate haplotype frequencies","volume":"24","author":"Wang","year":"2003","journal-title":"Genet. Epidemiol."},{"key":"2023020211092463900_B27","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1534\/genetics.104.032052","article-title":"New adjustment factors and sample size calculation in a DNA-pooling experiment with preferential amplification","volume":"169","author":"Yang","year":"2005","journal-title":"Genetics"},{"key":"2023020211092463900_B28","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1186\/1471-2105-7-233","article-title":"PDA: pooled DNA analyzer","volume":"7","author":"Yang","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020211092463900_B29","doi-asserted-by":"crossref","first-page":"683","DOI":"10.1086\/513109","article-title":"Leveraging the HapMap correlation structure in association studies","volume":"80","author":"Zaitlen","year":"2007","journal-title":"Am. J. Hum. Genet"},{"key":"2023020211092463900_B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/gepi.10277","article-title":"The impacts of errors in individual genotyping and DNA pooling on association studies","volume":"26","author":"Zou","year":"2004","journal-title":"Genet. Epidemiol"},{"key":"2023020211092463900_B31","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1046\/j.1529-8817.2005.00164.x","article-title":"Family-based association tests for different family structures using pooled DNA","volume":"69","author":"Zou","year":"2005","journal-title":"Ann. Hum. Genet"},{"key":"2023020211092463900_B32","doi-asserted-by":"crossref","first-page":"1747","DOI":"10.1534\/genetics.105.042648","article-title":"Two-stage designs in case-control association analysis","volume":"173","author":"Zuo","year":"2006","journal-title":"Genetics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/17\/1896\/49050947\/bioinformatics_24_17_1896.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/17\/1896\/49050947\/bioinformatics_24_17_1896.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T13:10:30Z","timestamp":1675343430000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/17\/1896\/262021"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,10]]},"references-count":32,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2008,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn333","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,9,1]]},"published":{"date-parts":[[2008,7,10]]}}}