{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T22:36:40Z","timestamp":1759963000310},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"22","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The search for genetic variants that are linked to complex diseases such as cancer, Parkinson's, or Alzheimer's disease, may lead to better treatments. Since haplotypes can serve as proxies for hidden variants, one method of finding the linked variants is to look for case-control associations between the haplotypes and disease. Finding these associations requires a high-quality estimation of the haplotype frequencies in the population. To this end, we present HaploPool, a method of estimating haplotype frequencies from blocks of consecutive SNPs.<\/jats:p><jats:p>Results: HaploPool leverages the efficiency of DNA pools and estimates the population haplotype frequencies from pools of disjoint sets, each containing two or three unrelated individuals. We study the trade-off between pooling efficiency and accuracy of haplotype frequency estimates. For a fixed genotyping budget, HaploPool performs favorably on pools of two individuals as compared with a state-of-the-art non-pooled phasing method, PHASE. Of independent interest, HaploPool can be used to phase non-pooled genotype data with an accuracy approaching that of PHASE.<\/jats:p><jats:p>We compared our algorithm to three programs that estimate haplotype frequencies from pooled data. HaploPool is an order of magnitude more efficient (at least six times faster), and considerably more accurate than previous methods. 
In contrast to previous methods, HaploPool performs well with missing data, genotyping errors and long haplotype blocks (of between 5 and 25 SNPs).<\/jats:p><jats:p>Availability: The HaploPool software is available at: http:\/\/haplopool.icsi.berkeley.edu\/haplopool\/<\/jats:p><jats:p>Contact: \u00a0bbkirk@eecs.berkeley.edu<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm435","type":"journal-article","created":{"date-parts":[[2007,9,26]],"date-time":"2007-09-26T02:14:39Z","timestamp":1190772879000},"page":"3048-3055","source":"Crossref","is-referenced-by-count":18,"title":["H<scp>aplo<\/scp>P<scp>ool<\/scp>: improving haplotype frequency estimation through DNA pools and phylogenetic modeling"],"prefix":"10.1093","volume":"23","author":[{"given":"Bonnie","family":"Kirkpatrick","sequence":"first","affiliation":[{"name":"1 Department of Electrical Engineering and Computer Sciences, UC Berkeley, CA, 2Computer Science Department, Universidad Rey Juan Carlos, Madrid, Spain and 3International Computer Science Institute, Berkeley, CA, USA"}]},{"given":"Carlos Santos","family":"Armendariz","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering and Computer Sciences, UC Berkeley, CA, 2Computer Science Department, Universidad Rey Juan Carlos, Madrid, Spain and 3International Computer Science Institute, Berkeley, CA, USA"}]},{"given":"Richard M.","family":"Karp","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering and Computer Sciences, UC Berkeley, CA, 2Computer Science Department, Universidad Rey Juan Carlos, Madrid, Spain and 3International Computer Science Institute, Berkeley, CA, USA"},{"name":"1 Department of Electrical Engineering and Computer Sciences, UC Berkeley, CA, 2Computer Science Department, Universidad Rey Juan Carlos, Madrid, Spain and 3International Computer Science Institute, Berkeley, CA, 
USA"}]},{"given":"Eran","family":"Halperin","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering and Computer Sciences, UC Berkeley, CA, 2Computer Science Department, Universidad Rey Juan Carlos, Madrid, Spain and 3International Computer Science Institute, Berkeley, CA, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,9,25]]},"reference":[{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"734","DOI":"10.1086\/515512","article-title":"Association mapping of disease loci, by use of a pooled DNA genomic screen","volume":"61","author":"Barcellos","year":"1997","journal-title":"Am. J. Hum. Genet"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1046\/j.1469-1809.2002.00125.x","article-title":"Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design","volume":"66","author":"Barratt","year":"2002","journal-title":"Ann. Hum. 
Genet"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1093\/bioinformatics\/bth457","article-title":"Haploview: analysis and visualization of LD and haplotype maps","volume":"21","author":"Barrett","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"e129","DOI":"10.1093\/nar\/gkl700","article-title":"Using DNA pools for genotyping trios","volume":"34","author":"Beckman","year":"2006","journal-title":"Nucleic Acid Res."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"446","DOI":"10.1038\/nature02623","article-title":"Mapping complex disease loci in whole-genome association studies","volume":"429","author":"Carlson","year":"2004","journal-title":"Nature"},{"key":"2023041208255854100_","article-title":"Transferability of tag SNPs to capture common genetic variation in DNA repair genes across multiple populations","author":"de Bakker","year":"2006","journal-title":"Pac. Symp. Biocomput"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1101\/gr.10.2.258","article-title":"High-throughput SNP allele-frequency determination in pooled DNA samples by kinetic PCR","volume":"10","author":"Germer","year":"2000","journal-title":"Genome Res."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"1842","DOI":"10.1093\/bioinformatics\/bth149","article-title":"Haplotype reconstruction from genotype data using imperfect phylogeny","volume":"20","author":"Eskin","year":"2004","journal-title":"Bioinformatics"},{"key":"2023041208255854100_","first-page":"10","article-title":"Perfect phylogeny and haplotype assignment. 
In","author":"Halperin","year":"2004"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1186\/1471-2105-4-14","article-title":"SNP haplotype tagging from DNA pools of two individuals","volume":"4","author":"Hoh","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"384","DOI":"10.1086\/346116","article-title":"Estimation of haplotype frequencies, linkage-disequilibrium measures, and combination of haplotype copies in each pool by use of pooled DNA data","volume":"72","author":"Ito","year":"2003","journal-title":"Am J. Hum. Genet."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/0304-4149(82)90011-4","article-title":"The coalescent","volume":"13","author":"Kingman","year":"1982","journal-title":"Stoch. Proc. Appl."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"e74","DOI":"10.1093\/nar\/gnf070","article-title":"SNP genotyping on pooled DNAs: comparison of genotyping technologies and a semi automated method for data storage and analysis","volume":"30","author":"Le Hellard","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","article-title":"Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data","volume":"165","author":"Li","year":"2003","journal-title":"Genet."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1002\/gepi.10200","article-title":"On the advantage of haplotype analysis in the presence of multiple disease susceptibility alleles","volume":"23","author":"Morris","year":"2002","journal-title":"Genet. 
Epidemiol."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1007\/s00439-002-0706-6","article-title":"Universal, robust, highly quantitative snp allele frequency measurement in DNA pools","volume":"110","author":"Norton","year":"2002","journal-title":"Hum. Genet."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"1719","DOI":"10.1126\/science.1065573","article-title":"Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21","volume":"294","author":"Patil","year":"2001","journal-title":"Science"},{"key":"2023041208255854100_","first-page":"237","article-title":"Resolution of haplotypes and haplotype frequencies from SNP genotypes of pooled samples. In","author":"Pe\u2019er","year":"2003"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"862","DOI":"10.1038\/nrg930","article-title":"DNA pooling: a tool for large-scale association studies","volume":"3","author":"Sham","year":"2002","journal-title":"Nat. Rev. Genet."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1086\/319501","article-title":"A new statistical method for haplotype reconstruction from population data","volume":"68","author":"Stephens","year":"2003","journal-title":"Am. J. Hum. Genet."},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1038\/nature02168","article-title":"The International HapMap Consortium (2003) The international HapMap project","volume":"426","journal-title":"Nature"},{"key":"2023041208255854100_","doi-asserted-by":"crossref","first-page":"7225","DOI":"10.1073\/pnas.1237858100","article-title":"Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA","volume":"100","author":"Yang","year":"2003","journal-title":"Proc. 
Natl Acad Sci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/3048\/49857311\/bioinformatics_23_22_3048.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/3048\/49857311\/bioinformatics_23_22_3048.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T07:04:13Z","timestamp":1684047853000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/22\/3048\/207826"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9,25]]},"references-count":22,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2007,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm435","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,11,15]]},"published":{"date-parts":[[2007,9,25]]}}}