{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T16:32:19Z","timestamp":1771259539098,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2315,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing.<\/jats:p>\n               <jats:p>Results: We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80\u201385% of SNPs identified using individual sequencing while achieving a low false discovery rate (3\u20135%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP.<\/jats:p>\n               <jats:p>Availability: Implementation of this method is available at http:\/\/polymorphism.scripps.edu\/\u223cvbansal\/software\/CRISP\/<\/jats:p>\n               <jats:p>Contact: \u00a0vbansal@scripps.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq214","type":"journal-article","created":{"date-parts":[[2010,6,7]],"date-time":"2010-06-07T07:28:13Z","timestamp":1275895693000},"page":"i318-i324","source":"Crossref","is-referenced-by-count":152,"title":["A statistical method for the detection of variants from next-generation resequencing of DNA pools"],"prefix":"10.1093","volume":"26","author":[{"given":"Vikas","family":"Bansal","sequence":"first","affiliation":[{"name":"Scripps Genomic Medicine, Scripps Translational Science Institute, La Jolla, CA 92037, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,6,1]]},"reference":[{"key":"2023012508045912100_B1","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1101\/gr.100040.109","article-title":"Accurate detection and genotyping of SNPs utilizing population sequencing data","volume":"10","author":"Bansal","year":"2010","journal-title":"Genome Res."},{"key":"2023012508045912100_B2","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1038\/nature07517","article-title":"Accurate whole human genome sequencing using reversible terminator chemistry","volume":"456","author":"Bentley","year":"2008","journal-title":"Nature"},{"key":"2023012508045912100_B3","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1214\/aoms\/1177729330","article-title":"A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations","volume":"23","author":"Chernoff","year":"1952","journal-title":"Ann. Math. Stat."},{"key":"2023012508045912100_B4","doi-asserted-by":"crossref","first-page":"e105","DOI":"10.1093\/nar\/gkn425","article-title":"Substantial biases in ultra-short read data sets from high-throughput DNA sequencing","volume":"36","author":"Dohm","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012508045912100_B5","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1038\/nmeth.1307","article-title":"Quantification of rare allelic variants from pooled genomic DNA","volume":"6","author":"Druley","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012508045912100_B6","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1093\/bioinformatics\/btn173","article-title":"Optimal pooling for genome re-sequencing with ultra-high-throughput short-read technologies","volume":"24","author":"Hajirasouliha","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012508045912100_B7","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1038\/ejhg.2008.182","article-title":"SNP frequency estimation using massively parallel sequencing of pooled DNA","volume":"17","author":"Ingman","year":"2009","journal-title":"Eur. J. Hum. Genet."},{"key":"2023012508045912100_B8","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1038\/nature08211","article-title":"A highly annotated whole-genome sequence of a Korean individual","volume":"460","author":"Kim","year":"2009","journal-title":"Nature"},{"key":"2023012508045912100_B9","doi-asserted-by":"crossref","first-page":"2283","DOI":"10.1093\/bioinformatics\/btp373","article-title":"VarScan: variant detection in massively parallel sequencing of individual and pooled samples","volume":"25","author":"Koboldt","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508045912100_B10","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012508045912100_B11","doi-asserted-by":"crossref","first-page":"e254","DOI":"10.1371\/journal.pbio.0050254","article-title":"The diploid genome sequence of an individual human","volume":"5","author":"Levy","year":"2007","journal-title":"PLoS Biol."},{"key":"2023012508045912100_B12","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012508045912100_B13","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508045912100_B14","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The Sequence Alignment\/Map format and SAMtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508045912100_B15","doi-asserted-by":"crossref","first-page":"1966","DOI":"10.1093\/bioinformatics\/btp336","article-title":"Soap2: an improved ultrafast tool for short read alignment","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508045912100_B16","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1038\/456018a","article-title":"Personal genomes: the case of the missing heritability","volume":"456","author":"Maher","year":"2008","journal-title":"Nature"},{"key":"2023012508045912100_B17","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"key":"2023012508045912100_B18","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1080\/03610918008812182","article-title":"A network algorithm for the exact treatment of the 2\u00d7 k contingency table","volume":"9","author":"Mehta","year":"1980","journal-title":"Commun. Stat. Simul. Comput."},{"key":"2023012508045912100_B19","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1126\/science.1167728","article-title":"Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes","volume":"324","author":"Nejentsev","year":"2009","journal-title":"Science"},{"key":"2023012508045912100_B20","doi-asserted-by":"crossref","first-page":"1703","DOI":"10.1002\/humu.21122","article-title":"Deep sequencing to reveal new variants in pooled DNA samples","volume":"30","author":"Out","year":"2009","journal-title":"Hum. Mutat."},{"key":"2023012508045912100_B21","doi-asserted-by":"crossref","first-page":"1254","DOI":"10.1101\/gr.088559.108","article-title":"Overlapping pools for high-throughput targeted resequencing","volume":"19","author":"Prabhu","year":"2009","journal-title":"Genome Res."},{"key":"2023012508045912100_B22","doi-asserted-by":"crossref","first-page":"e1000386","DOI":"10.1371\/journal.pcbi.1000386","article-title":"SHRiMP: accurate mapping of short color-space reads","volume":"5","author":"Rumble","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012508045912100_B23","doi-asserted-by":"crossref","first-page":"862","DOI":"10.1038\/nrg930","article-title":"DNA Pooling: a tool for large-scale association studies","volume":"3","author":"Sham","year":"2002","journal-title":"Nat. Rev. Genet."},{"key":"2023012508045912100_B24","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1038\/nbt0108-65","article-title":"Genome resequencing and genetic variation","volume":"26","author":"Stratton","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023012508045912100_B25","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1038\/nature07484","article-title":"The diploid genome sequence of an asian individual","volume":"456","author":"Wang","year":"2008","journal-title":"Nature"},{"key":"2023012508045912100_B26","doi-asserted-by":"crossref","first-page":"872","DOI":"10.1038\/nature06884","article-title":"The complete genome of an individual by massively parallel DNA sequencing","volume":"452","author":"Wheeler","year":"2008","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i318\/48857868\/bioinformatics_26_12_i318.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i318\/48857868\/bioinformatics_26_12_i318.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T08:08:35Z","timestamp":1674634115000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/12\/i318\/285976"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,6,1]]},"references-count":26,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2010,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq214","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,6,15]]},"published":{"date-parts":[[2010,6,1]]}}}