{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,18]],"date-time":"2024-07-18T09:49:49Z","timestamp":1721296189708},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Detecting single-nucleotide polymorphism (SNP) in pooled sequencing data is more challenging than in individual sequencing because of sampling variations across pools. To effectively differentiate SNP signal from sequencing error, appropriate estimation of the sequencing error is necessary. In this article, we propose an empirical Bayes mixture (EBM) model for SNP detection and allele frequency estimation in pooled sequencing data.<\/jats:p><jats:p>Results: The proposed model reliably learns the error distribution by pooling information across pools and genomic positions. In addition, the proposed EBM model builds in characteristics unique to the pooled sequencing data, boosting the sensitivity of SNP detection. For large-scale inference in SNP detection, the EBM model provides a flexible and robust way for estimation and control of local false discovery rate. 
We demonstrate the performance of the proposed method through simulation studies and real data application.<\/jats:p><jats:p>Availability: Implementation of this method is available at https:\/\/sites.google.com\/site\/zhouby98<\/jats:p><jats:p>Contact: \u00a0baiyu.zhou@einstein.yu.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts501","type":"journal-article","created":{"date-parts":[[2012,8,23]],"date-time":"2012-08-23T00:15:38Z","timestamp":1345680938000},"page":"2569-2575","source":"Crossref","is-referenced-by-count":6,"title":["An empirical Bayes mixture model for SNP detection in pooled sequencing data"],"prefix":"10.1093","volume":"28","author":[{"given":"Baiyu","family":"Zhou","sequence":"first","affiliation":[{"name":"Department of Epidemiology & Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA"}]}],"member":"286","published-online":{"date-parts":[[2012,8,22]]},"reference":[{"key":"2023012513142379200_bts501-B1","doi-asserted-by":"crossref","first-page":"i318","DOI":"10.1093\/bioinformatics\/btq214","article-title":"A statistical method for the detection of variants from next-generation resequencing of DNA pools","volume":"26","author":"Bansal","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012513142379200_bts501-B2","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1101\/gr.100040.109","article-title":"Accurate detection and genotyping of SNPs utilizing population sequencing data","volume":"20","author":"Bansal","year":"2010","journal-title":"Genome Res."},{"key":"2023012513142379200_bts501-B3","doi-asserted-by":"crossref","first-page":"e18353","DOI":"10.1371\/journal.pone.0018353","article-title":"Efficient and cost effective population resequencing by pooling and in-solution hybridization","volume":"6","author":"Bansal","year":"2011","journal-title":"PLoS 
One"},{"key":"2023012513142379200_bts501-B4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. B."},{"key":"2023012513142379200_bts501-B5","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1038\/ng.f.136","article-title":"Common and rare variants in multifactorial susceptibility to common diseases","volume":"40","author":"Bodmer","year":"2008","journal-title":"Nat Genet."},{"key":"2023012513142379200_bts501-B6","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1214\/07-AOAS138","article-title":"In-season prediction of batting averages: a field test of empirical Bayes and Bayes methodologies","volume":"2","author":"Brown","year":"2008","journal-title":"Ann. Appl. Statist."},{"key":"2023012513142379200_bts501-B7","doi-asserted-by":"crossref","first-page":"1810","DOI":"10.1073\/pnas.0508483103","article-title":"Multiple rare variants in NPC1L1 associated with reduced sterol absorption and plasma low-density lipoprotein levels","volume":"103","author":"Cohen","year":"2006","journal-title":"Proc. Natl. Acad. Sci. USA."},{"key":"2023012513142379200_bts501-B31","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1038\/nmeth.1307","article-title":"Quantification of rare allelic variants from pooled genomic DNA","volume":"6","author":"Druley","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012513142379200_bts501-B8","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1214\/009053606000001460","article-title":"Size, power and false discovery rates","volume":"35","author":"Efron","year":"2007","journal-title":"Ann. 
Statist."},{"key":"2023012513142379200_bts501-B9","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1198\/016214501753382129","article-title":"Empirical Bayes analysis of a microarray experiment","volume":"96","author":"Efron","year":"2001","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012513142379200_bts501-B10","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1080\/01621459.1975.10479864","article-title":"Data analysis using Stein\u2019s estimator and its generalizations","volume":"70","author":"Efron","year":"1975","journal-title":"J. Amer. Stat. Assoc."},{"key":"2023012513142379200_bts501-B11","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1214\/aos\/1015362191","article-title":"Multiple hypotheses testing and expected number of type I errors","volume":"30","author":"Finner","year":"2002","journal-title":"Ann. Stat."},{"key":"2023012513142379200_bts501-B12","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1038\/ng.118","article-title":"Rare independent mutations in renal salt handling genes contribute to blood pressure variation","volume":"40","author":"Ji","year":"2008","journal-title":"Nat. Genet."},{"key":"2023012513142379200_bts501-B13","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1002\/gepi.20501","article-title":"Design of association studies with pooled or un-pooled next-generation sequencing data","volume":"34","author":"Kim","year":"2010","journal-title":"Genet. 
Epidemiol."},{"key":"2023012513142379200_bts501-B14","doi-asserted-by":"crossref","first-page":"2283","DOI":"10.1093\/bioinformatics\/btp373","article-title":"VarScan: variant detection in massively parallel sequencing of individual and pooled samples","volume":"25","author":"Koboldt","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012513142379200_bts501-B15","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012513142379200_bts501-B16","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012513142379200_bts501-B17","doi-asserted-by":"crossref","first-page":"2694","DOI":"10.1093\/bioinformatics\/bth310","article-title":"A mixture model for estimating the local false discovery rate in DNA microarray analysis","volume":"20","author":"Liao","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012513142379200_bts501-B18","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"key":"2023012513142379200_bts501-B19","doi-asserted-by":"crossref","first-page":"2803","DOI":"10.1093\/bioinformatics\/btq526","article-title":"SeqEM: an adaptive genotype-calling approach for next-generation sequencing studies","volume":"26","author":"Martin","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012513142379200_bts501-B20","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1093\/biomet\/80.2.267","article-title":"Maximum likelihood estimation via the ECM 
algorithm: a general framework","volume":"80","author":"Meng","year":"1993","journal-title":"Biometrika"},{"key":"2023012513142379200_bts501-B21","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1214\/09-AOAS276","article-title":"An empirical Bayes mixture method for effect size and false discovery rate estimation","volume":"4","author":"Muralidharan","year":"2010","journal-title":"Ann. Appl. Stat."},{"key":"2023012513142379200_bts501-B22","article-title":"A cross-sample statistical model for SNP detection in short-read sequencing data","author":"Muralidharan","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012513142379200_bts501-B23","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1126\/science.1167728","article-title":"Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes","volume":"324","author":"Nejentsev","year":"2009","journal-title":"Science"},{"key":"2023012513142379200_bts501-B24","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/s10142-003-0085-7","article-title":"A mixture model approach to detecting differentially expressed genes with microarray data","volume":"3","author":"Pan","year":"2003","journal-title":"Funct. Integr. Genomics"},{"key":"2023012513142379200_bts501-B25","doi-asserted-by":"crossref","first-page":"1066","DOI":"10.1038\/ng.952","article-title":"Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease","volume":"43","author":"Rivas","year":"2011","journal-title":"Nat. Genet."},{"key":"2023012513142379200_bts501-B26","first-page":"157","article-title":"An empirical Bayes approach to statistics","volume-title":"Proc. Third Berkeley Sympos. Math. Statist. Probab. 
1","author":"Robbins","year":"1954"},{"key":"2023012513142379200_bts501-B27","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1111\/1467-9868.00346","article-title":"A direct approach to false discovery rates","volume":"64","author":"Storey","year":"2002","journal-title":"J. R. Stat. Soc. B"},{"key":"2023012513142379200_bts501-B28","doi-asserted-by":"crossref","first-page":"492","DOI":"10.1002\/gepi.20502","article-title":"Resequencing of pooled DNA for detecting disease associations with rare variants","volume":"34","author":"Wang","year":"2010","journal-title":"Genet. Epidemiol."},{"key":"2023012513142379200_bts501-B29","doi-asserted-by":"crossref","first-page":"e132","DOI":"10.1093\/nar\/gkr599","article-title":"SNVer: a statistical tool for variant calling in analysis of pooled or individual next-generation sequencing data","volume":"39","author":"Wei","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012513142379200_bts501-B30","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1214\/11-AOAS527","article-title":"Improving sequence-based genotype calls with linkage disequilibrium and pedigree information","volume":"6","author":"Zhou","year":"2012","journal-title":"Ann. Appl. 
Stat."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/20\/2569\/48876037\/bioinformatics_28_20_2569.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/20\/2569\/48876037\/bioinformatics_28_20_2569.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,28]],"date-time":"2024-04-28T16:31:50Z","timestamp":1714321910000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/20\/2569\/205456"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,8,22]]},"references-count":31,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2012,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts501","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,10,15]]},"published":{"date-parts":[[2012,8,22]]}}}