{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T19:55:12Z","timestamp":1778529312981,"version":"3.51.4"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation : In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations.<\/jats:p>\n               <jats:p>Results : A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is imbedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. 
A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%.<\/jats:p>\n               <jats:p>Availability : The VirVarSeq is available, together with a user\u2019s guide and test data, at sourceforge: http:\/\/sourceforge.net\/projects\/virtools\/?source=directory<\/jats:p>\n               <jats:p>Contact : bie.verbist@ugent.be<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu587","type":"journal-article","created":{"date-parts":[[2014,9,2]],"date-time":"2014-09-02T00:17:48Z","timestamp":1409617068000},"page":"94-101","source":"Crossref","is-referenced-by-count":53,"title":["VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering"],"prefix":"10.1093","volume":"31","author":[{"given":"Bie M.P.","family":"Verbist","sequence":"first","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kim","family":"Thys","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied 
Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joke","family":"Reumers","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yves","family":"Wetzels","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Koen","family":"Van der Borght","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, 
Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Willem","family":"Talloen","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeroen","family":"Aerssens","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lieven","family":"Clement","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research 
Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olivier","family":"Thas","sequence":"additional","affiliation":[{"name":"1 Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent, 2 Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse, 3 Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium and 4 University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2014,8,31]]},"reference":[{"key":"2023020116152307300_btu587-B1","doi-asserted-by":"crossref","first-page":"329","DOI":"10.3389\/fmicb.2012.00329","article-title":"Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data","volume":"3","author":"Beerenwinkel","year":"2012","journal-title":"Front. Microbiol."},{"key":"2023020116152307300_btu587-B2","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1016\/j.coviro.2011.07.008","article-title":"Ultra-deep sequencing for the analysis of viral populations","volume":"1","author":"Beerenwinkel","year":"2011","journal-title":"Curr. Opin. 
Virol."},{"key":"2023020116152307300_btu587-B3","doi-asserted-by":"crossref","first-page":"e19461","DOI":"10.1371\/journal.pone.0019461","article-title":"Added value of deep sequencing relative to population sequencing in heavily pre-treated HIV-1-infected subjects","volume":"6","author":"Codoner","year":"2011","journal-title":"PLoS One"},{"key":"2023020116152307300_btu587-B4","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1186\/1471-2105-13-303","article-title":"Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model","volume":"13","author":"De Beuf","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023020116152307300_btu587-B5","doi-asserted-by":"crossref","first-page":"1871","DOI":"10.1093\/infdis\/jiu340","article-title":"Deep sequencing of the HCV NS3\/4A region confirms low prevalence of telaprevir-resistant variants both at baseline and end of study","volume":"210","author":"Dierynck","year":"2014","journal-title":"J. Infect. Dis."},{"key":"2023020116152307300_btu587-B6","doi-asserted-by":"crossref","first-page":"e105","DOI":"10.1093\/nar\/gkn425","article-title":"Substantial biases in ultra-short read data sets from high-throughput DNA sequencing","volume":"36","author":"Dohm","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023020116152307300_btu587-B7","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1128\/MMBR.05023-11","article-title":"Viral quasispecies evolution","volume":"76","author":"Domingo","year":"2012","journal-title":"Microbiol. Mol. Biol. Rev."},{"key":"2023020116152307300_btu587-B8","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1101\/gr.8.3.186","article-title":"Base-calling of automated sequencer traces using phred. II. 
Error probabilities","volume":"8","author":"Ewing","year":"1998","journal-title":"Genome Res."},{"key":"2023020116152307300_btu587-B9","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1086\/655397","article-title":"Minority variants of drug-resistant HIV","volume":"202","author":"Gianella","year":"2010","journal-title":"J. Infect. Dis."},{"key":"2023020116152307300_btu587-B10","doi-asserted-by":"crossref","first-page":"e1002529","DOI":"10.1371\/journal.ppat.1002529","article-title":"Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection","volume":"8","author":"Henn","year":"2012","journal-title":"PLoS Pathog."},{"key":"2023020116152307300_btu587-B11","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020116152307300_btu587-B12","doi-asserted-by":"crossref","first-page":"e1002417","DOI":"10.1371\/journal.pcbi.1002417","article-title":"Highly sensitive and specific detection of rare variants in mixed viral populations from massively parallel sequence data","volume":"8","author":"Macalalad","year":"2012","journal-title":"PLoS Comput. 
Biol."},{"key":"2023020116152307300_btu587-B13","doi-asserted-by":"crossref","first-page":"571","DOI":"10.2307\/2531869","article-title":"Fitting mixture models to grouped and truncated data via the EM algorithm","volume":"44","author":"McLachlan","year":"1988","journal-title":"Biometrics"},{"key":"2023020116152307300_btu587-B14","doi-asserted-by":"crossref","first-page":"R112","DOI":"10.1186\/gb-2011-12-11-r112","article-title":"Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems","volume":"12","author":"Minoche","year":"2011","journal-title":"Genome Biol."},{"key":"2023020116152307300_btu587-B15","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1038\/nrg2986","article-title":"Genotype and SNP calling from next-generation sequencing data","volume":"12","author":"Nielsen","year":"2011","journal-title":"Nat. Rev. Genet."},{"key":"2023020116152307300_btu587-B16","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1186\/1743-422X-10-350","article-title":"Stable HIV-1 integrase diversity during initial HIV-1 RNA decay suggests complete blockade of plasma HIV-1 replication by effective raltegravir-containing salvage therapy","volume":"10","author":"Noguera-Julian","year":"2013","journal-title":"Virol. J."},{"issue":"Pt 10","key":"2023020116152307300_btu587-B17","first-page":"2152","article-title":"Genome-wide patterns of intrahuman dengue virus diversity reveal associations with viral phylogenetic clade and interhost diversity","volume":"93","author":"Parameswaran","year":"2012","journal-title":"J. Virol."},{"key":"2023020116152307300_btu587-B18","doi-asserted-by":"crossref","first-page":"2837","DOI":"10.1038\/srep02837","article-title":"Empirical validation of viral quasispecies assembly algorithms: state-of-the-art and challenges","volume":"3","author":"Prosperi","year":"2013","journal-title":"Sci. 
Rep."},{"key":"2023020116152307300_btu587-B19","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1038\/nbt.2053","article-title":"Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing","volume":"30","author":"Reumers","year":"2012","journal-title":"Nat. Biotechnol."},{"key":"2023020116152307300_btu587-B20","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1016\/j.antiviral.2014.02.011","article-title":"Antiviral therapy of hepatitis C in 2014: Do we need resistance testing?","volume":"105","author":"Schneider","year":"2014","journal-title":"Antiviral Res."},{"key":"2023020116152307300_btu587-B21","first-page":"431","article-title":"Benchmarking of viral haplotype reconstruction programmes: an overview of the capacities and limitations of currently available programmes","volume":"15","author":"Schirmer","year":"2014","journal-title":"Brief. Bioinform."},{"key":"2023020116152307300_btu587-B22","article-title":"Evaluating the use of the Illumina deep sequencing platform for the detection of minority variants in HIV and HCV","author":"Thys","year":"2014","journal-title":"J. Virol. 
Methods"},{"key":"2023020116152307300_btu587-B23","doi-asserted-by":"crossref","first-page":"p1","DOI":"10.1371\/journal.pone.0086771","article-title":"Prevalence and evolution of low frequency HIV drug resistance mutations detected by ultra deep sequencing in patients experiencing first line antiretroviral therapy failure","volume":"9","author":"Vandenhende","year":"2014","journal-title":"PLoS One"},{"key":"2023020116152307300_btu587-B24","doi-asserted-by":"crossref","first-page":"11189","DOI":"10.1093\/nar\/gks918","article-title":"LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets","volume":"40","author":"Wilm","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023020116152307300_btu587-B25","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1186\/1471-2105-12-119","article-title":"ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing data","volume":"12","author":"Zagordi","year":"2011","journal-title":"BMC 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/1\/94\/49010982\/bioinformatics_31_1_94.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/1\/94\/49010982\/bioinformatics_31_1_94.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:25:19Z","timestamp":1675297519000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/1\/94\/2365438"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,31]]},"references-count":25,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2015,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu587","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,1,1]]},"published":{"date-parts":[[2014,8,31]]}}}