{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T17:41:02Z","timestamp":1774892462752,"version":"3.50.1"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2021,1,20]],"date-time":"2021-01-20T00:00:00Z","timestamp":1611100800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012390","name":"SystemsX.ch","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100012390","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>High-throughput sequencing technologies are used increasingly not only in viral genomics research but also in clinical surveillance and diagnostics. These technologies facilitate the assessment of the genetic diversity in intra-host virus populations, which affects transmission, virulence and pathogenesis of viral infections. However, there are two major challenges in analysing viral diversity. First, amplification and sequencing errors confound the identification of true biological variants, and second, the large data volumes represent computational limitations.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To support viral high-throughput sequencing studies, we developed V-pipe, a bioinformatics pipeline combining various state-of-the-art statistical models and computational tools for automated end-to-end analyses of raw sequencing reads. 
V-pipe supports quality control, read mapping and alignment, low-frequency mutation calling, and inference of viral haplotypes. For generating high-quality read alignments, we developed a novel method, called ngshmmalign, based on profile hidden Markov models and tailored to small and highly diverse viral genomes. V-pipe also includes benchmarking functionality providing a standardized environment for comparative evaluations of different pipeline configurations. We demonstrate this capability by assessing the impact of three different read aligners (Bowtie 2, BWA MEM, ngshmmalign) and two different variant callers (LoFreq, ShoRAH) on the performance of calling single-nucleotide variants in intra-host virus populations. V-pipe supports various pipeline configurations and is implemented in a modular fashion to facilitate adaptations to the continuously changing technology landscape.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>V-pipe is freely available at https:\/\/github.com\/cbg-ethz\/V-pipe.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab015","type":"journal-article","created":{"date-parts":[[2021,1,8]],"date-time":"2021-01-08T18:49:12Z","timestamp":1610131752000},"page":"1673-1680","source":"Crossref","is-referenced-by-count":81,"title":["V-pipe: a computational pipeline for assessing viral genetic diversity from high-throughput data"],"prefix":"10.1093","volume":"37","author":[{"given":"Susana","family":"Posada-C\u00e9spedes","sequence":"first","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , 4058 Basel, Switzerland"},{"name":"SIB 
Swiss Institute of Bioinformatics , 4058 Basel, Switzerland"}]},{"given":"David","family":"Seifert","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , 4058 Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics , 4058 Basel, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7561-0810","authenticated-orcid":false,"given":"Ivan","family":"Topolsky","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , 4058 Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics , 4058 Basel, Switzerland"}]},{"given":"Kim Philipp","family":"Jablonski","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , 4058 Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics , 4058 Basel, Switzerland"}]},{"given":"Karin J","family":"Metzner","sequence":"additional","affiliation":[{"name":"Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich , 8091 Zurich, Switzerland"},{"name":"Institute of Medical Virology, University of Zurich , 8091 Zurich, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0573-6119","authenticated-orcid":false,"given":"Niko","family":"Beerenwinkel","sequence":"additional","affiliation":[{"name":"Department of Biosystems Science and Engineering, ETH Zurich , 4058 Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics , 4058 Basel, Switzerland"}]}],"member":"286","published-online":{"date-parts":[[2021,1,20]]},"reference":[{"key":"2023051709553585200_btab015-B1","volume-title":"FastQC a Quality Control Tool for High Throughput Sequence Data","author":"Andrews","year":"2019"},{"key":"2023051709553585200_btab015-B2","doi-asserted-by":"crossref","first-page":"e1001022","DOI":"10.1371\/journal.pcbi.1001022","article-title":"The evolutionary analysis of emerging low frequency HIV-1 CXCR4 using 
variants through time\u2013an ultra-deep approach","volume":"6","author":"Archer","year":"2010","journal-title":"PLoS Comput. Biol"},{"key":"2023051709553585200_btab015-B3","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1101\/gr.215038.116","article-title":"De novo assembly of viral quasispecies using overlap graphs","volume":"27","author":"Baaijens","year":"2017","journal-title":"Genome Res"},{"key":"2023051709553585200_btab015-B4","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.jcv.2013.03.003","article-title":"Next-generation sequencing technologies in diagnostic virology","volume":"58","author":"Barzon","year":"2013","journal-title":"J. Clin. Virol"},{"key":"2023051709553585200_btab015-B5","doi-asserted-by":"crossref","first-page":"329","DOI":"10.3389\/fmicb.2012.00329","article-title":"Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data","volume":"3","author":"Beerenwinkel","year":"2012","journal-title":"Front Microbiol"},{"key":"2023051709553585200_btab015-B6","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1111\/1469-0691.12056","article-title":"Next-generation sequencing technology in clinical virology","volume":"19","author":"Capobianchi","year":"2013","journal-title":"Clin. Microbiol. 
Infect"},{"key":"2023051709553585200_btab015-B7","doi-asserted-by":"crossref","first-page":"e115","DOI":"10.1093\/nar\/gku537","article-title":"Full-length haplotype reconstruction to infer the structure of heterogeneous virus populations","volume":"42","author":"Di Giallonardo","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051709553585200_btab015-B8","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.virusres.2004.11.003","article-title":"Quasispecies dynamics and RNA virus extinction","volume":"107","author":"Domingo","year":"2005","journal-title":"Virus Res"},{"key":"2023051709553585200_btab015-B9","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/nrg2323","article-title":"Rates of evolutionary change in viruses: patterns and determinants","volume":"9","author":"Duffy","year":"2008","journal-title":"Nat. Rev. Genet"},{"key":"2023051709553585200_btab015-B10","doi-asserted-by":"crossref","first-page":"e1006235","DOI":"10.1371\/journal.ppat.1006235","article-title":"Extra-epitopic hepatitis C virus polymorphisms confer resistance to broadly neutralizing antibodies by modulating binding to scavenger receptor B1","volume":"13","author":"El-Diwany","year":"2017","journal-title":"PLoS Pathog"},{"key":"2023051709553585200_btab015-B11","doi-asserted-by":"crossref","first-page":"104277","DOI":"10.1016\/j.meegid.2020.104277","article-title":"Evaluation of haplotype callers for next-generation sequencing of viruses","volume":"82","author":"Eliseev","year":"2020","journal-title":"Infect. Genet. 
Evol"},{"key":"2023051709553585200_btab015-B12","doi-asserted-by":"crossref","first-page":"2354","DOI":"10.1126\/science.1070441","article-title":"Diversity considerations in HIV-1 vaccine selection","volume":"296","author":"Gaschen","year":"2002","journal-title":"Science"},{"key":"2023051709553585200_btab015-B13","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1038\/nrg.2016.49","article-title":"Coming of age: ten years of next-generation sequencing technologies","volume":"17","author":"Goodwin","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023051709553585200_btab015-B14","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.virol.2014.09.019","article-title":"Development of a virus detection and discovery pipeline using next generation sequencing","volume":"471\u2013473","author":"Ho","year":"2014","journal-title":"Virology"},{"key":"2023051709553585200_btab015-B15","doi-asserted-by":"crossref","first-page":"2029","DOI":"10.1093\/bioinformatics\/bty919","article-title":"Measurement error and variant-calling in deep Illumina sequencing of HIV","volume":"35","author":"Howison","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B16","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B17","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/j.jviromet.2016.11.008","article-title":"MinVar: a rapid and versatile tool for HIV-1 drug resistance genotyping by deep sequencing","volume":"240","author":"Huber","year":"2017","journal-title":"J. Virol. 
Methods"},{"key":"2023051709553585200_btab015-B18","doi-asserted-by":"crossref","first-page":"886","DOI":"10.1093\/bioinformatics\/btu754","article-title":"ViQuaS: an improved reconstruction pipeline for viral quasispecies spectra generated by next-generation sequencing","volume":"31","author":"Jayasundara","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B19","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"MAFFT multiple sequence alignment software version 7: improvements in Performance and Usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023051709553585200_btab015-B20","doi-asserted-by":"crossref","first-page":"2520","DOI":"10.1093\/bioinformatics\/bts480","article-title":"Snakemake \u2013 a scalable bioinformatics workflow engine","volume":"28","author":"K\u00f6ster","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B21","doi-asserted-by":"crossref","first-page":"e10256","DOI":"10.1371\/journal.pone.0010256","article-title":"Characterization of quasispecies of pandemic 2009 influenza A virus (A\/H1N1\/2009) by de novo sequencing using a next-generation DNA sequencer","volume":"5","author":"Kuroda","year":"2010","journal-title":"PLoS One"},{"key":"2023051709553585200_btab015-B22","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023051709553585200_btab015-B23","doi-asserted-by":"crossref","first-page":"e1001005","DOI":"10.1371\/journal.ppat.1001005","article-title":"Quasispecies theory and the behavior of RNA viruses","volume":"6","author":"Lauring","year":"2010","journal-title":"PLoS Pathog"},{"key":"2023051709553585200_btab015-B24","doi-asserted-by":"crossref","first-page":"1634","DOI":"10.1038\/s41598-020-58544-z","article-title":"Performance comparison of next generation sequencing analysis pipelines for HIV-1 drug resistance testing","volume":"10","author":"Lee","year":"2020","journal-title":"Sci Rep"},{"key":"2023051709553585200_btab015-B25","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jtbi.2009.07.038","article-title":"Modeling sequence evolution in acute HIV-1 infection","volume":"261","author":"Lee","year":"2009","journal-title":"J. Theor. Biol"},{"key":"2023051709553585200_btab015-B26","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","author":"Li","year":"2013","journal-title":"arXiv:1303.3997"},{"key":"2023051709553585200_btab015-B27","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B28","doi-asserted-by":"crossref","first-page":"23774","DOI":"10.1038\/srep23774","article-title":"VIP: an integrated pipeline for metagenomics of virus identification and discovery","volume":"6","author":"Li","year":"2016","journal-title":"Sci. 
Rep"},{"key":"2023051709553585200_btab015-B29","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1093\/bioinformatics\/btx702","article-title":"ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads","volume":"34","author":"Maarala","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B30","doi-asserted-by":"crossref","first-page":"i329","DOI":"10.1093\/bioinformatics\/btu295","article-title":"Accurate viral population assembly from ultra-deep sequencing data","volume":"30","author":"Mangul","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B31","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1016\/j.antiviral.2018.07.020","article-title":"Comparison of antiviral resistance across acute and chronic viral infections","volume":"158","author":"Mason","year":"2018","journal-title":"Antiviral Res"},{"key":"2023051709553585200_btab015-B32","doi-asserted-by":"crossref","first-page":"501","DOI":"10.1186\/1471-2164-14-501","article-title":"Accurate single nucleotide variant detection in viral populations by combining probabilistic clustering with a statistical test of strand bias","volume":"14","author":"McElroy","year":"2013","journal-title":"BMC Genomics"},{"key":"2023051709553585200_btab015-B33","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1101\/gr.171934.113","article-title":"A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples","volume":"24","author":"Naccache","year":"2014","journal-title":"Genome Res"},{"key":"2023051709553585200_btab015-B34","doi-asserted-by":"crossref","first-page":"963","DOI":"10.1126\/science.1683006","article-title":"Antigenic diversity thresholds and the development of 
AIDS","volume":"254","author":"Nowak","year":"1991","journal-title":"Science"},{"key":"2023051709553585200_btab015-B35","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023051709553585200_btab015-B36","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1038\/ng.3479","article-title":"Quantifying influenza virus diversity and transmission in humans","volume":"48","author":"Poon","year":"2016","journal-title":"Nat. Genet"},{"key":"2023051709553585200_btab015-B37","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.virusres.2016.09.016","article-title":"Recent advances in inferring viral diversity from high-throughput sequencing data","volume":"239","author":"Posada-C\u00e9spedes","year":"2017","journal-title":"Virus Res"},{"key":"2023051709553585200_btab015-B38","doi-asserted-by":"crossref","first-page":"O157","DOI":"10.1111\/1469-0691.12367","article-title":"Quasispecies tropism and compartmentalization in gut and peripheral blood during early and chronic phases of HIV-1 infection: possible correlation with immune activation markers","volume":"20","author":"Rozera","year":"2014","journal-title":"Clin. Microbiol. Infect"},{"key":"2023051709553585200_btab015-B39","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1093\/bioinformatics\/btr026","article-title":"Quality control and preprocessing of metagenomic datasets","volume":"27","author":"Schmieder","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051709553585200_btab015-B40","doi-asserted-by":"crossref","first-page":"8970","DOI":"10.1038\/s41598-019-45328-3","article-title":"A MiSeq-HyDRA platform for enhanced HIV drug resistance genotyping and surveillance","volume":"9","author":"Taylor","year":"2019","journal-title":"Sci. 
Rep"},{"key":"2023051709553585200_btab015-B41","doi-asserted-by":"crossref","first-page":"e1003515","DOI":"10.1371\/journal.pcbi.1003515","article-title":"Viral quasispecies assembly via maximal clique enumeration","volume":"10","author":"T\u00f6pfer","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023051709553585200_btab015-B42","doi-asserted-by":"crossref","first-page":"e5683","DOI":"10.1371\/journal.pone.0005683","article-title":"Quantitative deep sequencing reveals dynamic HIV-1 escape and large population shifts during CCR5 antagonist therapy in vivo","volume":"4","author":"Tsibris","year":"2009","journal-title":"PLoS One"},{"key":"2023051709553585200_btab015-B43","doi-asserted-by":"publisher","first-page":"1545","DOI":"10.1101\/gr.247064.118","article-title":"Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis","volume":"29","author":"Viehweger","year":"2019","journal-title":"Genome Research"},{"key":"2023051709553585200_btab015-B44","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1038\/nature04388","article-title":"Quasispecies diversity determines pathogenesis through cooperative interactions in a viral population","volume":"439","author":"Vignuzzi","year":"2006","journal-title":"Nature"},{"key":"2023051709553585200_btab015-B45","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/s13742-015-0060-y","article-title":"VirAmp: a galaxy-based viral genome assembly pipeline","volume":"4","author":"Wan","year":"2015","journal-title":"Gigascience"},{"key":"2023051709553585200_btab015-B46","doi-asserted-by":"crossref","first-page":"11189","DOI":"10.1093\/nar\/gks918","article-title":"LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets","volume":"40","author":"Wilm","year":"2012","journal-title":"Nucleic Acids 
Res"},{"key":"2023051709553585200_btab015-B47","doi-asserted-by":"crossref","first-page":"vey007","DOI":"10.1093\/ve\/vey007","article-title":"Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver","volume":"4","author":"Wymant","year":"2018","journal-title":"Virus Evol"},{"key":"2023051709553585200_btab015-B48","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1186\/1471-2164-13-475","article-title":"De novo assembly of highly diverse viral populations","volume":"13","author":"Yang","year":"2012","journal-title":"BMC Genomics"},{"key":"2023051709553585200_btab015-B49","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1186\/1471-2105-12-119","article-title":"ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing data","volume":"12","author":"Zagordi","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051709553585200_btab015-B50","doi-asserted-by":"crossref","first-page":"e11282","DOI":"10.7554\/eLife.11282","article-title":"Population genomics of intrapatient HIV-1 evolution","volume":"4","author":"Zanini","year":"2015","journal-title":"eLife"},{"key":"2023051709553585200_btab015-B51","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.virol.2017.01.005","article-title":"VirusSeeker, a computational pipeline for virus discovery and virome composition analysis","volume":"503","author":"Zhao","year":"2017","journal-title":"Virology"},{"key":"2023051709553585200_btab015-B52","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1016\/j.virol.2016.10.017","article-title":"VirusDetect: an automated pipeline for efficient virus discovery using deep sequencing of small 
RNAs","volume":"500","author":"Zheng","year":"2017","journal-title":"Virology"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab015\/36179981\/btab015.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1673\/50361329\/btab015.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1673\/50361329\/btab015.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T06:39:45Z","timestamp":1684305585000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/12\/1673\/6104816"}},"subtitle":[],"editor":[{"given":"Jinbo","family":"Xu","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1,20]]},"references-count":52,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,7,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab015","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.06.09.142919","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,15]]},"published":{"date-parts":[[2021,1,20]]}}}