{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T03:42:31Z","timestamp":1771904551809,"version":"3.50.1"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T00:00:00Z","timestamp":1600041600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"European Union\u2019s Horizon 2020","award":["690941"],"award-info":[{"award-number":["690941"]}]},{"name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","award":["TIN2016-78011-C4-1-R"],"award-info":[{"award-number":["TIN2016-78011-C4-1-R"]}]},{"name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","award":["TIN2016-77158-C4-3-R"],"award-info":[{"award-number":["TIN2016-77158-C4-3-R"]}]},{"name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","award":["FPU17\/02742"],"award-info":[{"award-number":["FPU17\/02742"]}]},{"DOI":"10.13039\/501100010801","name":"Xunta de Galicia","doi-asserted-by":"crossref","award":["ED431C 2017\/58"],"award-info":[{"award-number":["ED431C 2017\/58"]}],"id":[{"id":"10.13039\/501100010801","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010801","name":"Xunta de Galicia","doi-asserted-by":"crossref","award":["ED431G\/01"],"award-info":[{"award-number":["ED431G\/01"]}],"id":[{"id":"10.13039\/501100010801","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010801","name":"Xunta de Galicia","doi-asserted-by":"crossref","award":["IN848D-2017-2350417"],"award-info":[{"award-number":["IN848D-2017-2350417"]}],"id":[{"id":"10.13039\/501100010801","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010801","name":"Xunta de Galicia","doi-asserted-by":"crossref","award":["IN852A 
2018\/14"],"award-info":[{"award-number":["IN852A 2018\/14"]}],"id":[{"id":"10.13039\/501100010801","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["308030"],"award-info":[{"award-number":["308030"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["314170"],"award-info":[{"award-number":["314170"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["323233"],"award-info":[{"award-number":["323233"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>RNA viruses exhibit a high mutation rate and thus they exist in infected cells as a population of closely related strains called viral quasispecies. The viral quasispecies assembly problem asks to characterize the quasispecies present in a sample from high-throughput sequencing data. We study the de novo version of the problem, where reference sequences of the quasispecies are not available. Current methods for assembling viral quasispecies are either based on overlap graphs or on de Bruijn graphs. Overlap graph-based methods tend to be accurate but slow, whereas de Bruijn graph-based methods are fast but less accurate.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present viaDBG, which is a fast and accurate de Bruijn graph-based tool for de novo assembly of viral quasispecies. 
We first iteratively correct sequencing errors in the reads, which allows us to use large k-mers in the de Bruijn graph. To incorporate the paired-end information in the graph, we also adapt the paired de Bruijn graph for viral quasispecies assembly. These features enable the use of long-range information in contig construction without compromising the speed of de Bruijn graph-based approaches. Our experimental results show that viaDBG is both accurate and fast, whereas previous methods are either fast or accurate but not both. In particular, viaDBG has comparable or better accuracy than SAVAGE, while being at least nine times faster. Furthermore, the speed of viaDBG is comparable to PEHaplo but viaDBG is able to retrieve also low abundance quasispecies, which are often missed by PEHaplo.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>viaDBG is implemented in C++ and it is publicly available at https:\/\/bitbucket.org\/bfreirec1\/viadbg. All datasets used in this article are publicly available at https:\/\/bitbucket.org\/bfreirec1\/data-viadbg\/.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa782","type":"journal-article","created":{"date-parts":[[2020,9,2]],"date-time":"2020-09-02T19:15:38Z","timestamp":1599074138000},"page":"473-481","source":"Crossref","is-referenced-by-count":15,"title":["Inference of viral quasispecies with a paired de\u00a0Bruijn\u00a0graph"],"prefix":"10.1093","volume":"37","author":[{"given":"Borja","family":"Freire","sequence":"first","affiliation":[{"name":"Department of Computer Science and Information Technologies, Facultade de Inform\u00e1tica, Universidade da Coru\u00f1a , Centro de investigaci\u00f3n CITIC, A Coru\u00f1a, 
Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4616-0774","authenticated-orcid":false,"given":"Susana","family":"Ladra","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Technologies, Facultade de Inform\u00e1tica, Universidade da Coru\u00f1a , Centro de investigaci\u00f3n CITIC, A Coru\u00f1a, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8727-0980","authenticated-orcid":false,"given":"Jose R","family":"Param\u00e1","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Technologies, Facultade de Inform\u00e1tica, Universidade da Coru\u00f1a , Centro de investigaci\u00f3n CITIC, A Coru\u00f1a, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0756-543X","authenticated-orcid":false,"given":"Leena","family":"Salmela","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Helsinki Institute for Information Technology, University of Helsinki , Helsinki, Finland"}]}],"member":"286","published-online":{"date-parts":[[2020,9,14]]},"reference":[{"key":"2023051706074287200_btaa782-B1","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1089\/cmb.2017.0249","article-title":"aBayesQR: a Bayesian method for reconstruction of viral populations characterized by low diversity","volume":"25","author":"Ahn","year":"2018","journal-title":"J. Comput. 
Biol"},{"key":"2023051706074287200_btaa782-B2","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1101\/gr.215038.116","article-title":"De novo assembly of viral quasispecies using overlap graphs","volume":"27","author":"Baaijens","year":"2017","journal-title":"Genome Res"},{"key":"2023051706074287200_btaa782-B3","doi-asserted-by":"crossref","first-page":"5086","DOI":"10.1093\/bioinformatics\/btz443","article-title":"Full-length de novo viral quasispecies assembly through variation graph construction","volume":"35","author":"Baaijens","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B4","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1089\/cmb.2012.0021","article-title":"SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing","volume":"19","author":"Bankevich","year":"2012","journal-title":"J. Comput. Biol"},{"key":"2023051706074287200_btaa782-B5","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1016\/j.ygeno.2017.12.007","article-title":"QSdpR: viral quasispecies reconstruction via correlation clustering","volume":"110","author":"Barik","year":"2018","journal-title":"Genomics"},{"key":"2023051706074287200_btaa782-B6","doi-asserted-by":"crossref","first-page":"2927","DOI":"10.1093\/bioinformatics\/bty202","article-title":"De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B7","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1128\/MMBR.05023-11","article-title":"Viral quasispecies evolution","volume":"76","author":"Domingo","year":"2012","journal-title":"Microbiol. Mol. Biol. 
Rev"},{"key":"2023051706074287200_btaa782-B8","doi-asserted-by":"crossref","first-page":"12204","DOI":"10.1038\/ncomms12204","article-title":"A rhesus macaque model of Asian-lineage Zika virus infection","volume":"7","author":"Dudley","year":"2016","journal-title":"Nat. Commun"},{"key":"2023051706074287200_btaa782-B9","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/nrg2323","article-title":"Rates of evolutionary change in viruses: patterns and determinants","volume":"9","author":"Duffy","year":"2008","journal-title":"Nat. Rev. Genet"},{"key":"2023051706074287200_btaa782-B10","doi-asserted-by":"crossref","first-page":"e115","DOI":"10.1093\/nar\/gku537","article-title":"Full-length haplotype reconstruction to infer the structure of heterogeneous virus populations","volume":"42","author":"Giallonardo","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051706074287200_btaa782-B11","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780199211128.001.0001","volume-title":"The Evolution and Emergence of RNA Viruses","author":"Holmes","year":"2009"},{"key":"2023051706074287200_btaa782-B12","doi-asserted-by":"crossref","first-page":"886","DOI":"10.1093\/bioinformatics\/btu754","article-title":"ViQuaS: an improved reconstruction pipeline for viral quasispecies spectra generated by next-generation sequencing","volume":"31","author":"Jayasundara","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B13","doi-asserted-by":"publisher","DOI":"10.1101\/264242","article-title":"CliqueSNV: scalable reconstruction of intra-host viral populations from NGS reads","author":"Knyazev","year":"2019"},{"key":"2023051706074287200_btaa782-B14","article-title":"Maximum likelihood de novo reconstruction of viral populations using paired end sequencing data","author":"Malhotra","year":"2015","journal-title":"arXiv 
e-Prints"},{"key":"2023051706074287200_btaa782-B15","doi-asserted-by":"crossref","first-page":"10","DOI":"10.14806\/ej.17.1.200","article-title":"Cutadapt removes adapter sequences from high-throughput sequencing reads","volume":"17","author":"Martin","year":"2011","journal-title":"EMBnet J"},{"key":"2023051706074287200_btaa782-B16","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1089\/cmb.2011.0151","article-title":"Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers","volume":"18","author":"Medvedev","year":"2011","journal-title":"J. Comput. Biol"},{"key":"2023051706074287200_btaa782-B17","doi-asserted-by":"crossref","first-page":"1088","DOI":"10.1093\/bioinformatics\/btv697","article-title":"MetaQUAST: evaluation of metagenome assemblies","volume":"32","author":"Mikheenko","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B18","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/nrg3367","article-title":"Sequence assembly demystified","volume":"14","author":"Nagarajan","year":"2013","journal-title":"Nat. Rev. 
Genet"},{"key":"2023051706074287200_btaa782-B19","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1101\/gr.213959.116","article-title":"metaSPAdes: a new versatile metagenomic assembler","volume":"27","author":"Nurk","year":"2017","journal-title":"Genome Res"},{"key":"2023051706074287200_btaa782-B20","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.virusres.2016.09.016","article-title":"Recent advances in inferring viral diversity from high-throughput sequencing data","volume":"239","author":"Posada-Cespedes","year":"2017","journal-title":"Virus Res"},{"key":"2023051706074287200_btaa782-B21","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1109\/TCBB.2013.145","article-title":"HIV haplotype inference using a propagating Dirichlet process mixture model","volume":"11","author":"Prabhakaran","year":"2014","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform"},{"key":"2023051706074287200_btaa782-B22","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1093\/bioinformatics\/btr627","article-title":"QuRe: software for viral quasispecies reconstruction from next-generation sequencing data","volume":"28","author":"Prosperi","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B23","doi-asserted-by":"crossref","first-page":"3506","DOI":"10.1093\/bioinformatics\/btu538","article-title":"LoRDEC: accurate and efficient long read error correction","volume":"30","author":"Salmela","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051706074287200_btaa782-B24","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1089\/cmb.2012.0232","article-title":"Probabilistic inference of viral quasispecies subject to recombination","volume":"20","author":"T\u00f6pfer","year":"2013","journal-title":"J. Comput. 
Biol"},{"key":"2023051706074287200_btaa782-B25","doi-asserted-by":"crossref","first-page":"e1003515","DOI":"10.1371\/journal.pcbi.1003515","article-title":"Viral quasispecies assembly via maximal clique enumeration","volume":"10","author":"T\u00f6pfer","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023051706074287200_btaa782-B26","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1186\/1471-2105-12-119","article-title":"ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing data","volume":"12","author":"Zagordi","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051706074287200_btaa782-B27","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1093\/bioinformatics\/btt593","article-title":"PEAR: a fast and accurate Illumina Paired-End reAd mergeR","volume":"30","author":"Zhang","year":"2014","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa782\/34894745\/btaa782.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/473\/50359835\/btaa782.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/473\/50359835\/btaa782.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T02:46:37Z","timestamp":1723517197000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/4\/473\/5905473"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,9,14]]},"refere
nces-count":27,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa782","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,2,15]]},"published":{"date-parts":[[2020,9,14]]}}}