{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T14:35:48Z","timestamp":1768142148714,"version":"3.49.0"},"publisher-location":"Cham","reference-count":37,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031291180","type":"print"},{"value":"9783031291197","type":"electronic"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,3]],"date-time":"2023-04-03T00:00:00Z","timestamp":1680480000000},"content-version":"vor","delay-in-days":92,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>With the high mutation rate in viruses, a mixture of closely related viral strains (called viral quasispecies) often co-infect an individual host. Reconstructing individual strains from viral quasispecies is a key step to characterizing the viral population, revealing strain-level genetic variability, and providing insights into biomedical and clinical studies. Reference-based approaches of reconstructing viral strains suffer from the lack of high-quality references due to high mutation rates and biased variant calling introduced by a selected reference. De novo methods require no references but face challenges due to errors in reads, the high similarity of quasispecies, and uneven abundance of strains.<\/jats:p><jats:p>In this paper, we propose VStrains, a de novo approach for reconstructing strains from viral quasispecies. VStrains incorporates contigs, paired-end reads, and coverage information to iteratively extract the strain-specific paths from assembly graphs. We benchmark VStrains against multiple state-of-the-art de novo and reference-based approaches on both simulated and real datasets. Experimental results demonstrate that VStrains achieves the best overall performance on both simulated and real datasets under a comprehensive set of metrics such as genome fraction, duplication ratio, NGA50, error rate, <jats:italic>etc<\/jats:italic>.<\/jats:p><jats:p><jats:bold>Availability:<\/jats:bold> VStrains is freely available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/metagentools\/VStrains\">https:\/\/github.com\/<\/jats:ext-link><jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/metagentools\/VStrains\">MetaGenTools\/VStrains<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/978-3-031-29119-7_1","type":"book-chapter","created":{"date-parts":[[2023,4,3]],"date-time":"2023-04-03T10:09:17Z","timestamp":1680516557000},"page":"3-20","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["VStrains: De Novo Reconstruction of\u00a0Viral Strains via\u00a0Iterative Path Extraction from\u00a0Assembly Graphs"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4413-9122","authenticated-orcid":false,"given":"Runpeng","family":"Luo","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6339-2644","authenticated-orcid":false,"given":"Yu","family":"Lin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,4,3]]},"reference":[{"issue":"14","key":"1_CR1","doi-asserted-by":"publisher","first-page":"4126","DOI":"10.1093\/bioinformatics\/btaa490","volume":"36","author":"D Antipov","year":"2020","unstructured":"Antipov, D., Raiko, M., Lapidus, A., Pevzner, P.A.: Metaviral spades: assembly of viruses from metagenomic data. Bioinformatics 36(14), 4126\u20134129 (2020)","journal-title":"Bioinformatics"},{"issue":"1","key":"1_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-021-02566-x","volume":"23","author":"D Antipov","year":"2022","unstructured":"Antipov, D., Rayko, M., Kolmogorov, M., Pevzner, P.A.: viralFlye: assembling viruses and identifying their hosts from long-read metagenomics data. Genome Biol. 23(1), 1\u201321 (2022)","journal-title":"Genome Biol."},{"issue":"5","key":"1_CR3","doi-asserted-by":"publisher","first-page":"835","DOI":"10.1101\/gr.215038.116","volume":"27","author":"JA Baaijens","year":"2017","unstructured":"Baaijens, J.A., El Aabidine, A.Z., Rivals, E., Sch\u00f6nhuth, A.: De novo assembly of viral quasispecies using overlap graphs. Genome Res. 27(5), 835\u2013848 (2017)","journal-title":"Genome Res."},{"key":"1_CR4","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1007\/978-3-030-45257-5_14","volume-title":"Research in Computational Molecular Biology","author":"JA Baaijens","year":"2020","unstructured":"Baaijens, J.A., Stougie, L., Sch\u00f6nhuth, A.: Strain-aware assembly of genomes from\u00a0mixed samples using flow variation\u00a0graphs. In: Schwartz, R. (ed.) RECOMB 2020. LNCS, vol. 12074, pp. 221\u2013222. Springer, Cham (2020). https:\/\/doi.org\/10.1007\/978-3-030-45257-5_14"},{"issue":"5","key":"1_CR5","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1089\/cmb.2012.0021","volume":"19","author":"A Bankevich","year":"2012","unstructured":"Bankevich, A., et al.: SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19(5), 455\u2013477 (2012)","journal-title":"J. Comput. Biol."},{"issue":"13","key":"1_CR6","doi-asserted-by":"publisher","first-page":"2131","DOI":"10.1093\/bioinformatics\/btv124","volume":"31","author":"S Benidt","year":"2015","unstructured":"Benidt, S., Nettleton, D.: SimSeq: a nonparametric approach to simulation of RNA-sequence datasets. Bioinformatics 31(13), 2131\u20132140 (2015)","journal-title":"Bioinformatics"},{"issue":"9","key":"1_CR7","doi-asserted-by":"publisher","first-page":"giz100","DOI":"10.1093\/gigascience\/giz100","volume":"8","author":"E Bushmanova","year":"2019","unstructured":"Bushmanova, E., Antipov, D., Lapidus, A., Prjibelski, A.D.: rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. GigaScience 8(9), giz100 (2019)","journal-title":"GigaScience"},{"issue":"17","key":"1_CR8","doi-asserted-by":"publisher","first-page":"2927","DOI":"10.1093\/bioinformatics\/bty202","volume":"34","author":"J Chen","year":"2018","unstructured":"Chen, J., Zhao, Y., Sun, Y.: De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding. Bioinformatics 34(17), 2927\u20132935 (2018)","journal-title":"Bioinformatics"},{"issue":"10","key":"1_CR9","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0257521","volume":"16","author":"C Delahaye","year":"2021","unstructured":"Delahaye, C., Nicolas, J.: Sequencing DNA with nanopores: troubles and biases. PLoS ONE 16(10), e0257521 (2021)","journal-title":"PLoS ONE"},{"issue":"2","key":"1_CR10","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1128\/MMBR.05023-11","volume":"76","author":"E Domingo","year":"2012","unstructured":"Domingo, E., Sheldon, J., Perales, C.: Viral quasispecies evolution. Microbiol. Mol. Biol. Rev. 76(2), 159\u2013216 (2012)","journal-title":"Microbiol. Mol. Biol. Rev."},{"issue":"8","key":"1_CR11","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.3000003","volume":"16","author":"S Duffy","year":"2018","unstructured":"Duffy, S.: Why are RNA virus mutation rates so damn high? PLoS Biol. 16(8), e3000003 (2018)","journal-title":"PLoS Biol."},{"issue":"4","key":"1_CR12","doi-asserted-by":"publisher","first-page":"473","DOI":"10.1093\/bioinformatics\/btaa782","volume":"37","author":"B Freire","year":"2021","unstructured":"Freire, B., Ladra, S., Param\u00e1, J.R., Salmela, L.: Inference of viral quasispecies with a paired de Bruijn graph. Bioinformatics 37(4), 473\u2013481 (2021)","journal-title":"Bioinformatics"},{"issue":"1","key":"1_CR13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-021-02426-8","volume":"22","author":"A Fritz","year":"2021","unstructured":"Fritz, A.: Haploflow: strain-resolved de novo assembly of viral genomes. Genome Biol. 22(1), 1\u201319 (2021). https:\/\/doi.org\/10.1186\/s13059-021-02426-8","journal-title":"Genome Biol."},{"issue":"14","key":"1_CR14","doi-asserted-by":"publisher","first-page":"e115","DOI":"10.1093\/nar\/gku537","volume":"42","author":"FD Giallonardo","year":"2014","unstructured":"Giallonardo, F.D., et al.: Full-length haplotype reconstruction to infer the structure of heterogeneous virus populations. Nucleic Acids Res. 42(14), e115 (2014)","journal-title":"Nucleic Acids Res."},{"key":"1_CR15","doi-asserted-by":"crossref","unstructured":"Jablonski, K.P., Beerenwinkel, N.: Computational methods for viral quasispecies assembly. In: Virus Bioinformatics, pp. 51\u201364. Chapman and Hall\/CRC (2021)","DOI":"10.1201\/9781003097679-4"},{"key":"1_CR16","doi-asserted-by":"crossref","unstructured":"Ke, Z., Vikalo, H.: A convolutional auto-encoder for haplotype assembly and viral quasispecies reconstruction. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 33, pp. 13493\u201313503 (2020)","DOI":"10.1101\/2020.09.29.318642"},{"key":"1_CR17","doi-asserted-by":"crossref","unstructured":"Ke, Z., Vikalo, H.: A graph auto-encoder for haplotype assembly and viral quasispecies reconstruction. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 719\u2013726 (2020)","DOI":"10.1609\/aaai.v34i01.5414"},{"issue":"5","key":"1_CR18","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1038\/s41587-019-0072-8","volume":"37","author":"M Kolmogorov","year":"2019","unstructured":"Kolmogorov, M., Yuan, J., Lin, Y., Pevzner, P.A.: Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37(5), 540\u2013546 (2019)","journal-title":"Nat. Biotechnol."},{"issue":"5","key":"1_CR19","doi-asserted-by":"publisher","first-page":"722","DOI":"10.1101\/gr.215087.116","volume":"27","author":"S Koren","year":"2017","unstructured":"Koren, S., Walenz, B.P., Berlin, K., Miller, J.R., Bergman, N.H., Phillippy, A.M.: Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27(5), 722\u2013736 (2017)","journal-title":"Genome Res."},{"issue":"18","key":"1_CR20","doi-asserted-by":"publisher","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","volume":"34","author":"H Li","year":"2018","unstructured":"Li, H.: Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34(18), 3094\u20133100 (2018)","journal-title":"Bioinformatics"},{"issue":"1","key":"1_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-022-02609-x","volume":"23","author":"H Liao","year":"2022","unstructured":"Liao, H., Cai, D., Sun, Y.: VirStrain: a strain identification tool for RNA viruses. Genome Biol. 23(1), 1\u201328 (2022)","journal-title":"Genome Biol."},{"issue":"1","key":"1_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-021-02587-6","volume":"23","author":"X Luo","year":"2022","unstructured":"Luo, X., Kang, X., Sch\u00f6nhuth, A.: Strainline: full-length de novo viral haplotype reconstruction from noisy long reads. Genome Biol. 23(1), 1\u201327 (2022)","journal-title":"Genome Biol."},{"issue":"11","key":"1_CR23","doi-asserted-by":"publisher","first-page":"1625","DOI":"10.1089\/cmb.2011.0151","volume":"18","author":"P Medvedev","year":"2011","unstructured":"Medvedev, P., Pham, S., Chaisson, M., Tesler, G., Pevzner, P.: Paired de bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers. J. Comput. Biol. 18(11), 1625\u20131634 (2011)","journal-title":"J. Comput. Biol."},{"issue":"1","key":"1_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/bioinformatics\/btab597","volume":"38","author":"D Meleshko","year":"2021","unstructured":"Meleshko, D., Hajirasouliha, I., Korobeynikov, A.: coronaSPAdes: from biosynthetic gene clusters to RNA viral assemblies. Bioinformatics 38(1), 1\u20138 (2021)","journal-title":"Bioinformatics"},{"issue":"7","key":"1_CR25","doi-asserted-by":"publisher","first-page":"1088","DOI":"10.1093\/bioinformatics\/btv697","volume":"32","author":"A Mikheenko","year":"2016","unstructured":"Mikheenko, A., Saveliev, V., Gurevich, A.: MetaQUAST: evaluation of metagenome assemblies. Bioinformatics 32(7), 1088\u20131090 (2016)","journal-title":"Bioinformatics"},{"key":"1_CR26","doi-asserted-by":"publisher","first-page":"523","DOI":"10.3389\/fmicb.2019.00523","volume":"10","author":"K Moelling","year":"2019","unstructured":"Moelling, K., Broecker, F.: Viruses and evolution-viruses first? A personal perspective. Front. Microbiol. 10, 523 (2019)","journal-title":"Front. Microbiol."},{"issue":"2","key":"1_CR27","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1089\/cmb.1995.2.275","volume":"2","author":"EW Myers","year":"1995","unstructured":"Myers, E.W.: Toward simplifying and accurately formulating fragment assembly. J. Comput. Biol. 2(2), 275\u2013290 (1995)","journal-title":"J. Comput. Biol."},{"issue":"5","key":"1_CR28","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1101\/gr.213959.116","volume":"27","author":"S Nurk","year":"2017","unstructured":"Nurk, S., Meleshko, D., Korobeynikov, A., Pevzner, P.A.: metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27(5), 824\u2013834 (2017)","journal-title":"Genome Res."},{"issue":"17","key":"1_CR29","doi-asserted-by":"publisher","first-page":"9748","DOI":"10.1073\/pnas.171285098","volume":"98","author":"PA Pevzner","year":"2001","unstructured":"Pevzner, P.A., Tang, H., Waterman, M.S.: An Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. 98(17), 9748\u20139753 (2001)","journal-title":"Proc. Natl. Acad. Sci."},{"issue":"1","key":"1_CR30","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1109\/TCBB.2013.145","volume":"11","author":"S Prabhakaran","year":"2013","unstructured":"Prabhakaran, S., Rey, M., Zagordi, O., Beerenwinkel, N., Roth, V.: HIV haplotype inference using a propagating dirichlet process mixture model. IEEE\/ACM Trans. Comput. Biol. Bioinf. 11(1), 182\u2013191 (2013)","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinf."},{"issue":"8","key":"1_CR31","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1038\/nrg2583","volume":"10","author":"OG Pybus","year":"2009","unstructured":"Pybus, O.G., Rambaut, A.: Evolutionary analysis of the dynamics of viral infectious disease. Nat. Rev. Genet. 10(8), 540\u2013550 (2009)","journal-title":"Nat. Rev. Genet."},{"issue":"2","key":"1_CR32","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1038\/s41592-019-0669-3","volume":"17","author":"J Ruan","year":"2020","unstructured":"Ruan, J., Li, H.: Fast and accurate long-read assembly with wtdbg2. Nat. Methods 17(2), 155\u2013158 (2020)","journal-title":"Nat. Methods"},{"key":"1_CR33","doi-asserted-by":"crossref","unstructured":"Stoler, N., Nekrutenko, A.: Sequencing error profiles of Illumina sequencing instruments. NAR Genomics Bioinform. 3(1), lqab019 (2021)","DOI":"10.1093\/nargab\/lqab019"},{"issue":"3","key":"1_CR34","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1003515","volume":"10","author":"A T\u00f6pfer","year":"2014","unstructured":"T\u00f6pfer, A., Marschall, T., Bull, R.A., Luciani, F., Sch\u00f6nhuth, A., Beerenwinkel, N.: Viral quasispecies assembly via maximal clique enumeration. PLoS Comput. Biol. 10(3), e1003515 (2014)","journal-title":"PLoS Comput. Biol."},{"issue":"1","key":"1_CR35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-021-24515-9","volume":"12","author":"R Vicedomini","year":"2021","unstructured":"Vicedomini, R., Quince, C., Darling, A.E., Chikhi, R.: Strainberry: automated strain separation in low-complexity metagenomes using long reads. Nat. Commun. 12(1), 1\u201314 (2021)","journal-title":"Nat. Commun."},{"key":"1_CR36","unstructured":"Xue, H., Rajan, V., Lin, Y.: Graph coloring via neural networks for haplotype assembly and viral quasispecies reconstruction. In: Advances in Neural Information Processing Systems (NeurIPS) (2022, to appear)"},{"issue":"12","key":"1_CR37","doi-asserted-by":"publisher","first-page":"2103","DOI":"10.1016\/j.cell.2022.04.035","volume":"185","author":"D Yamasoba","year":"2022","unstructured":"Yamasoba, D., et al.: Virological characteristics of the SARS-CoV-2 Omicron BA.2 spike. Cell 185(12), 2103\u20132115 (2022)","journal-title":"Cell"}],"container-title":["Lecture Notes in Computer Science","Research in Computational Molecular Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-29119-7_1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,3]],"date-time":"2023-04-03T10:23:29Z","timestamp":1680517409000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-29119-7_1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031291180","9783031291197"],"references-count":37,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-29119-7_1","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"3 April 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"RECOMB","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Research in Computational Molecular Biology","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Istanbul","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"T\u00fcrkiye","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2023","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"16 April 2023","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"19 April 2023","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"27","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"recomb2023","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/recomb2023.bilkent.edu.tr\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Easy Chair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"188","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"11","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"33","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"6% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"6","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}