{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:10:56Z","timestamp":1772165456179,"version":"3.50.1"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"S12","license":[{"start":{"date-parts":[[2020,7,1]],"date-time":"2020-07-01T00:00:00Z","timestamp":1593561600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,24]],"date-time":"2020-07-24T00:00:00Z","timestamp":1595548800000},"content-version":"vor","delay-in-days":23,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Graph-based representation of genome assemblies has been recently used in different contexts \u2014 from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications heavily utilize the alignment of long nucleotide sequences to assembly graphs, first general-purpose software tools for finding such alignments have been released only recently and their deficiencies and limitations are yet to be discovered. 
Moreover, existing tools cannot align amino acid sequences, which could prove useful in various contexts, in particular the analysis of metagenomic sequencing data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this work we present SPAligner (Saint-Petersburg Aligner), a novel tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third-generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>Our work will help accelerate the development of graph-based approaches to the sequence-to-genome-assembly alignment problem. 
SPAligner is implemented as a part of SPAdes tools library and is available on Github.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-020-03590-7","type":"journal-article","created":{"date-parts":[[2020,7,23]],"date-time":"2020-07-23T19:04:18Z","timestamp":1595531058000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["SPAligner: alignment of long diverged molecular sequences to assembly graphs"],"prefix":"10.1186","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2253-1702","authenticated-orcid":false,"given":"Tatiana","family":"Dvorkina","sequence":"first","affiliation":[]},{"given":"Dmitry","family":"Antipov","sequence":"additional","affiliation":[]},{"given":"Anton","family":"Korobeynikov","sequence":"additional","affiliation":[]},{"given":"Sergey","family":"Nurk","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,7,24]]},"reference":[{"key":"3590_CR1","doi-asserted-by":"crossref","unstructured":"Nurk S, Bankevich A, Antipov D, Gurevich A, Korobeynikov A, Lapidus A, et al. Assembling Genomes and Mini-metagenomes from Highly Chimeric Reads In In: Deng M, Jiang R, Sun F, Zhang X, editors. Research in Computational Molecular Biology, vol. 7821. Berlin Heidelberg: Springer. p. 158\u2013170. Available from: http:\/\/link.springer.com\/10.1007\/978-3-642-37195-0_13.","DOI":"10.1007\/978-3-642-37195-0_13"},{"key":"3590_CR2","doi-asserted-by":"crossref","unstructured":"Chikhi R, Rizk G. Space-Efficient and Exact de Bruijn Graph Representation Based on a Bloom Filter. In: WABI. vol. 7534 of Lecture Notes in Computer Science. Springer. p. 236\u2013248.","DOI":"10.1007\/978-3-642-33122-0_19"},{"key":"3590_CR3","doi-asserted-by":"crossref","unstructured":"Li D, Liu CM, Luo R, Sadakane K, Lam TW. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. 
Bioinformatics; 31(10):1674\u20131676. Available from: http:\/\/dx.doi.org\/10.1093\/bioinformatics\/btv033.","DOI":"10.1093\/bioinformatics\/btv033"},{"key":"3590_CR4","doi-asserted-by":"crossref","unstructured":"Garrison E, Sir\u00e9n J, Novak AM, Hickey G, Eizenga JM, Dawson ET, et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol; 36(875). Available from: http:\/\/dx.doi.org\/10.1038\/nbt.4227.","DOI":"10.1038\/nbt.4227"},{"key":"3590_CR5","doi-asserted-by":"publisher","unstructured":"Heydari M, Miclotte G, Van de Peer Y, Fostier J. BrownieAligner: accurate alignment of Illumina sequencing data to de Bruijn graphs. BMC Bioinformatics; 19(1). https:\/\/doi.org\/10.1186\/s12859-018-2319-7.","DOI":"10.1186\/s12859-018-2319-7"},{"key":"3590_CR6","unstructured":"Jain C, Zhang H, Gao Y, Aluru S. On the Complexity of Sequence to Graph Alignment. Available from: http:\/\/biorxiv.org\/lookup\/doi\/10.1101\/522912."},{"key":"3590_CR7","doi-asserted-by":"publisher","unstructured":"Kavya VNS, Tayal K, Srinivasan R, Sivadasan N. Sequence Alignment on Directed Graphs. https:\/\/doi.org\/10.1089\/cmb.2017.0264.","DOI":"10.1089\/cmb.2017.0264"},{"key":"3590_CR8","doi-asserted-by":"crossref","unstructured":"Limasset A, Cazaux B, Rivals E, Peterlongo P. Read mapping on de Bruijn graphs. BMC Bioinformatics; 17(1). http:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-016-1103-9.","DOI":"10.1186\/s12859-016-1103-9"},{"issue":"7","key":"3590_CR9","doi-asserted-by":"publisher","first-page":"1009","DOI":"10.1093\/bioinformatics\/btv688","volume":"32","author":"D Antipov","year":"2016","unstructured":"Antipov D, Korobeynikov A, McLean JS, Pevzner PA. hybridSPAdes: an algorithm for hybrid assembly of short and long reads. Bioinformatics. 2016; 32(7):1009\u201315. 
doi:10.1093\/bioinformatics\/btv688.","journal-title":"Bioinformatics"},{"issue":"6","key":"3590_CR10","doi-asserted-by":"publisher","first-page":"e1005595","DOI":"10.1371\/journal.pcbi.1005595","volume":"13","author":"RR Wick","year":"2017","unstructured":"Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017; 13(6):e1005595. https:\/\/doi.org\/10.1371\/journal.pcbi.1005595.","journal-title":"PLoS Comput Biol"},{"issue":"24","key":"3590_CR11","doi-asserted-by":"publisher","first-page":"3506","DOI":"10.1093\/bioinformatics\/btu538","volume":"30","author":"L Salmela","year":"2014","unstructured":"Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics. 2014; 30(24):3506\u201314. doi:10.1093\/bioinformatics\/btu538.","journal-title":"Bioinformatics"},{"issue":"13","key":"3590_CR12","doi-asserted-by":"publisher","first-page":"i105","DOI":"10.1093\/bioinformatics\/bty279","volume":"34","author":"S Garg","year":"2018","unstructured":"Garg S, Rautiainen M, Novak AM, Garrison E, Durbin R, Marschall T. A graph-based approach to diploid genome assembly. Bioinformatics. 2018; 34(13):i105\u201314. doi:10.1093\/bioinformatics\/bty279.","journal-title":"Bioinformatics"},{"key":"3590_CR13","doi-asserted-by":"crossref","unstructured":"Rautiainen M, M\u00e4kinen V, Marschall T. Bit-parallel sequence-to-graph alignment. Bioinformatics. 2019. https:\/\/academic.oup.com\/bioinformatics\/advance-article\/doi\/10.1093\/bioinformatics\/btz162\/5372677.","DOI":"10.1101\/323063"},{"key":"3590_CR14","unstructured":"Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. 2013. https:\/\/arxiv.org\/abs\/1303.3997."},{"issue":"1","key":"3590_CR15","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1006\/jagm.1999.1063","volume":"35","author":"A Amir","year":"2000","unstructured":"Amir A, Lewenstein M, Lewenstein N. 
Pattern Matching in Hypertext. J Algorithms. 2000; 35(1):82\u201399. https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0196677499910635.","journal-title":"J Algorithms"},{"key":"3590_CR16","doi-asserted-by":"crossref","unstructured":"Myers EW. AnO(ND) difference algorithm and its variations. 1986; 1(1):251\u201366. http:\/\/link.springer.com\/10.1007\/BF01840446.","DOI":"10.1007\/BF01840446"},{"issue":"3","key":"3590_CR17","doi-asserted-by":"publisher","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","volume":"162","author":"O Gotoh","year":"1982","unstructured":"Gotoh O. An improved algorithm for matching biological sequences. J Mol Biol. 1982; 162(3):705\u20138. https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/0022283682903989.","journal-title":"J Mol Biol"},{"issue":"1","key":"3590_CR18","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1145\/375360.375365","volume":"33","author":"G Navarro","year":"2001","unstructured":"Navarro G. A guided tour to approximate string matching. ACM Comput Surv (CSUR). 2001; 33(1):31\u201388. http:\/\/portal.acm.org\/citation.cfm?doid=375360.375365.","journal-title":"ACM Comput Surv (CSUR)"},{"key":"3590_CR19","unstructured":"Rautiainen M, Marschall T. Aligning sequences to general graphs in (+) time. http:\/\/biorxiv.org\/lookup\/doi\/10.1101\/216127."},{"key":"3590_CR20","unstructured":"Pearson WR. Selecting the Right Similarity-Scoring Matrix: Selecting the Right Similarity-Scoring Matrix In In: Bateman A, Pearson WR, Stein LD, Stormo GD, Yates JR, editors. Current Protocols in Bioinformatics. Wiley. p. 3.5.1\u20139. http:\/\/doi.wiley.com\/10.1002\/0471250953.bi0305s43."},{"issue":"1","key":"3590_CR21","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1186\/s12859-016-0930-z","volume":"17","author":"J Daily","year":"2016","unstructured":"Daily J. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics. 2016; 17(1):81. 
https:\/\/doi.org\/10.1186\/s12859-016-0930-z.","journal-title":"BMC Bioinformatics"},{"key":"3590_CR22","unstructured":"Sir\u00e9n J. Indexing Variation Graphs:13\u201327. http:\/\/arxiv.org\/abs\/1604.06605."},{"key":"3590_CR23","unstructured":"Rautiainen M, Marschall T. GraphAligner: Rapid and Versatile Sequence-to-Graph Alignment. http:\/\/biorxiv.org\/lookup\/doi\/10.1101\/810812."},{"key":"3590_CR24","doi-asserted-by":"crossref","unstructured":"Mar\u00e7ais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: A fast and versatile genome alignment system. 2018; 14(1):e1005944. https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1005944.","DOI":"10.1371\/journal.pcbi.1005944"},{"issue":"3","key":"3590_CR25","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1038\/nrg3367","volume":"14","author":"N Nagarajan","year":"2013","unstructured":"Nagarajan N, Pop M. Sequence assembly demystified. Nat Rev Genet. 2013; 14(3):157\u201367. http:\/\/www.nature.com\/articles\/nrg3367.","journal-title":"Nat Rev Genet"},{"key":"3590_CR26","doi-asserted-by":"crossref","unstructured":"Barnum TP, Figueroa IA, Carlstr\u00f6m CI, Lucas LN, Engelbrektson AL, Coates JD. Genome-resolved metagenomics identifies genetic mobility, metabolic interactions, and unexpected diversity in perchlorate-reducing communities; 12(6):1568\u201381. http:\/\/www.nature.com\/articles\/s41396-018-0081-5.","DOI":"10.1038\/s41396-018-0081-5"},{"issue":"4","key":"3590_CR27","doi-asserted-by":"publisher","first-page":"534","DOI":"10.1101\/gr.183012.114","volume":"25","author":"I Sharon","year":"2015","unstructured":"Sharon I, Kertesz M, Hug LA, Pushkarev D, Blauwkamp TA, Castelle CJ, et al. Accurate, multi-kb reads resolve complex populations and detect rare microorganisms. Genome Res. 2015; 25(4):534\u201343. 
http:\/\/genome.cshlp.org\/lookup\/doi\/10.1101\/gr.183012.114.","journal-title":"Genome Res"},{"issue":"6","key":"3590_CR28","doi-asserted-by":"publisher","first-page":"1882","DOI":"10.1111\/1462-2920.12086","volume":"15","author":"M Shakya","year":"2013","unstructured":"Shakya M, Quince C, Campbell JH, Yang ZK, Schadt CW, Podar M. Comparative metagenomic and rRNA microbial diversity characterization using archaeal and bacterial synthetic communities: Metagenomic and rRNA diversity characterization. Environ Microbiol. 2013; 15(6):1882\u201399. http:\/\/doi.wiley.com\/10.1111\/1462-2920.12086.","journal-title":"Environ Microbiol"},{"issue":"5","key":"3590_CR29","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1101\/gr.213959.116","volume":"27","author":"S Nurk","year":"2017","unstructured":"Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017; 27(5):824\u201334. http:\/\/genome.cshlp.org\/lookup\/doi\/10.1101\/gr.213959.116.","journal-title":"Genome Res"},{"key":"3590_CR30","unstructured":"Awad S, Irber L, Brown CT. Evaluating Metagenome Assembly on a Simple Defined Community with Many Strain Variants. http:\/\/biorxiv.org\/lookup\/doi\/10.1101\/155358."},{"issue":"1","key":"3590_CR31","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1093\/nar\/28.1.45","volume":"28","author":"A Bairoch","year":"2000","unstructured":"Bairoch A. The SWISS-PROT protein sequence database and its supplement TrEMBL in. Nucleic Acids Res. 2000; 28(1):45\u201348. https:\/\/academic.oup.com\/nar\/article-lookup\/doi\/10.1093\/nar\/28.1.45.","journal-title":"Nucleic Acids Res"},{"key":"3590_CR32","doi-asserted-by":"crossref","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. 1990; 215(3):403\u2013410. 
https:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0022283605803602.","DOI":"10.1016\/S0022-2836(05)80360-2"},{"key":"3590_CR33","doi-asserted-by":"crossref","unstructured":"Ng C, Tay M, Tan B, Le TH, Haller L, Chen H, et al. Characterization of Metagenomes in Urban Aquatic Compartments Reveals High Prevalence of Clinically Relevant Antibiotic Resistance Genes in Wastewaters. Front Microbiol. 2017; 8. http:\/\/journal.frontiersin.org\/article\/10.3389\/fmicb.2017.02200\/full.","DOI":"10.3389\/fmicb.2017.02200"},{"key":"3590_CR34","unstructured":"Feldgarden M, Brover V, Haft DH, Prasad AB, Slotta DJ, Tolstoy I, et al. Using the NCBI AMRFinder Tool to Determine Antimicrobial Resistance Genotype-Phenotype Correlations Within a Collection of NARMS Isolates. http:\/\/biorxiv.org\/lookup\/doi\/10.1101\/550707."},{"issue":"20","key":"3590_CR35","doi-asserted-by":"publisher","first-page":"3350","DOI":"10.1093\/bioinformatics\/btv383","volume":"31","author":"RR Wick","year":"2015","unstructured":"Wick RR, Schultz MB, Zobel J, Holt KE. Bandage: interactive visualization of de novo genome assemblies: Fig. 1. Bioinformatics. 2015; 31(20):3350\u20133352. https:\/\/academic.oup.com\/bioinformatics\/article-lookup\/doi\/10.1093\/bioinformatics\/btv383.","journal-title":"Bioinformatics"},{"issue":"6","key":"3590_CR36","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1038\/s41592-018-0001-7","volume":"15","author":"FJ Sedlazeck","year":"2018","unstructured":"Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018; 15(6):461\u201368. 
https:\/\/doi.org\/10.1038\/s41592-018-0001-7.","journal-title":"Nat Methods"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03590-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03590-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03590-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,23]],"date-time":"2021-07-23T19:06:19Z","timestamp":1627067179000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03590-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7]]},"references-count":36,"journal-issue":{"issue":"S12","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["3590"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03590-7","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/744755","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7]]},"assertion":[{"value":"2 June 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 June 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"306"}}