{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T04:20:11Z","timestamp":1759638011536,"version":"3.37.3"},"reference-count":63,"publisher":"Springer Science and Business Media LLC","issue":"S8","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,16]],"date-time":"2020-09-16T00:00:00Z","timestamp":1600214400000},"content-version":"vor","delay-in-days":15,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>In [Prezza et al., AMB 2019], a new reference-free and alignment-free framework for the detection of SNPs was suggested and tested. The framework, based on the Burrows-Wheeler Transform (BWT), significantly improves sensitivity and precision of previous de Bruijn graphs based tools by overcoming several of their limitations, namely: (i) the need to establish a fixed value, usually small, for the order <jats:italic>k<\/jats:italic>, (ii) the loss of important information such as <jats:italic>k<\/jats:italic>-mer coverage and adjacency of <jats:italic>k<\/jats:italic>-mers within the same read, and (iii) bad performance in repeated regions longer than <jats:italic>k<\/jats:italic> bases. The preliminary tool, however, was able to identify only SNPs and it was too slow and memory consuming due to the use of additional heavy data structures (namely, the Suffix and LCP arrays), besides the BWT.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Results<\/jats:title>\n<jats:p>In this paper, we introduce a new algorithm and the corresponding tool ebwt2InDel that (i) extend the framework of [Prezza et al., AMB 2019] to detect also INDELs, and (ii) implements recent algorithmic findings that allow to perform the whole analysis using just the BWT, thus reducing the working space by one order of magnitude and allowing the analysis of full genomes. Finally, we describe a simple strategy for effectively parallelizing our tool for SNP detection only. On a 24-cores machine, the parallel version of our tool is one order of magnitude faster than the sequential one. The tool ebwt2InDel is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/nicolaprezza\/ebwt2InDel\">github.com\/nicolaprezza\/ebwt2InDel<\/jats:ext-link>.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusions<\/jats:title>\n<jats:p>Results on a synthetic dataset covered at 30x (Human chromosome 1) show that our tool is indeed able to find up to 83% of the SNPs and 72% of the existing INDELs. These percentages considerably improve the 71% of SNPs and 51% of INDELs found by the state-of-the art tool based on de Bruijn graphs. We furthermore report results on larger (real) Human whole-genome sequencing experiments. Also in these cases, our tool exhibits a much higher sensitivity than the state-of-the art tool.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s12859-020-03586-3","type":"journal-article","created":{"date-parts":[[2020,9,16]],"date-time":"2020-09-16T04:07:47Z","timestamp":1600229267000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Variable-order reference-free variant discovery with the Burrows-Wheeler Transform"],"prefix":"10.1186","volume":"21","author":[{"given":"Nicola","family":"Prezza","sequence":"first","affiliation":[]},{"given":"Nadia","family":"Pisanti","sequence":"additional","affiliation":[]},{"given":"Marinella","family":"Sciortino","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5075-1214","authenticated-orcid":false,"given":"Giovanna","family":"Rosone","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,16]]},"reference":[{"key":"3586_CR1","doi-asserted-by":"publisher","unstructured":"Peterlongo P, Schnel N, Pisanti N, Sagot M, Lacroix V. Identifying SNPs without a Reference Genome by comparing raw reads. In: SPIRE, LNCS 6393: 2010. p. 147\u201358. https:\/\/doi.org\/10.1007\/978-3-642-16321-0_14.","DOI":"10.1007\/978-3-642-16321-0_14"},{"issue":"S-6","key":"3586_CR2","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1186\/1471-2105-13-S6-S5","volume":"13","author":"GAT Sacomoto","year":"2012","unstructured":"Sacomoto GAT, Kielbassa J, Chikhi R, Uricaru R, Antoniou P, Sagot M, Peterlongo P, Lacroix V. KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. BMC Bioinf. 2012; 13(S-6):5. https:\/\/doi.org\/10.1186\/1471-2105-13-S6-S5.","journal-title":"BMC Bioinf"},{"issue":"4","key":"3586_CR3","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/1471-2164-15-S4-S10","volume":"15","author":"RM Leggett","year":"2014","unstructured":"Leggett RM, MacLean D. Reference-free SNP detection: dealing with the data deluge. BMC Genomics. 2014; 15(4):10. https:\/\/doi.org\/10.1186\/1471-2164-15-S4-S10.","journal-title":"BMC Genomics"},{"issue":"2","key":"3586_CR4","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1038\/ng.1028","volume":"44","author":"Z Iqbal","year":"2012","unstructured":"Iqbal Z, Turner I, McVean G, Flicek P, Caccamo M. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet. 2012; 44(2):226\u201332. https:\/\/doi.org\/10.1038\/ng.1028.","journal-title":"Nat Genet"},{"issue":"2","key":"3586_CR5","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1093\/nar\/gku1187","volume":"43","author":"R Uricaru","year":"2015","unstructured":"Uricaru R, Rizk G, Lacroix V, Quillery E, Plantard O, Chikhi R, Lemaitre C, Peterlongo P. Reference-free detection of isolated SNPs. Nuc Acids Res. 2015; 43(2):11. https:\/\/doi.org\/10.1093\/nar\/gku1187.","journal-title":"Nuc Acids Res"},{"key":"3586_CR6","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.WABI.2018.3","volume-title":"18th Workshop on Algorithms in Bioinformatics (WABI 2018), LIPIcs, vol. 113","author":"N Prezza","year":"2018","unstructured":"Prezza N, Pisanti N, Sciortino M, Rosone G. Detecting Mutations by eBWT. In: 18th Workshop on Algorithms in Bioinformatics (WABI 2018), LIPIcs, vol. 113. Dagstuhl, Germany: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik: 2018. p. 3\u20131315. https:\/\/doi.org\/10.4230\/LIPIcs.WABI.2018.3."},{"issue":"1","key":"3586_CR7","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1186\/s13015-019-0137-8","volume":"14","author":"N Prezza","year":"2019","unstructured":"Prezza N, Pisanti N, Sciortino M, Rosone G. SNPs detection by eBWT positional clustering. Algoritm Mol Biol. 2019; 14(1):3. https:\/\/doi.org\/10.1186\/s13015-019-0137-8.","journal-title":"Algoritm Mol Biol"},{"key":"3586_CR8","doi-asserted-by":"publisher","unstructured":"Peterlongo P, Riou C, Drezen E, Lemaitre C. DiscoSnp++: de novo detection of small variants from raw unassembled read set(s). bioRxiv. 2017. https:\/\/doi.org\/10.1101\/209965.","DOI":"10.1101\/209965"},{"issue":"1","key":"3586_CR9","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1101\/gr.132480.111","volume":"23","author":"S Li","year":"2013","unstructured":"Li S, Li R, Li H, Lu J, Li Y, Bolund L, Schierup MH, Wang J. SOAPindel: efficient identification of indels from short paired reads. Gen Res. 2013; 23(1):195\u2013200. https:\/\/doi.org\/10.1101\/gr.132480.111.","journal-title":"Gen Res"},{"issue":"24","key":"3586_CR10","doi-asserted-by":"publisher","first-page":"3506","DOI":"10.1093\/bioinformatics\/btu538","volume":"30","author":"L Salmela","year":"2014","unstructured":"Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics. 2014; 30(24):3506\u201314. https:\/\/doi.org\/10.1093\/bioinformatics\/btu538.","journal-title":"Bioinformatics"},{"issue":"6","key":"3586_CR11","doi-asserted-by":"publisher","first-page":"799","DOI":"10.1093\/bioinformatics\/btw321","volume":"33","author":"L Salmela","year":"2017","unstructured":"Salmela L, Walve R, Rivals E, Ukkonen E. Accurate self-correction of errors in long reads using de Bruijn graphs. Bioinformatics. 2017; 33(6):799\u2013806. https:\/\/doi.org\/10.1093\/bioinformatics\/btw321.","journal-title":"Bioinformatics"},{"issue":"5","key":"3586_CR12","doi-asserted-by":"publisher","first-page":"1374","DOI":"10.1093\/bioinformatics\/btz102","volume":"36","author":"A Limasset","year":"2019","unstructured":"Limasset A, Flot J, Peterlongo P. Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs. Bioinformatics. 2019; 36(5):1374\u201381. https:\/\/doi.org\/10.1093\/bioinformatics\/btz102.","journal-title":"Bioinformatics"},{"key":"3586_CR13","doi-asserted-by":"publisher","unstructured":"Lemaitre C, Ciortuz L, Peterlongo P. Mapping-free and assembly-free discovery of inversion breakpoints from raw NGS reads. In: AlCoB: 2014. p. 119\u201330. https:\/\/doi.org\/10.1007\/978-3-319-07953-0_10.","DOI":"10.1007\/978-3-319-07953-0_10"},{"issue":"9","key":"3586_CR14","doi-asserted-by":"crossref","first-page":"718","DOI":"10.1089\/cmb.2015.0220","volume":"23","author":"P Bonizzoni","year":"2016","unstructured":"Bonizzoni P, Dondi R, Klau GW, Pirola Y, Pisanti N, Zaccaria S. On the minimum error correction problem for haplotype assembly in diploid and polyploid genomes. J Comput Biol. 2016; 23(9):718\u201336.","journal-title":"J Comput Biol"},{"issue":"11","key":"3586_CR15","doi-asserted-by":"crossref","first-page":"1610","DOI":"10.1093\/bioinformatics\/btv495","volume":"32","author":"Y Pirola","year":"2016","unstructured":"Pirola Y, Zaccaria S, Dondi R, Klau GW, Pisanti N, Bonizzoni P. Hapcol: accurate and memory-efficient haplotype assembly from long reads. Bioinform. 2016; 32(11):1610\u20137.","journal-title":"Bioinform"},{"issue":"6","key":"3586_CR16","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1089\/cmb.2014.0157","volume":"22","author":"M Patterson","year":"2015","unstructured":"Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Sch\u00f6nhuth A. Whatshap: Weighted haplotype assembly for future-generation sequencing reads. J Comput Biol. 2015; 22(6):498\u2013509.","journal-title":"J Comput Biol"},{"key":"3586_CR17","doi-asserted-by":"publisher","unstructured":"Birmel\u00e9 E, Crescenzi P, Ferreira RA, Grossi R, Lacroix V, Marino A, Pisanti N, Sacomoto GAT, Sagot M. Efficient Bubble Enumeration in Directed Graphs. In: SPIRE, LNCS 7608: 2012. p. 118\u201329. https:\/\/doi.org\/10.1007\/978-3-642-34109-0_13.","DOI":"10.1007\/978-3-642-34109-0_13"},{"issue":"3","key":"3586_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0060058","volume":"8","author":"RM Leggett","year":"2013","unstructured":"Leggett RM, Ramirez-Gonzalez RH, Verweij W, Kawashima CG, Iqbal Z, Jones JDG, Caccamo M, MacLean D. Identifying and Classifying Trait Linked Polymorphisms in Non-Reference Species by Walking Coloured de Bruijn Graphs. PLoS ONE. 2013; 8(3):1\u201311. https:\/\/doi.org\/10.1371\/journal.pone.0060058.","journal-title":"PLoS ONE"},{"issue":"suppl.18","key":"3586_CR19","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1186\/1471-2105-16-S18-S5","volume":"16","author":"K Kimura","year":"2015","unstructured":"Kimura K, Koike A. Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data. BMC Bioinf. 2015; 16(suppl.18):5. https:\/\/doi.org\/10.1186\/1471-2105-16-S18-S5.","journal-title":"BMC Bioinf"},{"issue":"10","key":"3586_CR20","doi-asserted-by":"publisher","first-page":"1577","DOI":"10.1093\/bioinformatics\/btv024","volume":"31","author":"K Kimura","year":"2015","unstructured":"Kimura K, Koike A. Ultrafast SNP analysis using the Burrows-Wheeler transform of short-read data. Bioinformatics. 2015; 31(10):1577\u201383. https:\/\/doi.org\/10.1093\/bioinformatics\/btv024.","journal-title":"Bioinformatics"},{"key":"3586_CR21","doi-asserted-by":"publisher","first-page":"242","DOI":"10.1186\/1471-2105-12-242","volume":"12","author":"N Philippe","year":"2011","unstructured":"Philippe N, Salson M, Lecroq T, L\u00e9onard M, Commes T, Rivals E. Querying large read collections in main memory: a versatile data structure. BMC Bioinf. 2011; 12:242. https:\/\/doi.org\/10.1186\/1471-2105-12-242.","journal-title":"BMC Bioinf"},{"key":"3586_CR22","doi-asserted-by":"publisher","unstructured":"V\u00e4lim\u00e4ki N, Rivals E. Scalable and Versatile k-mer Indexing for High-Throughput Sequencing Data. In: ISBRA, LNCS 7875: 2013. p. 237\u201348. https:\/\/doi.org\/10.1007\/978-3-642-38036-5_24.","DOI":"10.1007\/978-3-642-38036-5_24"},{"key":"3586_CR23","doi-asserted-by":"publisher","unstructured":"Kowalski TM, Grabowski S, Deorowicz S. Indexing arbitrary-length k-mers in sequencing reads. PLoS ONE. 2015; 10(7). https:\/\/doi.org\/10.1371\/journal.pone.0133198.","DOI":"10.1371\/journal.pone.0133198"},{"issue":"5","key":"3586_CR24","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1016\/S0020-0190(02)00512-4","volume":"86","author":"S Mantaci","year":"2003","unstructured":"Mantaci S, Restivo A, Sciortino M. Burrows-Wheeler transform and Sturmian words. Inf Process Lett. 2003; 86(5):241\u20136. https:\/\/doi.org\/10.1016\/S0020-0190(02)00512-4.","journal-title":"Inf Process Lett"},{"issue":"3","key":"3586_CR25","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.tcs.2007.07.019","volume":"387","author":"R Giancarlo","year":"2007","unstructured":"Giancarlo R, Restivo A, Sciortino M. From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization. Theoret Comput Sci. 2007; 387(3):236\u201348.","journal-title":"Theoret Comput Sci"},{"key":"3586_CR26","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1016\/j.tcs.2017.07.015","volume":"698","author":"S Mantaci","year":"2017","unstructured":"Mantaci S, Restivo A, Rosone G, Sciortino M, Versari L. Measuring the clustering effect of BWT via RLE. Theor Comput Sci. 2017; 698:79\u201387. https:\/\/doi.org\/10.1016\/j.tcs.2017.07.015.","journal-title":"Theor Comput Sci"},{"key":"3586_CR27","doi-asserted-by":"crossref","unstructured":"Kempa D, Kociumaka T. Resolution of the Burrows-Wheeler Transform Conjecture. CoRR. 2019; abs\/1910.10631.","DOI":"10.1109\/FOCS46700.2020.00097"},{"key":"3586_CR28","doi-asserted-by":"publisher","first-page":"230","DOI":"10.1016\/j.tcs.2019.11.002","volume":"812","author":"R Giancarlo","year":"2020","unstructured":"Giancarlo R, Manzini G, Restivo A, Rosone G, Sciortino M. The Alternating BWT: An algorithmic perspective. Theor Comput Sci. 2020; 812:230\u201343. https:\/\/doi.org\/10.1016\/j.tcs.2019.11.002.","journal-title":"Theor Comput Sci"},{"key":"3586_CR29","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.CPM.2019.12","volume-title":"Annual Symposium on Combinatorial Pattern Matching (CPM), LIPIcs, vol. 128","author":"R Giancarlo","year":"2019","unstructured":"Giancarlo R, Manzini G, Rosone G, Sciortino M. A new class of searchable and provably highly compressible string transformations. In: Annual Symposium on Combinatorial Pattern Matching (CPM), LIPIcs, vol. 128. Dagstuhl, Germany: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik: 2019. https:\/\/doi.org\/10.4230\/LIPIcs.CPM.2019.12."},{"key":"3586_CR30","unstructured":"Giuliani S, Lipt\u00e1k Z, Rizzi R. When a dollar makes a BWT. In: 20th Italian Conference on Theoretical Computer Science, (ICTCS 2019), CEUR Workshop Proceedings, vol. 2504. CEUR-WS.org: 2019. p. 20\u201333."},{"issue":"3","key":"3586_CR31","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1007\/s00224-007-9078-6","volume":"42","author":"S Mantaci","year":"2008","unstructured":"Mantaci S, Restivo A, Rosone G, Sciortino M. A new combinatorial approach to sequence comparison. Theory Comput Syst. 2008; 42(3):411\u201329. https:\/\/doi.org\/10.1007\/s00224-007-9078-6.","journal-title":"Theory Comput Syst"},{"issue":"1","key":"3586_CR32","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/j.ijar.2007.03.011","volume":"47","author":"S Mantaci","year":"2008","unstructured":"Mantaci S, Restivo A, Sciortino M. Distance measures for biological sequences: Some recent approaches. Int J Approx Reason. 2008; 47(1):109\u201324. https:\/\/doi.org\/10.1016\/j.ijar.2007.03.011.","journal-title":"Int J Approx Reason"},{"issue":"4","key":"3586_CR33","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1016\/j.jtbi.2009.10.033","volume":"262","author":"L Yang","year":"2010","unstructured":"Yang L, Zhang X, Wang T. The Burrows-Wheeler similarity distribution between biological sequences based on Burrows-Wheeler transform. J Theor Biol. 2010; 262(4):742\u20139. https:\/\/doi.org\/10.1016\/j.jtbi.2009.10.033.","journal-title":"J Theor Biol"},{"issue":"5","key":"3586_CR34","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","volume":"26","author":"H Li","year":"2010","unstructured":"Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010; 26(5):589\u201395. https:\/\/doi.org\/10.1093\/bioinformatics\/btp698.","journal-title":"Bioinformatics"},{"issue":"11","key":"3586_CR35","doi-asserted-by":"publisher","first-page":"1415","DOI":"10.1093\/bioinformatics\/bts173","volume":"28","author":"A Cox","year":"2012","unstructured":"Cox A, Bauer M, Jakobi T, Rosone G. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform. Bioinformatics. 2012; 28(11):1415\u20139. https:\/\/doi.org\/10.1093\/bioinformatics\/bts173.","journal-title":"Bioinformatics"},{"key":"3586_CR36","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-39053-1_42","volume-title":"The Nature of Computation. Logic, Algorithms, Applications - 9th Conference on Computability in Europe, CiE 2013. Proceedings, LNCS, vol. 7921","author":"G Rosone","year":"2013","unstructured":"Rosone G, Sciortino M. The Burrows-Wheeler Transform between Data Compression and Combinatorics on Words. In: The Nature of Computation. Logic, Algorithms, Applications - 9th Conference on Computability in Europe, CiE 2013. Proceedings, LNCS, vol. 7921. Berlin, Heidelberg: Springer: 2013. p. 353\u201364. https:\/\/doi.org\/10.1007\/978-3-642-39053-1_42."},{"key":"3586_CR37","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.jda.2016.03.003","volume":"37","author":"AJ Cox","year":"2016","unstructured":"Cox AJ, Garofalo F, Rosone G, Sciortino M. Lightweight LCP construction for very large collections of strings. J Discret Algoritm. 2016; 37:17\u201333. https:\/\/doi.org\/10.1016\/j.jda.2016.03.003.","journal-title":"J Discret Algoritm"},{"issue":"1","key":"3586_CR38","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/s13015-019-0140-0","volume":"14","author":"L Egidi","year":"2019","unstructured":"Egidi L, Louza FA, Manzini G, Telles GP. External memory BWT and LCP computation for sequence collections with applications. Algoritm Mol Biol. 2019; 14(1):6\u20131615. https:\/\/doi.org\/10.1186\/s13015-019-0140-0.","journal-title":"Algoritm Mol Biol"},{"key":"3586_CR39","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1016\/j.tcs.2017.06.016","volume":"698","author":"T Gagie","year":"2017","unstructured":"Gagie T, Manzini G, Sir\u00e9n J. Wheeler graphs: A framework for BWT-based data structures. Theor Comput Sci. 2017; 698:67\u201378. https:\/\/doi.org\/10.1016\/j.tcs.2017.06.016.","journal-title":"Theor Comput Sci"},{"issue":"1","key":"3586_CR40","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1145\/3375890","volume":"67","author":"T Gagie","year":"2020","unstructured":"Gagie T, Navarro G, Prezza N. Fully Functional Suffix Trees and Optimal Text Searching in BWT-Runs Bounded Space. J ACM. 2020; 67(1):2\u20131254.","journal-title":"J ACM"},{"issue":"3","key":"3586_CR41","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1016\/j.tcs.2007.07.014","volume":"387","author":"S Mantaci","year":"2007","unstructured":"Mantaci S, Restivo A, Rosone G, Sciortino M. An extension of the Burrows-Wheeler Transform. Theoret Comput Sci. 2007; 387(3):298\u2013312. https:\/\/doi.org\/10.1016\/j.tcs.2007.07.014.","journal-title":"Theoret Comput Sci"},{"issue":"0","key":"3586_CR42","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1016\/j.tcs.2012.02.002","volume":"483","author":"MJ Bauer","year":"2013","unstructured":"Bauer MJ, Cox AJ, Rosone G. Lightweight algorithms for constructing and inverting the BWT of string collections. Theoret Comput Sci. 2013; 483(0):134\u201348. https:\/\/doi.org\/10.1016\/j.tcs.2012.02.002.","journal-title":"Theoret Comput Sci"},{"key":"3586_CR43","unstructured":"BCR_LCP_GSA. GitHub repository. https:\/\/github.com\/giovannarosone\/BCR_LCP_GSA.git. Accessed 19 Feb 2020."},{"key":"3586_CR44","unstructured":"eGAP. GitHub repository. https:\/\/github.com\/felipelouza\/egap.git. Accessed 1 Nov 2019."},{"key":"3586_CR45","unstructured":"sacak-lcp. GitHub repository. https:\/\/github.com\/felipelouza\/sacak-lcp.git. Accessed 1 Nov 2019."},{"key":"3586_CR46","unstructured":"ropebwt, 2. GitHub repository. https:\/\/github.com\/lh3\/ropebwt2.git. Accessed 1 Nov 2019."},{"key":"3586_CR47","unstructured":"BEETL. GitHub repository. https:\/\/github.com\/BEETL\/BEETL.git. Accessed 1 Nov 2019."},{"issue":"2","key":"3586_CR48","doi-asserted-by":"publisher","first-page":"300","DOI":"10.1101\/gr.211748.116","volume":"27","author":"DD Dolle","year":"2017","unstructured":"Dolle DD, Liu Z, Cotten M, Simpson JT, Iqbal Z, Durbin R, McCarthy SA, Keane TM. Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes. Gen Res. 2017; 27(2):300\u20139. https:\/\/doi.org\/10.1101\/gr.211748.116.","journal-title":"Gen Res"},{"key":"3586_CR49","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1038\/nature15393","volume":"526","author":"The 1000 Genomes Project Consortium","year":"2015","unstructured":"The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015; 526:68\u201374. https:\/\/doi.org\/10.1038\/nature15393.","journal-title":"Nature"},{"key":"3586_CR50","doi-asserted-by":"publisher","unstructured":"Cox AJ, Jakobi T, Rosone G, Schulz-Trieglaff OB. Comparing DNA sequence collections by direct comparison of compressed text indexes. In: 12th Workshop on Algorithms in Bioinformatics (WABI 2012), LNBI 7534: 2012. p. 214\u201324. https:\/\/doi.org\/10.1007\/978-3-642-33122-0_17.","DOI":"10.1007\/978-3-642-33122-0_17"},{"issue":"5","key":"3586_CR51","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1186\/1471-2105-14-S5-S2","volume":"14","author":"C Ander","year":"2013","unstructured":"Ander C, Schulz-Trieglaff OB, Stoye J, Cox AJ. metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences. BMC Bioinf. 2013; 14(5):2. https:\/\/doi.org\/10.1186\/1471-2105-14-S5-S2.","journal-title":"BMC Bioinf"},{"key":"3586_CR52","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-18174-1_8","volume-title":"Algorithms for Computational Biology, LNCS, vol. 11488 LNBI","author":"V Guerrini","year":"2019","unstructured":"Guerrini V, Rosone G. Lightweight Metagenomic Classification via eBWT. In: Algorithms for Computational Biology, LNCS, vol. 11488 LNBI. Cham: Springer: 2019. p. 112\u201324. https:\/\/doi.org\/10.1007\/978-3-030-18174-1_8."},{"issue":"27","key":"3586_CR53","doi-asserted-by":"publisher","first-page":"3019","DOI":"10.1016\/j.tcs.2010.11.040","volume":"412","author":"A Restivo","year":"2011","unstructured":"Restivo A, Rosone G. Balancing and clustering of words in the Burrows-Wheeler transform. Theoret Comput Sci. 2011; 412(27):3019\u201332. https:\/\/doi.org\/10.1016\/j.tcs.2010.11.040.","journal-title":"Theoret Comput Sci"},{"key":"3586_CR54","doi-asserted-by":"publisher","unstructured":"Mantaci S, Restivo A, Rosone G, Sciortino M. Burrows-Wheeler Transform and Run-Length Enconding. In: Combinatorics on Words - 11th International Conference, WORDS 2017. Proceedings, LNCS, vol. 10432: 2017. p. 228\u201339. https:\/\/doi.org\/10.1007\/978-3-319-66396-8_21.","DOI":"10.1007\/978-3-319-66396-8_21"},{"key":"3586_CR55","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611975031.96","volume-title":"Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA \u201918","author":"T Gagie","year":"2018","unstructured":"Gagie T, Navarro G, Prezza N. Optimal-time Text Indexing in BWT-runs Bounded Space. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA \u201918. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics: 2018. p. 1459\u201377. https:\/\/doi.org\/10.1137\/1.9781611975031.96."},{"key":"3586_CR56","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.CPM.2019.7","volume-title":"30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019), LIPIcs, vol. 128","author":"N Prezza","year":"2019","unstructured":"Prezza N, Rosone G. Space-Efficient Computation of the LCP Array from the Burrows-Wheeler Transform In: Pisanti N, Pissis SP, editors. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019), LIPIcs, vol. 128. Dagstuhl, Germany: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik: 2019. p. 7\u20131718. https:\/\/doi.org\/10.4230\/LIPIcs.CPM.2019.7."},{"key":"3586_CR57","unstructured":"Burrows M, Wheeler DJ. A Block Sorting data Compression Algorithm. Technical report. Digit Syst Res Cent. 1994."},{"key":"3586_CR58","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-78909-5","volume-title":"The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching","author":"D Adjeroh","year":"2008","unstructured":"Adjeroh D, Bell T, Mukherjee A. The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Boston, MA: Springer; 2008. https:\/\/doi.org\/10.1007\/978-0-387-78909-5."},{"issue":"4","key":"3586_CR59","doi-asserted-by":"publisher","first-page":"688","DOI":"10.1145\/1082036.1082043","volume":"52","author":"P Ferragina","year":"2005","unstructured":"Ferragina P, Giancarlo R, Manzini G, Sciortino M. Boosting textual compression in optimal linear time. J ACM. 2005; 52(4):688\u2013713. https:\/\/doi.org\/10.1145\/1082036.1082043.","journal-title":"J ACM"},{"issue":"1","key":"3586_CR60","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1093\/bioinformatics\/btt257","volume":"30","author":"L Janin","year":"2014","unstructured":"Janin L, Rosone G, Cox AJ. Adaptive reference-free compression of sequence quality scores. Bioinformatics. 2014; 30(1):24\u201330. https:\/\/doi.org\/10.1093\/bioinformatics\/btt257.","journal-title":"Bioinformatics"},{"key":"3586_CR61","doi-asserted-by":"publisher","unstructured":"Krusche P, Trigg L, Boutros PC, Mason CE, Francisco M, Moore BL, Gonzalez-Porta M, Eberle MA, Tezak Z, Lababidi S, et al.Best practices for benchmarking germline small-variant calls in human genomes. Nat Biotechnol. 2019:1. https:\/\/doi.org\/10.1038\/s41587-019-0054-x.","DOI":"10.1038\/s41587-019-0054-x"},{"issue":"4","key":"3586_CR62","doi-asserted-by":"publisher","first-page":"558","DOI":"10.1093\/bioinformatics\/btx639","volume":"34","author":"S Chandak","year":"2017","unstructured":"Chandak S, Tatwawadi K, Weissman T. Compression of genomic sequencing reads via hash-based reordering: algorithm and analysis. Bioinformatics. 2017; 34(4):558\u201367. https:\/\/doi.org\/10.1093\/bioinformatics\/btx639.","journal-title":"Bioinformatics"},{"issue":"12","key":"3586_CR63","doi-asserted-by":"publisher","first-page":"2224","DOI":"10.1101\/gr.126599.111","volume":"21","author":"D Earl","year":"2011","unstructured":"Earl D, Bradnam K, St John J, Darling A, Lin D, Fass J, Yu H, Buffalo V, Zerbino D, Diekhans M, Nguyen N, Ariyaratne P, Sung W-K, Ning Z, Haimel M, Simpson J, Fonseca N, Birol I, Docking T, Paten B. Assemblathon 1: A competitive assessment of de novo short read assembly methods. Gen Res. 2011; 21(12):2224\u201341. https:\/\/doi.org\/10.1101\/gr.126599.111.","journal-title":"Gen Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03586-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03586-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03586-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,15]],"date-time":"2021-09-15T23:05:58Z","timestamp":1631747158000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03586-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":63,"journal-issue":{"issue":"S8","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["3586"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03586-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2020,9]]},"assertion":[{"value":"1 June 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 June 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"260"}}