{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T11:07:45Z","timestamp":1766056065363,"version":"3.48.0"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T00:00:00Z","timestamp":1765411200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,14]],"date-time":"2025-12-14T00:00:00Z","timestamp":1765670400000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100007537","name":"Freie Universit\u00e4t Berlin","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007537","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Algorithms Mol Biol"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Adding rank support to strings over a fixed-sized alphabet has numerous applications. Prominent among those is the (bidirectional) FM-Index which is commonly utilized to index and analyze genomic data. At its core lies the\n                    <jats:italic>rank<\/jats:italic>\n                    operation on the Burrows-Wheeler-Transform (BWT) which, given a position in the BWT and a character, answers how often the specified character appears from the start to that position. Implementing those rank queries is usually based on bit vectors with rank support. In this work, we discuss three implementation improvements. First, a novel approach named\n                    <jats:italic>paired-blocks<\/jats:italic>\n                    that reduces the space overhead of the support structure by half to a total of only\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$1.6\\%$$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mrow>\n                            <mml:mn>1.6<\/mml:mn>\n                            <mml:mo>%<\/mml:mo>\n                          <\/mml:mrow>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    . Second, a method for masking bits for the population count (also known as popcount) which greatly improves the runtime of 512-bit wide blocks in conjunction with AVX512 SIMD extensions. Third, a revised method for EPR-dictionaries (Pockrandt et al. in International conference on research in computational molecular biology. Springer, New York, 2017) called\n                    <jats:italic>flattened bit vectors<\/jats:italic>\n                    (fBV) with less space consumption and faster rank operations on strings, which is competitive in size and depending on the parameters between\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$2\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mrow>\n                            <mml:mn>2<\/mml:mn>\n                            <mml:mo>\u00d7<\/mml:mo>\n                          <\/mml:mrow>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    and\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$9\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mrow>\n                            <mml:mn>9<\/mml:mn>\n                            <mml:mo>\u00d7<\/mml:mo>\n                          <\/mml:mrow>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    faster than Wavelet Trees (Gog et al. in 13th International Symposium on Experimental Algorithms. Springer, New York, 2014).\n                  <\/jats:p>","DOI":"10.1186\/s13015-025-00291-9","type":"journal-article","created":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T08:07:29Z","timestamp":1765440449000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Engineering rank queries on bit vectors and strings"],"prefix":"10.1186","volume":"20","author":[{"given":"Simon Gene","family":"Gottlieb","sequence":"first","affiliation":[]},{"given":"Knut","family":"Reinert","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,12,11]]},"reference":[{"key":"291_CR1","first-page":"190","volume-title":"International conference on research in computational molecular biology","author":"C Pockrandt","year":"2017","unstructured":"Pockrandt C, Ehrhardt M, Reinert K. Epr-dictionaries: a practical and fast data structure for constant time searches in unidirectional and bidirectional fm indices. In: International conference on research in computational molecular biology. Springer; 2017. p. 190\u2013206."},{"key":"291_CR2","doi-asserted-by":"publisher","first-page":"390","DOI":"10.1109\/SFCS.2000.892127","volume-title":"Proceedings 41st annual symposium on foundations of computer science","author":"P Ferragina","year":"2000","unstructured":"Ferragina P, Manzini G. Opportunistic data structures with applications. In: Proceedings 41st annual symposium on foundations of computer science. IEEE; 2000. p. 390\u20138."},{"issue":"1\u20132","key":"291_CR3","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1016\/S0020-0255(01)00098-6","volume":"135","author":"P Ferragina","year":"2001","unstructured":"Ferragina P, Manzini G. An experimental study of a compressed index. Inf Sci. 2001;135(1\u20132):13\u201328.","journal-title":"Inf Sci"},{"key":"291_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/gb-2009-10-3-r25","volume":"10","author":"B Langmead","year":"2009","unstructured":"Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short dna sequences to the human genome. Genome Biol. 2009;10:1\u201310.","journal-title":"Genome Biol"},{"issue":"5","key":"291_CR5","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","volume":"26","author":"H Li","year":"2010","unstructured":"Li H, Durbin R. Fast and accurate long-read alignment with burrows-wheeler transform. Bioinformatics. 2010;26(5):589\u201395.","journal-title":"Bioinformatics"},{"issue":"4","key":"291_CR6","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/nmeth.1923","volume":"9","author":"B Langmead","year":"2012","unstructured":"Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012;9(4):357\u20139.","journal-title":"Nat Methods"},{"issue":"7","key":"291_CR7","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1093\/nar\/gkt005","volume":"41","author":"E Siragusa","year":"2013","unstructured":"Siragusa E, Weese D, Reinert K. Fast and accurate read mapping with approximate seeds and multiple backtracking. Nucleic Acids Res. 2013;41(7):78\u201378.","journal-title":"Nucleic Acids Res"},{"key":"291_CR8","unstructured":"Li H. Aligning sequence reads, clone sequences and assembly contigs with bwa-mem. 2013. arXiv:1303.3997."},{"key":"291_CR9","unstructured":"Siragusa E. Approximate string matching for high-throughput sequencing. PhD thesis, Freie Universit\u00e4t Berlin (July 2015). http:\/\/publications.imp.fu-berlin.de\/2507\/."},{"key":"291_CR10","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.jda.2016.03.002","volume":"37","author":"C Vroland","year":"2016","unstructured":"Vroland C, Salson M, Bini S, Touzet H. Approximate search of short patterns with high error rates using the $$01\\star 0$$ lossless seeds. J Discrete Algorith. 2016;37:3\u201316.","journal-title":"J Discrete Algorith"},{"issue":"12","key":"291_CR11","doi-asserted-by":"publisher","first-page":"1721","DOI":"10.1101\/gr.210641.116","volume":"26","author":"D Kim","year":"2016","unstructured":"Kim D, Song L, Breitwieser FP, Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 2016;26(12):1721\u20139.","journal-title":"Genome Res"},{"issue":"7","key":"291_CR12","doi-asserted-by":"publisher","first-page":"102687","DOI":"10.1016\/j.isci.2021.102687","volume":"24","author":"L Renders","year":"2021","unstructured":"Renders L, Marchal K, Fostier J. Dynamic partitioning of search patterns for approximate pattern matching using search schemes. iScience. 2021;24(7):102687. https:\/\/doi.org\/10.1016\/j.isci.2021.102687.","journal-title":"iScience"},{"issue":"1","key":"291_CR13","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1093\/bib\/bbab519","volume":"23","author":"H Liu","year":"2022","unstructured":"Liu H, Zou Q, Xu Y. A novel fast multiple nucleotide sequence alignment method based on fm-index. Brief Bioinform. 2022;23(1):519.","journal-title":"Brief Bioinform"},{"key":"291_CR14","first-page":"164","volume-title":"International conference on research in computational molecular biology","author":"L Renders","year":"2024","unstructured":"Renders L, Depuydt L, Rahmann S, Fostier J. Automated design of efficient search schemes for lossless approximate pattern matching. In: International conference on research in computational molecular biology. Springer; 2024. p. 164\u201384."},{"key":"291_CR15","doi-asserted-by":"crossref","unstructured":"Song L, Langmead B. Centrifuger:lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification. Genome Biol. 2024;25(1):106.","DOI":"10.1186\/s13059-024-03244-4"},{"issue":"1","key":"291_CR16","doi-asserted-by":"publisher","first-page":"014","DOI":"10.1093\/bioinformatics\/btae014","volume":"40","author":"P Zhang","year":"2024","unstructured":"Zhang P, Liu H, Wei Y, Zhai Y, Tian Q, Zou Q. Fmalign2: a novel fast multiple nucleotide sequence alignment method for ultralong datasets. Bioinformatics. 2024;40(1):014.","journal-title":"Bioinformatics"},{"issue":"3","key":"291_CR17","doi-asserted-by":"publisher","first-page":"097","DOI":"10.1093\/bioinformatics\/btae097","volume":"40","author":"H Hauswedell","year":"2024","unstructured":"Hauswedell H, Hetzel S, Gottlieb SG, Kretzmer H, Meissner A, Reinert K. Lambda3:homology search for protein, nucleotide, and bisulfite-converted sequences. Bioinformatics. 2024;40(3):097.","journal-title":"Bioinformatics"},{"key":"291_CR18","volume-title":"Succinct static data structures","author":"GJ Jacobson","year":"1988","unstructured":"Jacobson GJ. Succinct static data structures. USA: Carnegie Mellon University; 1988."},{"key":"291_CR19","doi-asserted-by":"crossref","unstructured":"Vigna S. Broadword implementation of rank\/select queries. In: International Workshop on Experimental and Efficient Algorithms. Springer; 2008. p. 154\u2013168.","DOI":"10.1007\/978-3-540-68552-4_12"},{"key":"291_CR20","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1007\/978-3-642-38527-8_15","volume-title":"Experimental algorithms: 12th international symposium, SEA 2013, Rome, Italy, June 5-7, 2013. proceedings 12","author":"D Zhou","year":"2013","unstructured":"Zhou D, Andersen DG, Kaminsky M. Space-efficient, high-performance rank and select structures on uncompressed bit sequences. In: Experimental algorithms: 12th international symposium, SEA 2013, Rome, Italy, June 5-7, 2013. proceedings 12. Springer; 2013. p. 151\u201363."},{"key":"291_CR21","doi-asserted-by":"crossref","unstructured":"Kurpicz F. Engineering compact data structures for rank and select queries on bit vectors. In: String processing and information retrieval: 29th international symposium, SPIRE 2022, Concepci\u00f3n, Chile, November 8\u201310, 2022, Proceedings. Springer; 2022. p. 257\u2013272.","DOI":"10.1007\/978-3-031-20643-6_19"},{"key":"291_CR22","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1137\/1.9781611972870.6","volume-title":"2007 Proceedings of the ninth workshop on algorithm engineering and experiments (ALENEX)","author":"D Okanohara","year":"2007","unstructured":"Okanohara D, Sadakane K. Practical entropy-compressed rank\/select dictionary. In: 2007 Proceedings of the ninth workshop on algorithm engineering and experiments (ALENEX). SIAM; 2007. p. 60\u201370."},{"key":"291_CR23","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1016\/j.is.2017.12.001","volume":"73","author":"S Grabowski","year":"2018","unstructured":"Grabowski S, Raniszewski M. Rank and select:another lesson learned. Inf Syst. 2018;73:25\u201334.","journal-title":"Inf Syst"},{"key":"291_CR24","doi-asserted-by":"publisher","first-page":"101756","DOI":"10.1016\/j.is.2021.101756","volume":"99","author":"GE Pibiri","year":"2021","unstructured":"Pibiri GE, Kanda S. Rank\/select queries over mutable bitmaps. Inf Syst. 2021;99:101756.","journal-title":"Inf Syst"},{"key":"291_CR25","first-page":"841","volume":"3","author":"R Grossi","year":"2003","unstructured":"Grossi R, Gupta A, Vitter JS, et al. High-order entropy-compressed text indexes. In SODA. 2003;3:841\u201350.","journal-title":"In SODA"},{"key":"291_CR26","unstructured":"Ferragina P, Manzini G, M\u00e4kinen V, Navarro G. Succinct representation of sequences. Technical Report. 2004."},{"issue":"1","key":"291_CR27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13015-021-00204-6","volume":"16","author":"T Anderson","year":"2021","unstructured":"Anderson T, Wheeler TJ. An optimized fm-index library for nucleotide and amino acid search. Algorith Mol Biol. 2021;16(1):1\u201313.","journal-title":"Algorith Mol Biol"},{"key":"291_CR28","doi-asserted-by":"crossref","unstructured":"Gog S, Beller T, Moffat A, Petri M. From theory to practice: Plug and play with succinct data structures. In: 13th International symposium on experimental algorithms, (SEA 2014). 2014. p. 326\u2013337.","DOI":"10.1007\/978-3-319-07959-2_28"},{"key":"291_CR29","unstructured":"Leitner-Ankerl M. ankerl::nanobench. GitHub. 2025."}],"container-title":["Algorithms for Molecular Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13015-025-00291-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13015-025-00291-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13015-025-00291-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T11:03:33Z","timestamp":1766055813000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s13015-025-00291-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,11]]},"references-count":29,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["291"],"URL":"https:\/\/doi.org\/10.1186\/s13015-025-00291-9","relation":{},"ISSN":["1748-7188"],"issn-type":[{"type":"electronic","value":"1748-7188"}],"subject":[],"published":{"date-parts":[[2025,12,11]]},"assertion":[{"value":"2 June 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 October 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 December 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Our library and benchmark is open-source and available at\n                      \n                      .","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Source Code"}},{"value":"The authors declare no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"21"}}