{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T08:40:48Z","timestamp":1768984848155,"version":"3.49.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2023,6,12]],"date-time":"2023-06-12T00:00:00Z","timestamp":1686528000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,12]],"date-time":"2023-06-12T00:00:00Z","timestamp":1686528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001824","name":"Grantova Agentura Ceske Republiky","doi-asserted-by":"publisher","award":["19-20759\u00a0S"],"award-info":[{"award-number":["19-20759\u00a0S"]}],"id":[{"id":"10.13039\/501100001824","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007655","name":"Czech Technical University in Prague","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007655","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SN COMPUT. SCI."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The concept of elastic-degenerate strings (EDS) was introduced as a way of representing a sequenced population of the same species. Several online elastic-degenerate string matching (EDSM) algorithms were presented so far. Some of them provide a practical implementation. We propose a new on-line EDSM algorithm . Our algorithm combines two traditional algorithms  and  that were adapted to the specificities needed by Elastic-Degenerate Strings.  is running in <jats:inline-formula><jats:alternatives><jats:tex-math>$${\\mathcal {O}}(Nm\\lceil \\frac{m}{w}\\rceil )$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mi>N<\/mml:mi>\n                    <mml:mi>m<\/mml:mi>\n                    <mml:mrow>\n                      <mml:mo>\u2308<\/mml:mo>\n                      <mml:mfrac>\n                        <mml:mi>m<\/mml:mi>\n                        <mml:mi>w<\/mml:mi>\n                      <\/mml:mfrac>\n                      <mml:mo>\u2309<\/mml:mo>\n                    <\/mml:mrow>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> worst-case time. This implies <jats:inline-formula><jats:alternatives><jats:tex-math>$${\\mathcal {O}}(Nm)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mi>N<\/mml:mi>\n                    <mml:mi>m<\/mml:mi>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> time for small patterns, where <jats:italic>m<\/jats:italic> is the length of the searched pattern, <jats:italic>N<\/jats:italic> is the size of EDS, and <jats:italic>w<\/jats:italic> is the size of the computer word. The algorithm uses <jats:inline-formula><jats:alternatives><jats:tex-math>$${\\mathcal {O}}(N + n)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mi>N<\/mml:mi>\n                    <mml:mo>+<\/mml:mo>\n                    <mml:mi>n<\/mml:mi>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> space, where <jats:italic>n<\/jats:italic> is the length of EDS.  requires a simple preprocessing step with time and space <jats:inline-formula><jats:alternatives><jats:tex-math>$${\\mathcal {O}}(m)$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mi>m<\/mml:mi>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. We further present an extended version of the algorithm called , which allows searching for proteins (amino acid sequences) encoded in DNA EDS. Though  considers more complex patterns than , it keeps its time and space efficiency. Experimental results on real genomic data show the superiority of  over state-of-the-art algorithms. We also experimentally evaluate the efficiency of  for different pattern complexities, and evaluate the scaling of both  and  with varying attributes of EDS datasets.<\/jats:p>","DOI":"10.1007\/s42979-023-01760-x","type":"journal-article","created":{"date-parts":[[2023,6,12]],"date-time":"2023-06-12T16:01:55Z","timestamp":1686585715000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Backward Pattern Matching on Elastic-Degenerate Strings"],"prefix":"10.1007","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0272-5123","authenticated-orcid":false,"given":"Petr","family":"Proch\u00e1zka","sequence":"first","affiliation":[]},{"given":"Ond\u0159ej","family":"Cvacho","sequence":"additional","affiliation":[]},{"given":"Lubo\u0161","family":"Kr\u010d\u00e1l","sequence":"additional","affiliation":[]},{"given":"Jan","family":"Holub","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,12]]},"reference":[{"key":"1760_CR1","doi-asserted-by":"publisher","unstructured":"Proch\u00e1zka P, Cvacho O, Krc\u00e1l L, Holub J. Backward pattern matching on elastic degenerate strings. In: Proceedings of the 14th international joint conference on biomedical engineering systems and technologies, BIOSTEC 2021, volume 3: BIOINFORMATICS; 2021. p. 50\u20139 . https:\/\/doi.org\/10.5220\/0010243600500059.","DOI":"10.5220\/0010243600500059"},{"key":"1760_CR2","doi-asserted-by":"crossref","unstructured":"The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2011;473:544 (Corrigendum)","DOI":"10.1038\/nature09991"},{"key":"1760_CR3","doi-asserted-by":"crossref","unstructured":"The UK10K Consortium. The uk10k project identifies rare variants in health and disease. Nature. 2015;526:82","DOI":"10.1038\/nature14962"},{"key":"1760_CR4","doi-asserted-by":"crossref","unstructured":"The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68","DOI":"10.1038\/nature15393"},{"key":"1760_CR5","doi-asserted-by":"publisher","unstructured":"IUPAC-IUB Commission on Biochemical Nomenclature (CBN). Abbreviations and symbols for nucleic acids, polynucleotides and their constituents. J Mol Biol. 1971;55(3):299\u2013310. https:\/\/doi.org\/10.1016\/0022-2836(71)90319-6","DOI":"10.1016\/0022-2836(71)90319-6"},{"key":"1760_CR6","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/978-3-319-53733-7_9","volume-title":"Language and automata theory and applications","author":"CS Iliopoulos","year":"2017","unstructured":"Iliopoulos CS, Kundu R, Pissis SP. Efficient pattern matching in elastic-degenerate texts. In: Drewes F, Mart\u00edn-Vide C, Truthe B, editors. Language and automata theory and applications. Cham: Springer; 2017. p. 131\u201342."},{"issue":"1","key":"1760_CR7","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1093\/bib\/bbw089","volume":"19","author":"T Marschall","year":"2018","unstructured":"Marschall T. Computational pan-genomics: status, promises and challenges. Brief Bioinform. 2018;19(1):118\u201335. https:\/\/doi.org\/10.1093\/bib\/bbw089.","journal-title":"Brief Bioinform"},{"key":"1760_CR8","doi-asserted-by":"publisher","DOI":"10.1186\/s13059-015-0587-3","author":"DM Church","year":"2015","unstructured":"Church DM, et al. Extending reference assembly models. Genome Biol. 2015. https:\/\/doi.org\/10.1186\/s13059-015-0587-3.","journal-title":"Genome Biol"},{"key":"1760_CR9","doi-asserted-by":"publisher","first-page":"682","DOI":"10.1038\/ng.3257","volume":"47","author":"A Dilthey","year":"2015","unstructured":"Dilthey A, et al. Improved genome inference in the mhc using a population reference graph. Nat Genet. 2015;47:682. https:\/\/doi.org\/10.1038\/ng.3257.","journal-title":"Nat Genet"},{"issue":"15","key":"1760_CR10","doi-asserted-by":"publisher","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","volume":"27","author":"P Danecek","year":"2011","unstructured":"Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R. The variant call format and vcftools. Bioinformatics. 2011;27(15):2156\u20138. https:\/\/doi.org\/10.1093\/bioinformatics\/btr330.","journal-title":"Bioinformatics"},{"key":"1760_CR11","doi-asserted-by":"publisher","unstructured":"Consortium TCP-G. Computational pan-genomics: status, promises and challenges. Brief Bioinform. 2016;19(1):118\u201335. https:\/\/academic.oup.com\/bib\/article-pdf\/19\/1\/118\/25406834\/bbw089.pdf. https:\/\/doi.org\/10.1093\/bib\/bbw089","DOI":"10.1093\/bib\/bbw089"},{"key":"1760_CR12","doi-asserted-by":"publisher","unstructured":"Proch\u00e1zka P, Holub J. Compressing similar biological sequences using FM-index. In: Data compression conference, DCC 2014, Snowbird, UT, USA, 26\u201328 March, 2014;2014. p. 312\u201321. https:\/\/doi.org\/10.1109\/DCC.2014.47.","DOI":"10.1109\/DCC.2014.47"},{"key":"1760_CR13","doi-asserted-by":"crossref","unstructured":"Maciuca S, et al. A natural encoding of genetic variation in a burrows-wheeler transform to enable mapping and genome inference. In: Frith M, Pedersen CNS, editors. Algorithms in bioinformatics. Cham: Springer; 2016. p. 222\u201333","DOI":"10.1007\/978-3-319-43681-4_18"},{"key":"1760_CR14","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/j.tcs.2015.08.008","volume":"638","author":"JC Na","year":"2016","unstructured":"Na JC, Kim H, Park H, Lecroq T, L\u00e9onard M, Mouchard L, Park K. FM-index of alignment: a compressed index for similar strings. Theor Comput Sci. 2016;638:159\u201370. https:\/\/doi.org\/10.1016\/j.tcs.2015.08.008. (pattern matching, text data structures and compression).","journal-title":"Theor Comput Sci"},{"key":"1760_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2851495","volume":"21","author":"G Navarro","year":"2016","unstructured":"Navarro G, Pereira AO. Faster compressed suffix trees for repetitive collections. J Exp Algorithmics. 2016;21:1\u2013811838. https:\/\/doi.org\/10.1145\/2851495.","journal-title":"J. Exp. Algorithmics"},{"key":"1760_CR16","doi-asserted-by":"crossref","unstructured":"Sir\u00e9n J. Indexing variation graphs. In: 2017 Proceedings of the ninteenth workshop on algorithm engineering and experiments (ALENEX). SIAM; 2017. p. 13\u201327.","DOI":"10.1137\/1.9781611974768.2"},{"key":"1760_CR17","unstructured":"Grossi R, Iliopoulos CS, Liu C, Pisanti N, Pissis SP, Retha A, Rosone G, Vayani F, Versari L. On-line pattern matching on similar texts. In: 28th symposium on combinatorial pattern matching, CPM 2017, July 4\u20136, 2017, Warsaw, Poland; 2017. p. 9\u20131914"},{"issue":"2","key":"1760_CR18","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1137\/0206024","volume":"6","author":"DE Knuth","year":"1977","unstructured":"Knuth DE, Morris JH, Pratt VR. Fast pattern matching in strings. SIAM J Comput. 1977;6(2):323\u201350.","journal-title":"SIAM J Comput"},{"key":"1760_CR19","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511546853","volume-title":"Algorithms on strings","author":"M Crochemore","year":"2007","unstructured":"Crochemore M, Hancart C, Lecroq T. Algorithms on strings. New York: Cambridge University Press; 2007."},{"key":"1760_CR20","doi-asserted-by":"publisher","unstructured":"Pissis SP, Retha A. Dictionary matching in elastic-degenerate texts with applications in searching VCF files on-line. In: D\u2019Angelo G, editor. 17th international symposium on experimental algorithms (SEA 2018). Leibniz international Proceedings in informatics (LIPIcs), vol. 103. Dagstuhl, Germany: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik; 2018. p. 16\u201311614. https:\/\/doi.org\/10.4230\/LIPIcs.SEA.2018.16. http:\/\/drops.dagstuhl.de\/opus\/volltexte\/2018\/8951","DOI":"10.4230\/LIPIcs.SEA.2018.16"},{"key":"1760_CR21","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bty506","author":"A Cis\u0142ak","year":"2018","unstructured":"Cis\u0142ak A, Grabowski S, Holub J. SOPanG: online text searching over a pan-genome. Bioinformatics. 2018. https:\/\/doi.org\/10.1093\/bioinformatics\/bty506.","journal-title":"Bioinformatics"},{"issue":"10","key":"1760_CR22","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1145\/135239.135243","volume":"35","author":"R Baeza-Yates","year":"1992","unstructured":"Baeza-Yates R, Gonnet GH. A new approach to text searching. Commun ACM. 1992;35(10):74\u201382.","journal-title":"Commun ACM"},{"issue":"19","key":"1760_CR23","doi-asserted-by":"publisher","first-page":"3599","DOI":"10.1093\/bioinformatics\/btz162","volume":"35","author":"M Rautiainen","year":"2019","unstructured":"Rautiainen M, M\u00e4kinen V, Marschall T. Bit-parallel sequence-to-graph alignment. Bioinformatics. 2019;35(19):3599\u2013607.","journal-title":"Bioinformatics"},{"key":"1760_CR24","doi-asserted-by":"publisher","unstructured":"Aoyama K, Nakashima Y, I, T, Inenaga S, Bannai H, Takeda M. Faster Online Elastic Degenerate String Matching. In: Navarro, G., Sankoff, D., Zhu, B. (eds.) Annual symposium on combinatorial pattern matching (CPM 2018). Leibniz international Proceedings in Informatics (LIPIcs), vol. 105. Dagstuhl, Germany: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik; 2018. p. 9\u20131910. https:\/\/doi.org\/10.4230\/LIPIcs.CPM.2018.9. http:\/\/drops.dagstuhl.de\/opus\/volltexte\/2018\/8701","DOI":"10.4230\/LIPIcs.CPM.2018.9"},{"issue":"90","key":"1760_CR25","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","volume":"19","author":"JW Cooley","year":"1965","unstructured":"Cooley JW, Tukey JW. An algorithm for the machine calculation of complex Fourier series. Math Comput. 1965;19(90):297\u2013301.","journal-title":"Math Comput"},{"key":"1760_CR26","doi-asserted-by":"publisher","unstructured":"Bernardini G, Gawrychowski P, Pisanti N, Pissis SP, Rosone G. Even faster elastic-degenerate string matching via fast matrix multiplication. In: 46th international colloquium on automata, languages, and programming, ICALP 2019, July 9\u201312, 2019, Patras, Greece; 2019. p. 21\u201312115. https:\/\/doi.org\/10.4230\/LIPIcs.ICALP.2019.21.","DOI":"10.4230\/LIPIcs.ICALP.2019.21"},{"key":"1760_CR27","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1007\/978-3-319-67428-5_7","volume-title":"String processing and information retrieval","author":"G Bernardini","year":"2017","unstructured":"Bernardini G, Pisanti N, Pissis SP, Rosone G. Pattern matching on elastic-degenerate text with errors. In: Fici G, Sciortino M, Venturini R, editors. String processing and information retrieval. Cham: Springer; 2017. p. 74\u201390."},{"key":"1760_CR28","doi-asserted-by":"crossref","unstructured":"Navarro G, Raffinot M. A bit-parallel approach to suffix automata: fast extended string matching. In: Proceedings of the 9th annual symposium on combinatorial pattern matching. CPM \u201998. London, UK: Springer; 1998. p. 14\u201333.","DOI":"10.1007\/BFb0030778"},{"key":"1760_CR29","first-page":"29","volume":"3","author":"B D\u00f6m\u00f6lki","year":"1964","unstructured":"D\u00f6m\u00f6lki B. An algorithm for syntactical analysis. Comput Linguist. 1964;3:29\u201346 (Hungarian Academy of Science, Budapest).","journal-title":"Comput Linguist"},{"key":"1760_CR30","unstructured":"Melichar B, Holub J, Polcar J. Text searching algorithms. (2005). http:\/\/stringology.org\/athens\/. Accessed 1 June 2021."},{"key":"1760_CR31","volume-title":"Text algorithms","author":"M Crochemore","year":"1994","unstructured":"Crochemore M, Rytter W. Text algorithms. New York: Oxford University Press Inc; 1994."},{"key":"1760_CR32","doi-asserted-by":"crossref","unstructured":"Nomenclature Committee of the International Union of Biochemistry (NC-IUB). Nomenclature for incompletely specified bases in nucleic acid sequences. recommendations 1984. Eur. J. Biochem. 1985;150:1\u20135","DOI":"10.1111\/j.1432-1033.1985.tb08977.x"}],"container-title":["SN Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-01760-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s42979-023-01760-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-01760-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,12]],"date-time":"2023-06-12T16:29:33Z","timestamp":1686587373000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s42979-023-01760-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,12]]},"references-count":32,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["1760"],"URL":"https:\/\/doi.org\/10.1007\/s42979-023-01760-x","relation":{},"ISSN":["2661-8907"],"issn-type":[{"value":"2661-8907","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,12]]},"assertion":[{"value":"22 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 February 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}],"article-number":"442"}}