{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:32:12Z","timestamp":1740133932641,"version":"3.37.3"},"reference-count":28,"publisher":"World Scientific Pub Co Pte Ltd","issue":"07","funder":[{"DOI":"10.13039\/501100001809","name":"Chinese National Natural Science Foundation","doi-asserted-by":"crossref","award":["61472082"],"award-info":[{"award-number":["61472082"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Natural Science Foundation of Fujian Province of China","award":["2014J01220"],"award-info":[{"award-number":["2014J01220"]}]},{"name":"US National Science Foundation","award":["IIS-1552860"],"award-info":[{"award-number":["IIS-1552860"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Found. Comput. Sci."],"published-print":{"date-parts":[[2017,11]]},"abstract":"<jats:p> Let [Formula: see text] be a string, with symbols from an alphabet. [Formula: see text] is said to be degenerate if for some positions, say [Formula: see text], [Formula: see text] can contain a subset of symbols from the symbol alphabet, rather than just one symbol. Given a text string [Formula: see text] and a pattern [Formula: see text], both with symbols from an alphabet [Formula: see text], the degenerate string matching problem, is to find positions in [Formula: see text] where [Formula: see text] occured, such that [Formula: see text], [Formula: see text], or both are allowed to be degenerate. Though some algorithms have been proposed, their huge computational cost pose a significant challenge to their practical utilization. In this work, we propose IDPM, an improved degenerate pattern matching algorithm based on an extension of the Boyer\u2013Moore algorithm. At the preprocessing phase, the algorithm defines an alphabet-independent compatibility rule, and computes the shift arrays using respective variants of the bad character and good suffix heuristics. At the search phase, IDPM improves the matching speed by using the compatibility rule. On average, the proposed IDPM algorithm has a linear time complexity with respect to the text size, and to the overall size of the pattern. IDPM demonstrates significance performance improvement over state-of-the-art approaches. It can be used in fast practical degenerate pattern matching with large data sizes, with important applications in flexible and scalable searching of huge biological sequences. <\/jats:p>","DOI":"10.1142\/s0129054117500307","type":"journal-article","created":{"date-parts":[[2018,2,23]],"date-time":"2018-02-23T03:50:52Z","timestamp":1519357852000},"page":"889-914","source":"Crossref","is-referenced-by-count":1,"title":["IDPM: An Improved Degenerate Pattern Matching Algorithm for Biological Sequences"],"prefix":"10.1142","volume":"28","author":[{"given":"Jie","family":"Lin","sequence":"first","affiliation":[{"name":"Faculty of Software, Fujian Normal University, Fuzhou 350108, P. R. China"}]},{"given":"Yue","family":"Jiang","sequence":"additional","affiliation":[{"name":"Faculty of Software, Fujian Normal University, Fuzhou 350108, P. R. China"}]},{"given":"E. James","family":"Harner","sequence":"additional","affiliation":[{"name":"Department of Statistics, West Virginia University, Morgantown, WV 26506, USA"}]},{"given":"Bing-Hua","family":"Jiang","sequence":"additional","affiliation":[{"name":"Pathology, Anatomy &amp; Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA"}]},{"given":"Don","family":"Adjeroh","sequence":"additional","affiliation":[{"name":"Computer Science &amp; Elect. Engineering, West Virginia University, Morgantown, WV 26506, USA"}]}],"member":"219","published-online":{"date-parts":[[2018,2,23]]},"reference":[{"key":"S0129054117500307BIB001","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1007\/11787006_26","volume":"4052","author":"Abdalla M.","year":"2006","journal-title":"Automata, Languages and Programming, 33rd International Colloquium \u2014 ICALP"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB002","DOI":"10.1137\/0216067"},{"volume-title":"The Burrows\u2013Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching","year":"2009","author":"Adjeroh D.","key":"S0129054117500307BIB003"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB004","DOI":"10.1145\/360825.360855"},{"key":"S0129054117500307BIB005","first-page":"613","volume":"32","author":"Alatabbi A.","year":"2014","journal-title":"Journal of Discrete Algorithms"},{"volume-title":"Algorithmic Combinatorics on Partial Words","year":"2008","author":"Blanchet-Sadri F.","key":"S0129054117500307BIB006"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB007","DOI":"10.1145\/359842.359859"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB008","DOI":"10.1137\/S0097539700382704"},{"key":"S0129054117500307BIB010","first-page":"227","volume":"11","author":"Crawford T.","year":"1998","journal-title":"Computing in Musicology"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB011","DOI":"10.1007\/978-3-319-13075-0_18"},{"issue":"7","key":"S0129054117500307BIB012","volume":"8","author":"Das M. K.","year":"2007","journal-title":"BMC Bioinformatics"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB013","DOI":"10.1186\/1471-2164-15-S6-S2"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB015","DOI":"10.1007\/BF02603120"},{"key":"S0129054117500307BIB016","first-page":"113","volume":"7","author":"Fischer M.","year":"1974","journal-title":"SIAM AMS Proceedings"},{"issue":"9","key":"S0129054117500307BIB017","first-page":"241","volume":"22","author":"Galil Z.","year":"1978","journal-title":"Communications of the ACM"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB018","DOI":"10.1017\/CBO9780511574931"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB021","DOI":"10.1016\/j.jda.2006.10.003"},{"issue":"1","key":"S0129054117500307BIB022","first-page":"40","volume":"10","author":"Iliopoulos C. S.","year":"2003","journal-title":"Nord. J. Comput."},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB023","DOI":"10.1007\/s11786-007-0029-z"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB025","DOI":"10.4137\/CIN.S1054"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB026","DOI":"10.1093\/nar\/gkg585"},{"issue":"5","key":"S0129054117500307BIB027","first-page":"26","volume":"2012","author":"Lin J.","year":"2012","journal-title":"Journal of Fujian Normal University (Natural Science)"},{"issue":"212","key":"S0129054117500307BIB028","first-page":"85","volume":"212","author":"Marth G. T.","year":"2003","journal-title":"Methods in Molecular Biology"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB031","DOI":"10.1142\/S0129054109007005"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB032","DOI":"10.1137\/0222018"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB033","DOI":"10.1093\/nar\/22.22.4673"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB034","DOI":"10.1038\/nbt1053"},{"doi-asserted-by":"publisher","key":"S0129054117500307BIB035","DOI":"10.1002\/rsa.20111"}],"container-title":["International Journal of Foundations of Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0129054117500307","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,6]],"date-time":"2019-08-06T17:40:03Z","timestamp":1565113203000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0129054117500307"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11]]},"references-count":28,"journal-issue":{"issue":"07","published-online":{"date-parts":[[2018,2,23]]},"published-print":{"date-parts":[[2017,11]]}},"alternative-id":["10.1142\/S0129054117500307"],"URL":"https:\/\/doi.org\/10.1142\/s0129054117500307","relation":{},"ISSN":["0129-0541","1793-6373"],"issn-type":[{"type":"print","value":"0129-0541"},{"type":"electronic","value":"1793-6373"}],"subject":[],"published":{"date-parts":[[2017,11]]}}}