{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T07:55:05Z","timestamp":1770537305539,"version":"3.49.0"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2019,6,5]],"date-time":"2019-06-05T00:00:00Z","timestamp":1559692800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2019,9,30]]},"abstract":"<jats:p>Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this article, we show that many subsequence constraints\u2014including and beyond those considered in the literature\u2014can be unified in a single framework. A unified treatment allows researchers to study jointly many types of subsequence constraints (instead of each one individually) and helps to improve usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive \u201cpattern expressions\u201d to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to succinct finite-state transducers, which we use as computational model, and simulate these transducers in a way suitable for frequent sequence mining. Our experimental study on real-world datasets indicates that our algorithms\u2014although more general\u2014are efficient and, when used for sequence mining with prior constraints studied in literature, competitive to (and in some cases superior to) state-of-the-art specialized methods.<\/jats:p>","DOI":"10.1145\/3321486","type":"journal-article","created":{"date-parts":[[2019,6,6]],"date-time":"2019-06-06T12:28:42Z","timestamp":1559824122000},"page":"1-42","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["A Unified Framework for Frequent Sequence Mining with Subsequence Constraints"],"prefix":"10.1145","volume":"44","author":[{"given":"Kaustubh","family":"Beedkar","sequence":"first","affiliation":[{"name":"Technische Universit\u00e4t Berlin, Berlin, Germany"}]},{"given":"Rainer","family":"Gemulla","sequence":"additional","affiliation":[{"name":"Universit\u00e4t Mannheim, Mannheim, Germany"}]},{"given":"Wim","family":"Martens","sequence":"additional","affiliation":[{"name":"Universit\u00e4t Bayreuth, Bayreuth, Germany"}]}],"member":"320","published-online":{"date-parts":[[2019,6,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/645480.655281"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972733.37"},{"key":"e_1_2_1_3_1","first-page":"11","article-title":"OpenFst: A general and efficient weighted finite-state transducer library. In Implementation and Application of Automata","volume":"4783","author":"Allauzen Cyril","year":"2007","unstructured":"Cyril Allauzen , Michael Riley , Johan Schalkwyk , Wojciech Skut , and Mehryar Mohri . 2007 . OpenFst: A general and efficient weighted finite-state transducer library. In Implementation and Application of Automata . Springer. Vol. 4783 , 11 -- 23 . Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. 2007. OpenFst: A general and efficient weighted finite-state transducer library. In Implementation and Application of Automata. Springer. Vol. 4783, 11--23.","journal-title":"Springer."},{"key":"e_1_2_1_4_1","unstructured":"Marco Almeida Nelma Moreira and Rog\u00e9rio Reis. 2007. On the performance of automata minimization algorithms. Technical report DCC-2007-03. Universidade do Porto.  Marco Almeida Nelma Moreira and Rog\u00e9rio Reis. 2007. On the performance of automata minimization algorithms. Technical report DCC-2007-03. Universidade do Porto."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46227-1_20"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10601-017-9272-3"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/1119089.1119095"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775109"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2757217"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2723724"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2016.0092"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452389"},{"key":"e_1_2_1_13_1","series-title":"Lecture Notes in Computer Science (LNCS\u201998) 1433","volume-title":"Pattern discovery in biosequences","author":"Brazma Alvis","year":"1998","unstructured":"Alvis Brazma , Inge Jonassen , Jaak Vilo , and Esko Ukkonen . 1998. Pattern discovery in biosequences . Lecture Notes in Computer Science (LNCS\u201998) 1433 ( 1998 ), 257--270. Alvis Brazma, Inge Jonassen, Jaak Vilo, and Esko Ukkonen. 1998. Pattern discovery in biosequences. Lecture Notes in Computer Science (LNCS\u201998) 1433 (1998), 257--270."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1006\/inco.1997.2695"},{"key":"e_1_2_1_15_1","first-page":"529","article-title":"Canonical regular expressions and minimal state graphs for definite events","volume":"12","author":"Brzozowski J.A.","year":"1962","unstructured":"J.A. Brzozowski . 1962 . Canonical regular expressions and minimal state graphs for definite events . Math. Theory Automata 12 (1962), 529 -- 561 . J.A. Brzozowski. 1962. Canonical regular expressions and minimal state graphs for definite events. Math. Theory Automata 12 (1962), 529--561.","journal-title":"Math. Theory Automata"},{"key":"e_1_2_1_16_1","volume-title":"A modular and flexible architecture for an integrated corpus query system. CoRR abs\/cmp-lg\/9408005","author":"Christ Oliver","year":"1994","unstructured":"Oliver Christ . 1994. A modular and flexible architecture for an integrated corpus query system. CoRR abs\/cmp-lg\/9408005 ( 1994 ). Oliver Christ. 1994. A modular and flexible architecture for an integrated corpus query system. CoRR abs\/cmp-lg\/9408005 (1994)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2017.03.011"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings Conference on Innovative Data Systems Research (CIDR\u201907)","author":"Demers Alan","unstructured":"Alan Demers , Johannes Gehrke , and P. Biswanath . 2007. Cayuga: A general purpose event monitoring system . In Proceedings Conference on Innovative Data Systems Research (CIDR\u201907) . 412--422. Alan Demers, Johannes Gehrke, and P. Biswanath. 2007. Cayuga: A general purpose event monitoring system. In Proceedings Conference on Innovative Data Systems Research (CIDR\u201907). 412--422."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559971"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3294052.3319684"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2145432.2145596"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699442"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB\u201999)","author":"Garofalakis Minos N.","year":"1999","unstructured":"Minos N. Garofalakis , Rajeev Rastogi , and Kyuseok Shim . 1999 . SPIRIT: Sequential pattern mining with regular expression constraints . In Proceedings of the International Conference on Very Large Data Bases (VLDB\u201999) . 223--234. Minos N. Garofalakis, Rajeev Rastogi, and Kyuseok Shim. 1999. SPIRIT: Sequential pattern mining with regular expression constraints. In Proceedings of the International Conference on Very Large Data Bases (VLDB\u201999). 223--234."},{"key":"e_1_2_1_24_1","volume-title":"Ullman","author":"Hopcroft John E.","year":"2007","unstructured":"John E. Hopcroft , Rajeev Motwani , and Jeffrey D . Ullman . 2007 . Automata Theory, Languages , and Computation (3rd ed.). Addison Wesley . John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. 2007. Automata Theory, Languages, and Computation (3rd ed.). Addison Wesley."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10601-016-9252-z"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-33954-2_15"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1519103.1519105"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the ACL System Demonstrations (ACL\u201912)","author":"Lin Yuri","year":"2012","unstructured":"Yuri Lin , Jean-Baptiste Michel , Erez Lieberman Aiden , Jon Orwant , Will Brockman , and Slav Petrov . 2012 . Syntactic annotations for the Google books ngram corpus . In Proceedings of the ACL System Demonstrations (ACL\u201912) . 169--174. Yuri Lin, Jean-Baptiste Michel, Erez Lieberman Aiden, Jon Orwant, Will Brockman, and Slav Petrov. 2012. Syntactic annotations for the Google books ngram corpus. In Proceedings of the ACL System Demonstrations (ACL\u201912). 169--174."},{"key":"e_1_2_1_29_1","volume-title":"Statistical machine translation. ACM Comput. Surv. 40, 3","author":"Lopez Adam","year":"2008","unstructured":"Adam Lopez . 2008. Statistical machine translation. ACM Comput. Surv. 40, 3 ( 2008 ), 8:1--8:49. Adam Lopez. 2008. Statistical machine translation. ACM Comput. Surv. 40, 3 (2008), 8:1--8:49."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2016.02.027"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1824795.1824798"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009748302351"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465285"},{"key":"e_1_2_1_34_1","first-page":"2","article-title":"Finite-state transducers in language and speech processing","volume":"23","author":"Mohri Mehryar","year":"1997","unstructured":"Mehryar Mohri . 1997 . Finite-state transducers in language and speech processing . Comput. Linguist. 23 , 2 (June 1997), 269--311. Mehryar Mohri. 1997. Finite-state transducers in language and speech processing. Comput. Linguist. 23, 2 (June 1997), 269--311.","journal-title":"Comput. Linguist."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(98)00115-7"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1006\/csla.2001.0184"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CONLL\u201912)","author":"Nakashole Ndapandula","year":"2012","unstructured":"Ndapandula Nakashole , Gerhard Weikum , and Fabian Suchanek . 2012 . PATTY: A taxonomy of relational patterns with semantic types . In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CONLL\u201912) . 1135--1145. Ndapandula Nakashole, Gerhard Weikum, and Fabian Suchanek. 2012. PATTY: A taxonomy of relational patterns with semantic types. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CONLL\u201912). 1135--1145."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18008-3_20"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201901)","author":"Pei Jian","year":"2001","unstructured":"Jian Pei , Jiawei Han , Behzad Mortazavi-Asl , Helen Pinto , Qiming Chen , Umeshwar Dayal , and Meichun Hsu . 2001 . PrefixSpan: Mining sequential patterns by prefix-projected growth . In Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201901) . 215--224. Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umeshwar Dayal, and Meichun Hsu. 2001. PrefixSpan: Mining sequential patterns by prefix-projected growth. In Proceedings of the IEEE International Conference on Data Engineering (ICDE\u201901). 215--224."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/584792.584799"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.32.0114"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00134"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/645337.650382"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/846183.846188"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/800125.804029"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/363347.363387"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.111"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2750548"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-013-0499-4"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/354756.354849"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007652502315"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3321486","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3321486","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:53:13Z","timestamp":1750204393000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3321486"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,5]]},"references-count":51,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2019,9,30]]}},"alternative-id":["10.1145\/3321486"],"URL":"https:\/\/doi.org\/10.1145\/3321486","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"value":"0362-5915","type":"print"},{"value":"1557-4644","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,6,5]]},"assertion":[{"value":"2018-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-06-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}