{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T15:20:48Z","timestamp":1776352848125,"version":"3.51.2"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2010,1,1]],"date-time":"2010-01-01T00:00:00Z","timestamp":1262304000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["EMT-0829835CNS-0103708"],"award-info":[{"award-number":["EMT-0829835CNS-0103708"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000144","name":"Division of Computer and Network Systems","doi-asserted-by":"publisher","award":["EMT-0829835CNS-0103708"],"award-info":[{"award-number":["EMT-0829835CNS-0103708"]}],"id":[{"id":"10.13039\/100000144","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["1R01EB0080161-01A1"],"award-info":[{"award-number":["1R01EB0080161-01A1"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2010,1]]},"abstract":"<jats:p>\n            We present VOGUE, a novel, variable order hidden Markov model with state durations, that combines two separate techniques for modeling complex patterns in sequential data: pattern mining and data modeling. VOGUE relies on a variable gap sequence mining method to extract frequent patterns with different lengths and gaps between elements. It then uses these mined sequences to build a variable order hidden Markov model (HMM), that explicitly models the gaps. The gaps implicitly model the order of the HMM, and they explicitly model the duration of each state. We apply VOGUE to a variety of real sequence data taken from domains such as protein sequence classification, Web usage logs, intrusion detection, and spelling correction. We show that VOGUE has superior classification accuracy compared to regular HMMs, higher-order HMMs, and even special purpose HMMs like HMMER, which is a state-of-the-art method for protein classification. The VOGUE implementation and the datasets used in this article are available as open-source.\n            <jats:sup>1<\/jats:sup>\n          <\/jats:p>","DOI":"10.1145\/1644873.1644878","type":"journal-article","created":{"date-parts":[[2010,1,12]],"date-time":"2010-01-12T20:23:07Z","timestamp":1263327787000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["VOGUE"],"prefix":"10.1145","volume":"4","author":[{"given":"Mohammed J.","family":"Zaki","sequence":"first","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY"}]},{"given":"Christopher D.","family":"Carothers","sequence":"additional","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY"}]},{"given":"Boleslaw K.","family":"Szymanski","sequence":"additional","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY"}]}],"member":"320","published-online":{"date-parts":[[2010,1,18]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the International Conference on Data Engineering.","author":"Agrawal R.","unstructured":"Agrawal , R. and Srikant , R . 1995. Mining sequential patterns . In Proceedings of the International Conference on Data Engineering. Agrawal, R. and Srikant, R. 1995. Mining sequential patterns. In Proceedings of the International Conference on Data Engineering."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the International Workshop on Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Computer Science","volume":"2734","author":"Antunes C.","unstructured":"Antunes , C. and Oliveira , A. L . 2003. Generalization of pattern-growth methods for sequential pattern mining with gap constraints . In Proceedings of the International Workshop on Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Computer Science , vol. 2734 . Springer, 239--251. Antunes, C. and Oliveira, A. L. 2003. Generalization of pattern-growth methods for sequential pattern mining with gap constraints. In Proceedings of the International Workshop on Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Computer Science, vol. 2734. Springer, 239--251."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/3120676.3120687"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1018031204"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1456223.1456270"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the SIAM International Conference on Data Mining.","author":"Deshpande M.","unstructured":"Deshpande , M. and Karypis , G . 2001. Selective Markov models for predicting Web-page accesses . In Proceedings of the SIAM International Conference on Data Mining. Deshpande, M. and Karypis, G. 2001. Selective Markov models for predicting Web-page accesses. In Proceedings of the SIAM International Conference on Data Mining."},{"key":"e_1_2_1_7_1","unstructured":"Dong G. and Pei J. 2007. Sequence Data Mining. Springer.   Dong G. and Pei J. 2007. Sequence Data Mining. Springer."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1006\/csla.1997.0037"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Durbin R. Eddy S. Krogh A. and Mitchison G. 1998. Biological Sequence Analysis. Cambridge University Press.  Durbin R. Eddy S. Krogh A. and Mitchison G. 1998. Biological Sequence Analysis. Cambridge University Press.","DOI":"10.1017\/CBO9780511790492"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/14.9.755"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/30.1.235"},{"key":"e_1_2_1_12_1","unstructured":"Felzenszwalb P. Huttenlocher D. and Kleinberg J. 2003. Fast algorithms for large-state-space HMMs with applications to Web usage analysis. In Advances in Neural Information Processing Systems. MIT Press.  Felzenszwalb P. Huttenlocher D. and Kleinberg J. 2003. Fast algorithms for large-state-space HMMs with applications to Web usage analysis. In Advances in Neural Information Processing Systems. MIT Press."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007469218079"},{"key":"e_1_2_1_14_1","first-page":"487","article-title":"Hierarchical hidden Markov models for user\/process profile learning","volume":"78","author":"Galassi U.","year":"2007","unstructured":"Galassi , U. , Botta , M. , and Giordana , A. 2007 . Hierarchical hidden Markov models for user\/process profile learning . Fundamenta Informaticae 78 , 4, 487 -- 505 . Galassi, U., Botta, M., and Giordana, A. 2007. Hierarchical hidden Markov models for user\/process profile learning. Fundamenta Informaticae 78, 4, 487--505.","journal-title":"Fundamenta Informaticae"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2002.1000341"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 13th International Conference on Machine Learning. 180--190","author":"Golding A.","unstructured":"Golding , A. and Roth , D . 1996. Applying winnow to context-sensitive spelling correction . In Proceedings of the 13th International Conference on Machine Learning. 180--190 . Golding, A. and Roth, D. 1996. Applying winnow to context-sensitive spelling correction. In Proceedings of the 13th International Conference on Machine Learning. 180--190."},{"key":"e_1_2_1_17_1","volume-title":"Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology","author":"Gusfield D.","unstructured":"Gusfield , D. 1997. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology . Cambridge University Press . Gusfield, D. 1997. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bti745"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the International Conference on Acoustics, Speech, and Signal Processing.","author":"Kriouile A.","unstructured":"Kriouile , A. , Mari , J.-F. , and Haon , J . -P. 1990. Some improvements in speech recognition algorithms based on HMM . In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Kriouile, A., Mari, J.-F., and Haon, J.-P. 1990. Some improvements in speech recognition algorithms based on HMM. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing."},{"key":"e_1_2_1_20_1","unstructured":"Kucera H. and Francis W. 1967. Computational Analysis of Present-Day American English. Brown University Press Providence RI.  Kucera H. and Francis W. 1967. Computational Analysis of Present-Day American English. Brown University Press Providence RI."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/322510.322526"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/992628.992666"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.181"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the SIAM International Conference on Data Mining.","author":"Li C.","unstructured":"Li , C. and Wang , J . 2008. Efficiently mining closed subsequences with gap constraints . In Proceedings of the SIAM International Conference on Data Mining. Li, C. and Wang, J. 2008. Efficiently mining closed subsequences with gap constraints. In Proceedings of the SIAM International Conference on Data Mining."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009748302351"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-2836(05)80134-2"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1232270"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the International Conference on Data Engineering.","author":"Pei J.","unstructured":"Pei , J. , Han , J. , Mortazavi-Asl , B. , Pinto , H. , Dayal , Q. C. U. , and Hsu , M . -C. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix projected pattern growth . In Proceedings of the International Conference on Data Engineering. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Dayal, Q. C. U., and Hsu, M.-C. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix projected pattern growth. In Proceedings of the International Conference on Data Engineering."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems.","author":"Pitkow J.","unstructured":"Pitkow , J. and Pirolli , P . 1999. Mining longest repeating subsequence to predict WWW surfing . In Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems. Pitkow, J. and Pirolli, P. 1999. Mining longest repeating subsequence to predict WWW surfing. In Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00114008"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007649326333"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the International Conference on Spoken Language Processing.","author":"Schwardt L. C.","unstructured":"Schwardt , L. C. and Du Preez, J. A. 2000. Efficient mixed-order hidden Markov model inference . In Proceedings of the International Conference on Spoken Language Processing. Schwardt, L. C. and Du Preez, J. A. 2000. Efficient mixed-order hidden Markov model inference. In Proceedings of the International Conference on Spoken Language Processing."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 5th International Conference on Extending Database Technology.","author":"Srikant R.","unstructured":"Srikant , R. and Agrawal , R . 1996. Mining sequential patterns: Generalizations and performance improvements . In Proceedings of the 5th International Conference on Extending Database Technology. Srikant, R. and Agrawal, R. 1996. Mining sequential patterns: Generalizations and performance improvements. In Proceedings of the 5th International Conference on Extending Database Technology."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 5th IEEE System, Man and Cybernetics Information Assurance Workshop. 424--431","author":"Szymanski B.","unstructured":"Szymanski , B. and Zhang , Y . 2004. Recursive data mining for masquerade detection and author identification . In Proceedings of the 5th IEEE System, Man and Cybernetics Information Assurance Workshop. 424--431 . Szymanski, B. and Zhang, Y. 2004. Recursive data mining for masquerade detection and author identification. In Proceedings of the 5th IEEE System, Man and Cybernetics Information Assurance Workshop. 424--431."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.105"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/354756.354849"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007652502315"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066228"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the International Joint Conference on AI.","author":"Zhu X.","unstructured":"Zhu , X. and Wu , X . 2007. Mining complex patterns across sequences with gap requirements . In Proceedings of the International Joint Conference on AI. Zhu, X. and Wu, X. 2007. Mining complex patterns across sequences with gap requirements. In Proceedings of the International Joint Conference on AI."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1644873.1644878","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1644873.1644878","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:41:18Z","timestamp":1750250478000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1644873.1644878"}},"subtitle":["A variable order hidden Markov model with duration based on frequent sequence mining"],"short-title":[],"issued":{"date-parts":[[2010,1]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,1]]}},"alternative-id":["10.1145\/1644873.1644878"],"URL":"https:\/\/doi.org\/10.1145\/1644873.1644878","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,1]]},"assertion":[{"value":"2008-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-01-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}