{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:27:47Z","timestamp":1777696067716,"version":"3.51.4"},"reference-count":17,"publisher":"SAGE Publications","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDA"],"published-print":{"date-parts":[[2021,9,15]]},"abstract":"<jats:p>Recently, the rapid growth of social networks and online news resources on the Internet has made text stream clustering an important application in multiple domains (e.g., text retrieval diversification, social event detection, and text summarization). Unlike traditional static text clustering, text stream clustering faces specific challenges related to the rapid change of topics\/clusters and the high velocity of incoming document batches. Recent well-known model-based text stream clustering models, such as DTM, DCT, and MStream, are word-independent evaluation approaches, meaning that they largely ignore the relations between words while sampling clusters\/topics. This reduces overall model accuracy, especially for short text documents such as comments and microblogs in social networks. To tackle these problems, in this paper we propose a novel graph-of-words (GOWs) based approach to text stream clustering, called GOW-Stream. Applying the common GOWs generated from each document batch while sampling clusters\/topics helps overcome the word-independent evaluation challenge. Our proposed GOW-Stream promises significantly better text stream clustering performance than recent state-of-the-art baselines. 
Extensive experiments on multiple benchmark real-world datasets demonstrate the effectiveness of our proposed model in both accuracy and running time.<\/jats:p>","DOI":"10.3233\/ida-205443","type":"journal-article","created":{"date-parts":[[2021,9,17]],"date-time":"2021-09-17T11:49:37Z","timestamp":1631879377000},"page":"1211-1231","source":"Crossref","is-referenced-by-count":0,"title":["GOW-Stream: A novel approach of graph-of-words based mixture model for semantic-enhanced text stream clustering"],"prefix":"10.1177","volume":"25","author":[{"given":"Tham","family":"Vo","sequence":"first","affiliation":[{"name":"Lac Hong University, Dong Nai, Vietnam"},{"name":"Thu Dau Mot University, Binh Duong, Vietnam"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Phuc","family":"Do","sequence":"additional","affiliation":[{"name":"University of Information Technology, VNU-HCM, Ho Chi Minh, Vietnam"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"issue":"4","key":"10.3233\/IDA-205443_ref1","doi-asserted-by":"crossref","first-page":"849","DOI":"10.3233\/IDA-160048","article-title":"Unsupervised event exploration from social text streams","volume":"21","author":"Zhou","year":"2017","journal-title":"Intelligent Data Analysis"},{"issue":"3","key":"10.3233\/IDA-205443_ref3","doi-asserted-by":"crossref","first-page":"681","DOI":"10.3233\/IDA-183836","article-title":"A joint model of extended LDA and IBTM over streaming Chinese short texts","volume":"23","author":"Zhu","year":"2019","journal-title":"Intelligent Data Analysis"},{"key":"10.3233\/IDA-205443_ref4","doi-asserted-by":"crossref","unstructured":"L. Shou, Z. Wang, K. Chen and G. 
Chen, Sumblr: continuous summarization of evolving tweet streams, in: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2013.","DOI":"10.1145\/2484028.2484045"},{"key":"10.3233\/IDA-205443_ref5","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"10.3233\/IDA-205443_ref6","doi-asserted-by":"crossref","unstructured":"D.M. Blei and J.D. Lafferty, Dynamic topic models, in: Proceedings of the 23rd International Conference on Machine Learning, 2006.","DOI":"10.1145\/1143844.1143859"},{"key":"10.3233\/IDA-205443_ref7","doi-asserted-by":"crossref","unstructured":"Y. Wang, E. Agichtein and M. Benzi, TM-LDA: efficient online modeling of latent topic transitions in social media, in: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012.","DOI":"10.1145\/2339530.2339552"},{"key":"10.3233\/IDA-205443_ref8","doi-asserted-by":"crossref","unstructured":"H. Amoualian, M. Clausel, E. Gaussier and M.R. Amini, Streaming-LDA: A copula-based approach to modeling topic dependencies in document streams, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.","DOI":"10.1145\/2939672.2939781"},{"key":"10.3233\/IDA-205443_ref9","doi-asserted-by":"crossref","unstructured":"S. Liang, E. Yilmaz and E. Kanoulas, Dynamic clustering of streaming short documents, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.","DOI":"10.1145\/2939672.2939748"},{"key":"10.3233\/IDA-205443_ref10","doi-asserted-by":"crossref","unstructured":"J. Yin, D. Chao, Z. Liu, W. Zhang, X. Yu and J. 
Wang, Model-based clustering of short text streams, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.","DOI":"10.1145\/3219819.3220094"},{"key":"10.3233\/IDA-205443_ref11","doi-asserted-by":"crossref","unstructured":"P. Pham, P. Do and C.D.C. Ta, GOW-LDA: Applying Term Co-occurrence Graph Representation in LDA Topic Models Improvement, in: International Conference on Computational Science and Technology, 2017.","DOI":"10.1007\/978-981-10-8276-4_40"},{"key":"10.3233\/IDA-205443_ref12","doi-asserted-by":"crossref","unstructured":"X. Wang and A. McCallum, Topics over time: a non-Markov continuous-time model of topical trends, in: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.","DOI":"10.1145\/1150402.1150450"},{"key":"10.3233\/IDA-205443_ref13","unstructured":"T. Iwata, S. Watanabe, T. Yamada and N. Ueda, Topic tracking model for analyzing consumer purchase behavior, in: Twenty-First International Joint Conference on Artificial Intelligence, 2009."},{"key":"10.3233\/IDA-205443_ref15","doi-asserted-by":"crossref","unstructured":"N. Du, M. Farajtabar, A. Ahmed, A.J. Smola and L. Song, Dirichlet-hawkes processes with applications to clustering continuous-time document streams, in: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015.","DOI":"10.1145\/2783258.2783411"},{"key":"10.3233\/IDA-205443_ref16","doi-asserted-by":"crossref","unstructured":"A. Ahmed and E. Xing, Dynamic non-parametric mixture models and the recurrent chinese restaurant process: with applications to evolutionary clustering, in: Proceedings of the 2008 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2008.","DOI":"10.1137\/1.9781611972788.20"},{"key":"10.3233\/IDA-205443_ref17","doi-asserted-by":"crossref","unstructured":"J. Yin and J. 
Wang, A model-based approach for text clustering with outlier detection, in: 2016 IEEE 32nd International Conference on Data Engineering (ICDE), 2016, pp. 625\u2013636.","DOI":"10.1109\/ICDE.2016.7498276"},{"key":"10.3233\/IDA-205443_ref18","doi-asserted-by":"crossref","unstructured":"C.C. Aggarwal, S.Y. Philip, J. Han and J. Wang, A framework for clustering evolving data streams, in: Proceedings 2003 VLDB Conference, Morgan Kaufmann, 2003.","DOI":"10.1016\/B978-012722442-8\/50016-1"},{"key":"10.3233\/IDA-205443_ref19","doi-asserted-by":"crossref","unstructured":"F. Cao, M. Ester, W. Qian and A. Zhou, Density-based clustering over an evolving data stream with noise, in: Proceedings of the 2006 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2006.","DOI":"10.1137\/1.9781611972764.29"}],"container-title":["Intelligent Data Analysis"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDA-205443","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:19:11Z","timestamp":1777454351000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDA-205443"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,15]]},"references-count":17,"journal-issue":{"issue":"5"},"URL":"https:\/\/doi.org\/10.3233\/ida-205443","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"value":"1088-467X","type":"print"},{"value":"1571-4128","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,15]]}}}