{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T17:01:10Z","timestamp":1769274070962,"version":"3.49.0"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T00:00:00Z","timestamp":1641513600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T00:00:00Z","timestamp":1641513600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2022,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Over the decade, a number of attempts have been made towards data stream clustering, but most of the works fall under clustering by example approach. There are a number of applications where clustering by variable approach is required which involves clustering of multiple data streams as opposed to clustering data examples in a data stream. Furthermore, a few works have been presented for clustering multiple data streams and these are applicable to numeric data streams only. Hence, this research gap has motivated current research work. In the present work, a hierarchical clustering technique has been proposed to cluster multiple data streams where data are nominal. To address the concept changes in the data streams splitting and merging of the clusters in the hierarchical structure are performed. The decision to split or merge is based on the entropy measure, representing the cluster\u2019s degree of disparity. The performance of the proposed technique has been analysed and compared to Agglomerative Nesting clustering technique on synthetic as well as a real-world dataset in terms of Dunn Index, Modified Hubert <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\varGamma $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u0393<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> statistic, Cophenetic Correlation Coefficient, and Purity. The proposed technique outperforms Agglomerative Nesting clustering technique for concept evolving data streams. Furthermore, the effect of concept evolution on clustering structure and average entropy has been visualised for detailed analysis and understanding.<\/jats:p>","DOI":"10.1007\/s40747-021-00634-0","type":"journal-article","created":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T08:03:04Z","timestamp":1641542584000},"page":"1737-1761","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Hierarchical clustering for multiple nominal data streams with evolving behaviour"],"prefix":"10.1007","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4739-6818","authenticated-orcid":false,"given":"Jerry W.","family":"Sangma","sequence":"first","affiliation":[]},{"given":"Mekhla","family":"Sarkar","sequence":"additional","affiliation":[]},{"given":"Vipin","family":"Pal","sequence":"additional","affiliation":[]},{"given":"Amit","family":"Agrawal","sequence":"additional","affiliation":[]},{"family":"Yogita","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,1,7]]},"reference":[{"key":"634_CR1","first-page":"1","volume":"17","author":"MR Ackermann","year":"2012","unstructured":"Ackermann MR, M\u00e4rtens M, Raupach C, Swierkot K, Lammersen C, Sohler C (2012) Streamkm++ a clustering algorithm for data streams. J Exp Algorithmics (JEA) 17:1\u20132","journal-title":"J Exp Algorithmics (JEA)"},{"key":"634_CR2","doi-asserted-by":"crossref","unstructured":"Aggarwal CC, Han J, Wang J, Yu PS (2004) A framework for projected clustering of high dimensional data streams. Proceedings of the Thirtieth international conference on Very large data bases-Volume 30:852\u2013863","DOI":"10.1016\/B978-012088469-8.50075-9"},{"key":"634_CR3","doi-asserted-by":"crossref","unstructured":"Aggarwal CC, Philip SY, Han J, Wang J (2003) A framework for clustering evolving data streams. In: Proceedings 2003 VLDB conference, Elsevier, pp 81\u201392","DOI":"10.1016\/B978-012722442-8\/50016-1"},{"issue":"2","key":"634_CR4","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1007\/s10115-009-0241-z","volume":"24","author":"CC Aggarwal","year":"2010","unstructured":"Aggarwal CC, Philip SY (2010) On clustering massive text and categorical data streams. Knowl Inf Syst 24(2):171\u2013196","journal-title":"Knowl Inf Syst"},{"key":"634_CR5","doi-asserted-by":"crossref","unstructured":"Balzanella A, Lechevallier Y, Verde R (2011) Clustering multiple data streams. In: New perspectives in statistical modeling and data analysis, Springer, pp 247\u2013254","DOI":"10.1007\/978-3-642-11363-5_28"},{"key":"634_CR6","doi-asserted-by":"crossref","unstructured":"Bones CC, Romani LA, de\u00a0Sousa EP (2016) Improving multivariate data streams clustering. In: Embrapa Inform\u00e1tica Agropecu\u00e1ria-Artigo em anais de congresso (ALICE), Procedia Computer Science, pp 461\u2013471","DOI":"10.1016\/j.procs.2016.05.325"},{"key":"634_CR7","doi-asserted-by":"crossref","unstructured":"Cao F, Estert M, Qian W, Zhou A (2006) Density-based clustering over an evolving data stream with noise. In: Proceedings of the 2006 SIAM international conference on data mining, SIAM, pp 328\u2013339","DOI":"10.1137\/1.9781611972764.29"},{"issue":"5","key":"634_CR8","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1109\/TKDE.2008.192","volume":"21","author":"HL Chen","year":"2008","unstructured":"Chen HL, Chen MS, Lin SC (2008) Catching the trend: a framework for clustering concept-drifting categorical data. IEEE Trans Knowl Data Eng 21(5):652\u2013665","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"1","key":"634_CR9","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1016\/j.ins.2011.09.004","volume":"183","author":"L Chen","year":"2012","unstructured":"Chen L, Zou LJ, Tu L (2012) A clustering algorithm for multiple data streams based on spectral component similarity. Inf Sci 183(1):35\u201347","journal-title":"Inf Sci"},{"key":"634_CR10","doi-asserted-by":"crossref","unstructured":"Chen Y, Tu L (2007) Density-based clustering for real-time stream data. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 133\u2013142","DOI":"10.1145\/1281192.1281210"},{"key":"634_CR11","unstructured":"Dai BR, Huang JW, Yeh MY, Chen MS (2004) Clustering on demand for multiple data streams. In: Fourth IEEE International Conference on Data Mining (ICDM\u201904), IEEE, pp 367\u2013370"},{"issue":"2","key":"634_CR12","first-page":"19","volume":"19","author":"E Diday","year":"1971","unstructured":"Diday E (1971) Une nouvelle m\u00e9thode en classification automatique et reconnaissance des formes la m\u00e9thode des nu\u00e9es dynamiques. Revue de statistique appliqu\u00e9e 19(2):19\u201333","journal-title":"Revue de statistique appliqu\u00e9e"},{"key":"634_CR13","doi-asserted-by":"crossref","unstructured":"Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 71\u201380","DOI":"10.1145\/347090.347107"},{"issue":"3","key":"634_CR14","doi-asserted-by":"publisher","first-page":"279","DOI":"10.2307\/2412324","volume":"18","author":"JS Farris","year":"1969","unstructured":"Farris JS (1969) On the cophenetic correlation coefficient. Syst Zool 18(3):279\u2013285","journal-title":"Syst Zool"},{"key":"634_CR15","doi-asserted-by":"crossref","unstructured":"Gama J, Medas P, Rocha R (2004) Forest trees for on-line data. In: Proceedings of the 2004 ACM symposium on Applied computing, pp 632\u2013636","DOI":"10.1145\/967900.968033"},{"issue":"3","key":"634_CR16","doi-asserted-by":"publisher","first-page":"515","DOI":"10.1109\/TKDE.2003.1198387","volume":"15","author":"S Guha","year":"2003","unstructured":"Guha S, Meyerson A, Mishra N, Motwani R, O\u2019Callaghan L (2003) Clustering data streams: theory and practice. IEEE Trans Knowl Data Eng 15(3):515\u2013528","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"634_CR17","doi-asserted-by":"crossref","unstructured":"Guha S, Mishra N, Motwani R, et\u00a0al. (2000) Clustering data streams. In: focs, IEEE, p 359","DOI":"10.1109\/SFCS.2000.892124"},{"key":"634_CR18","volume-title":"Data mining: concepts and techniques","author":"J Han","year":"2011","unstructured":"Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Elsevier, Amsterdam"},{"key":"634_CR19","doi-asserted-by":"crossref","unstructured":"Hassani M, Spaus P, Seidl T (2014) Adaptive multiple-resolution stream clustering. In: International Workshop on Machine Learning and Data Mining in Pattern Recognition, Springer, pp 134\u2013148","DOI":"10.1007\/978-3-319-08979-9_11"},{"key":"634_CR20","doi-asserted-by":"crossref","unstructured":"Hoeffding W (1994) Probability inequalities for sums of bounded random variables. In: The Collected Works of Wassily Hoeffding, Springer, pp 409\u2013426","DOI":"10.1007\/978-1-4612-0865-5_26"},{"issue":"1","key":"634_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-015-0036-x","volume":"3","author":"M Khalilian","year":"2016","unstructured":"Khalilian M, Mustapha N, Sulaiman N (2016) Data stream clustering by divide and conquer approach based on vector model. J Big Data 3(1):1","journal-title":"J Big Data"},{"issue":"2","key":"634_CR22","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1007\/s10618-018-0598-2","volume":"33","author":"P Laurinec","year":"2019","unstructured":"Laurinec P, Luck\u00e1 M (2019) Interpretable multiple data streams clustering with clipped streams representation for the improvement of electricity consumption forecasting. Data Min Knowl Disc 33(2):413\u2013445","journal-title":"Data Min Knowl Disc"},{"key":"634_CR23","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.knosys.2014.02.004","volume":"59","author":"Y Li","year":"2014","unstructured":"Li Y, Li D, Wang S, Zhai Y (2014) Incremental entropy-based clustering on categorical data streams with concept drift. Knowl-Based Syst 59:33\u201347","journal-title":"Knowl-Based Syst"},{"key":"634_CR24","doi-asserted-by":"crossref","unstructured":"Liu Y, Li Z, Xiong H, Gao X, Wu J (2010) Understanding of internal clustering validation measures. In: 2010 IEEE international conference on data mining, IEEE, pp 911\u2013916","DOI":"10.1109\/ICDM.2010.35"},{"key":"634_CR25","doi-asserted-by":"crossref","unstructured":"Meesuksabai W, Kangkachit T, Waiyamai K (2011) Hue-stream: evolution-based clustering technique for heterogeneous data streams with uncertainty. In: International Conference on Advanced Data Mining and Applications, Springer, pp 27\u201340","DOI":"10.1007\/978-3-642-25856-5_3"},{"key":"634_CR26","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1098\/rsta.1896.0007","volume":"187","author":"K Pearson","year":"1896","unstructured":"Pearson K (1896) Vii. mathematical contributions to the theory of evolution. \u2013iii. regression, heredity, and panmixia. Philos Trans Roy Soc Lond Ser A 187:253\u2013318","journal-title":"Philos Trans Roy Soc Lond Ser A"},{"issue":"1","key":"634_CR27","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1109\/JIOT.2016.2618909","volume":"4","author":"D Puschmann","year":"2016","unstructured":"Puschmann D, Barnaghi P, Tafazolli R (2016) Adaptive clustering for dynamic IoT data streams. IEEE Internet Things J 4(1):64\u201374","journal-title":"IEEE Internet Things J"},{"issue":"5","key":"634_CR28","doi-asserted-by":"publisher","first-page":"615","DOI":"10.1109\/TKDE.2007.190727","volume":"20","author":"PP Rodrigues","year":"2008","unstructured":"Rodrigues PP, Gama J, Pedroso J (2008) Hierarchical clustering of time-series data streams. IEEE Trans Knowl Data Eng 20(5):615\u2013627","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"634_CR29","doi-asserted-by":"crossref","unstructured":"Tu L (2012) Clustering on multiple data streams. In: Advances in Computer Science and Information Engineering, Springer, pp 73\u201378","DOI":"10.1007\/978-3-642-30223-7_13"},{"key":"634_CR30","doi-asserted-by":"crossref","unstructured":"Udommanetanakit K, Rakthanmanon T, Waiyamai K (2007) E-stream: evolution-based technique for stream clustering. In: International conference on advanced data mining and applications, Springer, pp 605\u2013615","DOI":"10.1007\/978-3-540-73871-8_58"},{"key":"634_CR31","doi-asserted-by":"crossref","unstructured":"Wu J, Xiong H, Chen J (2009) Adapting the right measures for k-means clustering. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 877\u2013886","DOI":"10.1145\/1557019.1557115"},{"issue":"5","key":"634_CR32","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1016\/j.engappai.2012.04.005","volume":"25","author":"L Zhao","year":"2012","unstructured":"Zhao L, Wang L, Dw C (2012) Hoeffding bound based evolutionary algorithm for symbolic regression. Eng Appl Artif Intell 25(5):945\u2013957","journal-title":"Eng Appl Artif Intell"},{"issue":"5\u20136","key":"634_CR33","doi-asserted-by":"publisher","first-page":"790","DOI":"10.1016\/j.neunet.2005.06.008","volume":"18","author":"S Zhong","year":"2005","unstructured":"Zhong S (2005) Efficient streaming text clustering. Neural Netw 18(5\u20136):790\u2013798","journal-title":"Neural Netw"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00634-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-021-00634-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00634-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,4,29]],"date-time":"2022-04-29T17:18:00Z","timestamp":1651252680000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-021-00634-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,7]]},"references-count":33,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,4]]}},"alternative-id":["634"],"URL":"https:\/\/doi.org\/10.1007\/s40747-021-00634-0","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,7]]},"assertion":[{"value":"11 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 December 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 January 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}