{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T19:28:49Z","timestamp":1757618929349,"version":"3.44.0"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2025,7,17]],"date-time":"2025-07-17T00:00:00Z","timestamp":1752710400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,17]],"date-time":"2025-07-17T00:00:00Z","timestamp":1752710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100007195","name":"Universit\u00e0 degli Studi di Napoli Federico II","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007195","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Evol. Intel."],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Due to the inadequacy of standard clustering approaches for handling extensive data, considerable research has recently focused on clustering large and extremely large datasets. Specifically, certain variations of the famous fuzzy C-Means algorithm have been put forth, testing techniques for segmenting datasets and aggregating the intermediate clustered results. Among them, the Fuzzy C-Means online technique is one of the most used for clustering large amounts of data. It splits the dataset into equal-sized subsets, or chunks, and assigns a weight to each chunk depending on the membership degrees per cluster. This study introduces a novel variation of the Online Fuzzy C-Means (OFCM) algorithm designed to boost its performance. Our proposed method integrates a cluster compactness measure into the weight attribution process, quantified by the fuzzy entropy of each cluster. Comparative experiments, conducted across diverse classification datasets of varying scales, demonstrate that the proposed algorithm significantly improves the accuracy of clustering results when compared to the standard OFCM. Crucially, this enhancement is achieved without increasing the computational complexity of the algorithm. Furthermore, our approach yields performance comparable to that of heuristic Fuzzy C-Means algorithms, while offering the distinct advantage of shorter execution times. Future research will focus on exploring feature selection and reduction techniques to adapt the proposed algorithm for effective application to massive datasets characterized by an exceptionally high number of features.<\/jats:p>","DOI":"10.1007\/s12065-025-01076-0","type":"journal-article","created":{"date-parts":[[2025,7,17]],"date-time":"2025-07-17T19:13:23Z","timestamp":1752779603000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A novel fuzzy-entropy based online fuzzy C-Means clustering algorithm for massive data"],"prefix":"10.1007","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5690-5384","authenticated-orcid":false,"given":"Barbara","family":"Cardone","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ferdinando","family":"Di Martino","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,7,17]]},"reference":[{"issue":"4","key":"1076_CR1","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1109\/TBDATA.2016.2622288","volume":"2","author":"N Bharill","year":"2016","unstructured":"Bharill N, Tiwari A (2016) Malviya A Fuzzy based scalable clustering algorithms for handling big data using apache spark. IEEE Trans Big Data. 2(4):339\u2013352. https:\/\/doi.org\/10.1109\/TBDATA.2016.2622288","journal-title":"IEEE Trans Big Data."},{"key":"1076_CR2","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1007\/978-1-4757-0450-1","volume-title":"Pattern Recognition with Fuzzy Objective Function Algorithms","author":"JC Bezdek","year":"1981","unstructured":"Bezdek JC (1981) Pattern Recognition with Fuzzy Objective Function Algorithms, vol 256. Plenum Press, New York, pp 154\u2013196. https:\/\/doi.org\/10.1007\/978-1-4757-0450-1"},{"issue":"4","key":"1076_CR3","doi-asserted-by":"publisher","first-page":"554","DOI":"10.3390\/electronics904","volume":"9","author":"B Cardone","year":"2020","unstructured":"Cardone B, Di Martino F (2020) A novel fuzzy entropy-based method to improve the performance of the fuzzy c-means algorithm. Electronics 9(4):554. https:\/\/doi.org\/10.3390\/electronics904","journal-title":"Electronics"},{"issue":"9","key":"1076_CR4","doi-asserted-by":"publisher","first-page":"2616","DOI":"10.1109\/TCYB.2016.2627686","volume":"47","author":"X Chang","year":"2016","unstructured":"Chang X, Wang Q, Liu Y, Wang Y (2016) Sparse regularization in fuzzy c -means for high-dimensional data clustering. IEEE Trans Cybern 47(9):2616\u20133262. https:\/\/doi.org\/10.1109\/TCYB.2016.2627686","journal-title":"IEEE Trans Cybern"},{"key":"1076_CR5","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1016\/S0019-9958(72)90199-4","volume":"20","author":"A De Luca","year":"1972","unstructured":"De Luca A, Termini S (1972) A definition of non-probabilistic entropy in the setting of fuzzy sets theory. Inf Control 20:301\u2013312","journal-title":"Inf Control"},{"key":"1076_CR6","volume-title":"Advances in Fuzzy Set Theory and Applications","author":"A De Luca","year":"1979","unstructured":"De Luca A (1979) Entropy and energy measures of fuzzy sets. In: Gupta MM, Ragade RK, Yager RR (eds) Advances in Fuzzy Set Theory and Applications. The Netherlands, North-Holland: Amsterdam"},{"key":"1076_CR7","doi-asserted-by":"publisher","first-page":"198","DOI":"10.1016\/j.ins.2018.02.029","volume":"441","author":"F Di Martino","year":"2018","unstructured":"Di Martino F, Sessa S (2018) Extended Fuzzy C-Means hotspot detection method for large and very large event datasets. Inf Sci 441:198\u2013215. https:\/\/doi.org\/10.1016\/j.ins.2018.02.029","journal-title":"Inf Sci"},{"issue":"2","key":"1076_CR8","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1109\/TFUZZ.2003.809902","volume":"11","author":"S Eschrich","year":"2003","unstructured":"Eschrich S, Ke J, Hall L, Goldgof D (2003) Fast accurate fuzzy clustering through data reduction. IEEE Trans Fuzzy Syst 11(2):262\u2013269. https:\/\/doi.org\/10.1109\/TFUZZ.2003.809902","journal-title":"IEEE Trans Fuzzy Syst"},{"key":"1076_CR9","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1016\/j.eswa.2023.120377","volume":"227","author":"SE Hashemi","year":"2023","unstructured":"Hashemi SE, Fatemeh G-J, Mostafa H-K (2023) A fuzzy C-means algorithm for optimizing data clustering. Expert Syst Appl 227:14. https:\/\/doi.org\/10.1016\/j.eswa.2023.120377","journal-title":"Expert Syst Appl"},{"key":"1076_CR10","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1016\/j.csda.2006.02.008","volume":"51","author":"R Hathaway","year":"2006","unstructured":"Hathaway R, Bezdek J (2006) Extending fuzzy and probabilistic clustering to very large data sets. Comput Statist Data Anal 51:215\u2013234. https:\/\/doi.org\/10.1016\/j.csda.2006.02.008","journal-title":"Comput Statist Data Anal"},{"key":"1076_CR11","doi-asserted-by":"publisher","unstructured":"Havens T, Chitta R, Jain A, Jin R (2011) Speedup of fuzzy and possibilistic c-means for large-scale clustering. in Proc. IEEE Int. Conf. Fuzzy Systems, Taipei, Taiwan. pp 463\u2013470. https:\/\/doi.org\/10.1109\/FUZZY.2011.6007618","DOI":"10.1109\/FUZZY.2011.6007618"},{"issue":"6","key":"1076_CR12","doi-asserted-by":"publisher","first-page":"1130","DOI":"10.1109\/TFUZZ.2012.2201485","volume":"20","author":"TC Havens","year":"2012","unstructured":"Havens TC, Bezdek JC, Leckie CR, Hall LO, Palaniswami M (2012) Fuzzy C-means algorithms for very large data. IEEE Trans Fuzzy Syst 20(6):1130\u20131146. https:\/\/doi.org\/10.1109\/TFUZZ.2012.2201485","journal-title":"IEEE Trans Fuzzy Syst"},{"key":"1076_CR13","doi-asserted-by":"publisher","unstructured":"Hore P, Hall L, Goldgof D (2007) Single pass fuzzy C-means. In: Proceedings IEEE Internat. Conf. on Fuzzy Systems, London. pp 1\u20137. https:\/\/doi.org\/10.1109\/FUZZY.2007.4295372","DOI":"10.1109\/FUZZY.2007.4295372"},{"key":"1076_CR14","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1007\/s11265-008-0243-1","volume":"54","author":"P Hore","year":"2009","unstructured":"Hore P, Hall LO, Goldgof DB, Gu Y, Maudsley AA, Darkazanli A (2009) A scalable framework for segmenting magnetic resonance images. J Sign Process Syst Sign Image Video Technol 54:183\u2013203. https:\/\/doi.org\/10.1007\/s11265-008-0243-1","journal-title":"J Sign Process Syst Sign Image Video Technol"},{"issue":"6","key":"1076_CR15","doi-asserted-by":"publisher","first-page":"705","DOI":"10.1109\/TFUZZ.2002.805901","volume":"10","author":"U Kaymak","year":"2002","unstructured":"Kaymak U, Setnes M (2002) Fuzzy clustering with volume prototype and adaptive cluster merging. IEEE Trans Fuzzy Syst 10(6):705\u2013712. https:\/\/doi.org\/10.1109\/TFUZZ.2002.805901","journal-title":"IEEE Trans Fuzzy Syst"},{"issue":"12","key":"1076_CR16","doi-asserted-by":"publisher","first-page":"2719","DOI":"10.1049\/iet-ipr.2019.0899","volume":"14","author":"O Kulkarni","year":"2020","unstructured":"Kulkarni O, Jena S, Sankar VR (2020) MapReduce framework based big data clustering using fractional integrated sparse fuzzy C means algorithm. IET Image Process J 14(12):2719\u20132727. https:\/\/doi.org\/10.1049\/iet-ipr.2019.0899","journal-title":"IET Image Process J"},{"issue":"4","key":"1076_CR17","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1145\/2094114.2094118","volume":"40","author":"Y-H Lee","year":"2011","unstructured":"Lee Y-H, Lee Y-J, Choi H, Chung YD, Moon B (2011) Parallel data processing with MapReduce: a survey. SIGMOD Rec. 40(4):11\u201320. https:\/\/doi.org\/10.1145\/2094114.2094118","journal-title":"SIGMOD Rec."},{"key":"1076_CR18","doi-asserted-by":"publisher","first-page":"8566253","DOI":"10.1155\/2022\/8566253","volume":"2022","author":"Y Liu","year":"2022","unstructured":"Liu Y, Zhang Y, Chao H (2022) Incremental fuzzy clustering based on feature reduction. J Electr Comput Eng 2022:8566253. https:\/\/doi.org\/10.1155\/2022\/8566253","journal-title":"J Electr Comput Eng"},{"key":"1076_CR19","doi-asserted-by":"publisher","unstructured":"Maitrey S, Jha CK (2015) Handling big data efficiently by using map reduce technique. In: 2015 IEEE International Conference on Computational Intelligence & Communication Technology, Ghaziabad, India. pp 703\u2013708. https:\/\/doi.org\/10.1109\/CICT.2015.140","DOI":"10.1109\/CICT.2015.140"},{"key":"1076_CR20","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1016\/j.advengsoft.2016.01.008","volume":"95","author":"S Mirjalili","year":"2016","unstructured":"Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51\u201367. https:\/\/doi.org\/10.1016\/j.advengsoft.2016.01.008","journal-title":"Adv Eng Softw"},{"issue":"2","key":"1076_CR21","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1504\/IJBIDM.2021.117110","volume":"19","author":"KM Padmapriya","year":"2021","unstructured":"Padmapriya KM, Anandhi B, Vijayakumar M (2021) MapReduce fuzzy C-means ensemble clustering with gentle AdaBoost for big data analytics. Int J Bus Intell Data Min 19(2):170\u2013188. https:\/\/doi.org\/10.1504\/IJBIDM.2021.117110","journal-title":"Int J Bus Intell Data Min"},{"key":"1076_CR22","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1186\/s40537-021-00450-w","volume":"8","author":"SM Razavi","year":"2021","unstructured":"Razavi SM, Kahani M, Paydar S (2021) Big data fuzzy C-means algorithm based on bee colony optimization using an Apache Hbase. J Big Data 8:64. https:\/\/doi.org\/10.1186\/s40537-021-00450-w","journal-title":"J Big Data"},{"issue":"2","key":"1076_CR23","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1109\/TFUZZ.2017.2692203","volume":"26","author":"MS Yang","year":"2018","unstructured":"Yang MS, Nataliani Y (2018) A feature-reduction fuzzy clustering algorithm based on feature-weighted entropy. IEEE Trans Fuzzy Syst 26(2):817\u2013835. https:\/\/doi.org\/10.1109\/TFUZZ.2017.2692203","journal-title":"IEEE Trans Fuzzy Syst"}],"container-title":["Evolutionary Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12065-025-01076-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s12065-025-01076-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12065-025-01076-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,7]],"date-time":"2025-09-07T13:45:06Z","timestamp":1757252706000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s12065-025-01076-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,17]]},"references-count":23,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["1076"],"URL":"https:\/\/doi.org\/10.1007\/s12065-025-01076-0","relation":{},"ISSN":["1864-5909","1864-5917"],"issn-type":[{"type":"print","value":"1864-5909"},{"type":"electronic","value":"1864-5917"}],"subject":[],"published":{"date-parts":[[2025,7,17]]},"assertion":[{"value":"23 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 June 2025","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 July 2025","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2025","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare\u00a0that\u00a0they have no financial\u00a0or\u00a0non-financial interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval and consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}],"article-number":"86"}}