{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T10:20:42Z","timestamp":1777890042731,"version":"3.51.4"},"reference-count":42,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2022,7,20]],"date-time":"2022-07-20T00:00:00Z","timestamp":1658275200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Web Intelligence"],"published-print":{"date-parts":[[2022,10,5]]},"abstract":"<jats:p>The process of retrieving essential information from the dataset is a significant data mining approach, which is specifically termed as data clustering. However, nature-inspired optimizations are designed in recent decades to solve optimization problems, particularly for data clustering complexities. However, the existing methods are not feasible to process with a large amount of data, as the execution time taken by the traditional approaches is larger. Hence, an efficient and optimal data clustering scheme is designed using the devised Fractional Sail Fish-Sparse Fuzzy C-Means\u00a0+ Particle Whale optimization (FSF-Sparse FCM\u00a0+ PWO) based MapReduce Framework (MRF) to process high dimensional data. The proposed FSF-Sparse FCM is designed by the integration of Sail Fish Optimization (SFO) with fractional concept and Sparse FCM. The proposed MRF poses two functions, such as the mapper function and reducer function to perform the process of data clustering. Moreover, the proposed FSF-Sparse FCM is employed in the mapper phase to compute the cluster centroids, and thereby the intermediate data is generated. 
The intermediate data is tuned in the reducer phase using Particle Whale Optimization (PWO), which is the integration of Particle Swarm Optimization (PSO) and Whale optimization algorithm (WOA). Accordingly, the optimal cluster centroid is computed at the reducer phase using the objective function based on DB-Index. The proposed FSF-Sparse FCM\u00a0+ PWO obtained the highest accuracy of 0.903 and lowest DB-Index of 39.07.<\/jats:p>","DOI":"10.3233\/web-210490","type":"journal-article","created":{"date-parts":[[2022,7,22]],"date-time":"2022-07-22T11:42:43Z","timestamp":1658490163000},"page":"153-171","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Big data clustering using fractional sail fish-sparse fuzzy C-means and particle whale optimization based MapReduce framework"],"prefix":"10.1177","volume":"20","author":[{"given":"Omkaresh","family":"Kulkarni","sequence":"first","affiliation":[{"name":"MIT World Peace University, Kothrud, Pune 411038, India"}]},{"given":"Ravi Sankar","family":"Vadali","sequence":"additional","affiliation":[{"name":"GITAM School of Technology, GITAM Deemed to be University, GITAM University, Rudraram, Telangana\u00a0502329, India"}]}],"member":"179","published-online":{"date-parts":[[2022,7,20]]},"reference":[{"key":"ref001","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2937021"},{"key":"ref002","doi-asserted-by":"publisher","DOI":"10.1049\/trit.2019.0048"},{"key":"ref003","doi-asserted-by":"publisher","DOI":"10.1109\/CONFLUENCE.2014.6949256"},{"key":"ref004","doi-asserted-by":"crossref","unstructured":"A.B.\u00a0Ayed, M.B.\u00a0Halima and A.M.\u00a0Alimi, Survey on clustering methods: Towards fuzzy clustering for big data, in: IEEE 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR), 2014, 
pp.\u00a0331\u2013336.","DOI":"10.1109\/SOCPAR.2014.7008028"},{"key":"ref005","doi-asserted-by":"crossref","unstructured":"P.R.\u00a0Bhaladhare and D.C.\u00a0Jinwala, A clustering approach for the l-diversity model in privacy preserving data mining using fractional calculus-bacterial foraging optimization algorithm, Advances in Computer Engineering 2014 (2014) Article ID 396529.","DOI":"10.1155\/2014\/396529"},{"issue":"1","key":"ref006","first-page":"1","volume":"1","author":"Brajula W.","year":"2018","journal-title":"Journal of Networking and Communication Systems"},{"key":"ref007","unstructured":"X.\u00a0Cai, F.\u00a0Nie and H.\u00a0Huang, Multi-view k-means clustering on big data, in: Twenty-Third International Joint Conference on Artificial Intelligence, 2013."},{"key":"ref008","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2016.2627686"},{"key":"ref009","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2016.06.080"},{"key":"ref010","unstructured":"M.\u00a0Ester, H.P.\u00a0Kriegel, J.\u00a0Sander and X.\u00a0Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Kdd, Vol.\u00a096, 1996, pp.\u00a0226\u2013231."},{"key":"ref011","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2019.01.006"},{"key":"ref012","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-017-1205-9"},{"key":"ref013","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-017-1571-3"},{"key":"ref014","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-13-8715-9_13"},{"key":"ref015","doi-asserted-by":"crossref","unstructured":"P.P.\u00a0Jadhav and S.D.\u00a0Joshi, Atom search sunflower optimization for trust-based routing in Internet of things, International Journal of Numerical Modelling: Electronic Networks, Devices and Fields 34(3) (2021), e2845.","DOI":"10.1002\/jnm.2845"},{"issue":"11","key":"ref016","first-page":"997","volume":"11","author":"Kulkarni 
O.","year":"2016","journal-title":"Int. Rev. Comput. Softw"},{"key":"ref017","doi-asserted-by":"publisher","DOI":"10.1515\/jisys-2018-0117"},{"key":"ref018","doi-asserted-by":"publisher","DOI":"10.1109\/TETC.2016.2517930"},{"key":"ref019","unstructured":"Localization Data for Person Activity Data Set, https:\/\/archive.ics.uci.edu\/ml\/datasets\/Localization+Data+for+Person+Activity, accessed on June 2020."},{"key":"ref020","doi-asserted-by":"crossref","unstructured":"W.\u00a0Lu, Improved K-means clustering algorithm for big data mining under hadoop parallel framework, Journal of Grid Computing (2020).","DOI":"10.1007\/s10723-019-09503-0"},{"key":"ref021","doi-asserted-by":"crossref","unstructured":"A.\u00a0Marino and P.\u00a0Pariso, E-government and its impact on national economic development: A case study concerning southern Italy, in: ACM International Conference Proceeding Series, 2019, pp.\u00a01\u20134.","DOI":"10.1145\/3340017.3342242"},{"key":"ref022","doi-asserted-by":"publisher","DOI":"10.1016\/j.advengsoft.2016.01.008"},{"key":"ref023","doi-asserted-by":"crossref","unstructured":"E.\u00a0Mooi and M.\u00a0Sarstedt, Introduction to Market Research: A Concise Guide to Market Research, 2011.","DOI":"10.1007\/978-3-642-12541-6"},{"key":"ref024","doi-asserted-by":"crossref","unstructured":"V.\u00a0Ravuri and S.\u00a0Vasundra, Moth-flame optimization-bat optimization: Map-reduce framework for big data clustering using the moth-flame bat optimization and sparse fuzzy C-means, Big Data (2020).","DOI":"10.1089\/big.2019.0125"},{"key":"ref025","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8655(97)00122-0"},{"key":"ref026","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2019.01.001"},{"issue":"1","key":"ref027","first-page":"26","volume":"1","author":"Shareef S.K.M.","year":"2018","journal-title":"Journal of Computational Mechanics, Power System and 
Control"},{"key":"ref028","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.11.001"},{"key":"ref029","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2018.09.002"},{"key":"ref030","unstructured":"Skin Segmentation Data Set, https:\/\/archive.ics.uci.edu\/ml\/datasets\/Skin+Segmentation, accessed on June 2020."},{"key":"ref031","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.12.093"},{"key":"ref032","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-020-00200-0"},{"issue":"1","key":"ref033","first-page":"27","volume":"1","author":"Veeraiah N.","year":"2018","journal-title":"Multimedia Research"},{"key":"ref034","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2015.2503743"},{"key":"ref035","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-016-2474-6"},{"key":"ref036","doi-asserted-by":"publisher","DOI":"10.1007\/s12539-018-0294-3"},{"key":"ref037","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2016.05.010"},{"key":"ref038","doi-asserted-by":"crossref","unstructured":"Q.\u00a0Yu and Z.\u00a0Ding, An improved fuzzy C-means algorithm based on MapReduce, in: Proceedings of 8th International Conference on Biomedical Engineering and Informatics (BMEI), Shenyang, China, 2015.","DOI":"10.1109\/BMEI.2015.7401581"},{"key":"ref039","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2016.2598679"},{"key":"ref040","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2607757"},{"key":"ref041","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2016.2516952"},{"key":"ref042","doi-asserted-by":"crossref","unstructured":"H.\u00a0Zhu, Y.\u00a0Guo, M.\u00a0Niu, G.\u00a0Yang and L.\u00a0Jiao, Distributed SAR image change detection based on spark, in: Proceedings of IEEE International Conference on Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 2015.","DOI":"10.1109\/IGARSS.2015.7326739"}],"container-title":["Web 
Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/WEB-210490","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/WEB-210490","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/WEB-210490","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T05:27:31Z","timestamp":1777613251000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/WEB-210490"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,20]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,10,5]]}},"alternative-id":["10.3233\/WEB-210490"],"URL":"https:\/\/doi.org\/10.3233\/web-210490","relation":{},"ISSN":["2405-6456","2405-6464"],"issn-type":[{"value":"2405-6456","type":"print"},{"value":"2405-6464","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,20]]}}}