{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:29:36Z","timestamp":1777854576896,"version":"3.51.4"},"reference-count":34,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T00:00:00Z","timestamp":1682985600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"},{"start":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T00:00:00Z","timestamp":1682985600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:p>\n                    Text clustering has been an overlooked field of text mining that requires more attention. Several applications require automatic text organisation which relies on an information retrieval system based on organised search results. Spherical\n                    <jats:italic>k<\/jats:italic>\n                    -means is a successful adaptation of the classic\n                    <jats:italic>k<\/jats:italic>\n                    -means algorithm for text clustering. However, conventional methods to accelerate\n                    <jats:italic>k<\/jats:italic>\n                    -means may not apply to spherical\n                    <jats:italic>k<\/jats:italic>\n                    -means due to the different nature of text document data. The proposed work introduces an iterative feature filtering technique that reduces the data size during the process of clustering which further produces more feature-relevant clusters in less time compared to classic spherical\n                    <jats:italic>k<\/jats:italic>\n                    -means. The novelty of the proposed method is that feature assessment is distinct from the objective function of clustering and derived from the cluster structure. Experimental results show that the proposed scheme achieves computation speed without sacrificing cluster quality over popular text corpora. The demonstrated results are satisfactory and outperform compared to recent works in this domain.\n                  <\/jats:p>","DOI":"10.1177\/01655515231165230","type":"journal-article","created":{"date-parts":[[2023,5,3]],"date-time":"2023-05-03T01:38:56Z","timestamp":1683077936000},"page":"1204-1216","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["SKIFF: Spherical\n                    <i>K<\/i>\n                    -means with iterative feature filtering for text document clustering"],"prefix":"10.1177","volume":"51","author":[{"given":"Iti","family":"Sharma","sequence":"first","affiliation":[{"name":"Government Polytechnic College Kota, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Abhay","family":"Sharma","sequence":"additional","affiliation":[{"name":"Manipal University Jaipur, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rekha","family":"Chaturvedi","sequence":"additional","affiliation":[{"name":"Manipal University Jaipur, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jitendra","family":"Rajpurohit","sequence":"additional","affiliation":[{"name":"Manipal University Jaipur, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5113-0639","authenticated-orcid":false,"given":"Manoj","family":"Kumar","sequence":"additional","affiliation":[{"name":"Faculty of Engineering and Information Sciences, University of Wollongong in Dubai (UOWD), Dubai Knowledge Park, Dubai, UAEMEU Research Unit, Faculty of Information Technology, Middle East University, Amman, Jordan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2023,5,2]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13042-010-0001-0"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-011-9163-y"},{"issue":"1","key":"e_1_3_2_6_2","first-page":"1","article-title":"Spherical k-means clustering","volume":"50","author":"Hornik K","year":"2012","unstructured":"Hornik K, Feinerer I, Kober M, et al. Spherical k-means clustering. J Stat Software 2012; 50(1): 1\u201322.","journal-title":"J Stat Software"},{"key":"e_1_3_2_7_2","first-page":"419","article-title":"Text classification using string kernels","volume":"2","author":"Lodhi H","year":"2002","unstructured":"Lodhi H, Saunders C, Shawe-Taylor J, et al. Text classification using string kernels. J Mach Learn Res 2002; 2: 419\u2013444.","journal-title":"J Mach Learn Res"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2009.09.023"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.114.2.211"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","unstructured":"Blei DM Lafferty JD. Topic models. In: Text mining. London: Chapman and Hall\/CRC 2009 pp. 101\u2013124 http:\/\/www.cs.columbia.edu\/~blei\/papers\/BleiLafferty2009.pdf","DOI":"10.1201\/9781420059458.ch4"},{"key":"e_1_3_2_11_2","first-page":"234","volume-title":"Introduction to information retrieval","author":"Sch\u00fctze H","year":"2008","unstructured":"Sch\u00fctze H, Manning CD, Raghavan P. Introduction to information retrieval, vol. 39. Cambridge: Cambridge University Press, 2008, pp. 234\u2013265."},{"key":"e_1_3_2_12_2","first-page":"768","article-title":"Cluster analysis of multivariate data: efficiency versus interpretability of classifications","volume":"21","author":"Forgy E","year":"1965","unstructured":"Forgy E. Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics 1965; 21: 768\u2013780.","journal-title":"Biometrics"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007612920971"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2006.870586"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3040506"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2004.843074"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Dhillon IS Fan J Guan Y. Efficient clustering of very large document collections. In: Data mining for scientific and engineering applications. Boston MA: Springer 2001 pp. 357\u2013381 https:\/\/www.cs.utexas.edu\/users\/inderjit\/public_papers\/effclus.pdf","DOI":"10.1007\/978-1-4615-1733-7_20"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.113288"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2004.824416"},{"key":"e_1_3_2_20_2","volume-title":"k-means++: the advantages of careful seeding","author":"Arthur D","year":"2006","unstructured":"Arthur D, Vassilvitskii S. k-means++: the advantages of careful seeding. Stanford, CA: Stanford University Press, 2006."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.simpat.2015.03.007"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.190740"},{"key":"e_1_3_2_23_2","first-page":"327","volume-title":"2018 second international conference on electronics, communication and aerospace technology (ICECA)","author":"Sharma I","unstructured":"Sharma I, Sharma H. Recognizing patterns in text data through effective initialization of spherical K-means. In: 2018 second international conference on electronics, communication and aerospace technology (ICECA), Coimbatore, India, 29\u201331 March 2018, pp. 327\u2013331. New York: IEEE."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-06014-6"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10878-020-00569-1"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.protcy.2012.10.052"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","unstructured":"Roffo G Melzi S Castellani U et al. Infinite latent feature selection: a probabilistic latent graph-based ranking approach. In: Proceedings of the IEEE international conference on computer vision 2017 pp. 1398\u20131406 https:\/\/openaccess.thecvf.com\/content_ICCV_2017\/papers\/Roffo_Infinite_Latent_Feature_ICCV_2017_paper.pdf","DOI":"10.1109\/ICCV.2017.156"},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Hu Y Milios EE Blustein J. Interactive feature selection for document clustering. In: Proceedings of the 2011 ACM symposium on applied computing 2011 pp. 1143\u20131150 https:\/\/web.cs.dal.ca\/~eem\/cvWeb\/pubs\/2011-Yeming-SAC.pdf","DOI":"10.1145\/1982185.1982436"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107517"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105417"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.114563"},{"key":"e_1_3_2_32_2","first-page":"513","volume-title":"2020 international conference on computer engineering and application (ICCEA)","author":"Liu W","unstructured":"Liu W, Liu M, Huang M. Study on Chinese text clustering algorithm based on K-mean and evaluation method on effect of clustering for software-intensive system. In: 2020 international conference on computer engineering and application (ICCEA), Guangzhou, China, 18\u201320 March 2020, pp. 513\u2013519. New York: IEEE."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-021-06021-7"},{"key":"e_1_3_2_34_2","unstructured":"https:\/\/archive.ics.uci.edu\/ml\/datasets\/bag+of+words"},{"key":"e_1_3_2_35_2","unstructured":"http:\/\/www.cad.zju.edu.cn\/home\/dengcai\/Data\/TextData.html"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515231165230","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/01655515231165230","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515231165230","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:10:05Z","timestamp":1777504205000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/01655515231165230"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,2]]},"references-count":34,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["10.1177\/01655515231165230"],"URL":"https:\/\/doi.org\/10.1177\/01655515231165230","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,2]]}}}