{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:53:59Z","timestamp":1777704839277,"version":"3.51.4"},"reference-count":50,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2020,6,6]],"date-time":"2020-06-06T00:00:00Z","timestamp":1591401600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2020,8,31]]},"abstract":"<jats:p>Overlapping clustering algorithms have been shown to be effective for clustering documents. However, current overlapping document clustering algorithms produce a large number of clusters, which makes them of little use to the user. Therefore, in this paper, we propose a k-means based method for overlapping document clustering that allows the user to specify the number of groups to be built. 
Our experiments with different corpora show that our proposal allows obtaining better results in terms of FBcubed than other recent works for overlapping document clustering reported in the literature.<\/jats:p>","DOI":"10.3233\/jifs-179878","type":"journal-article","created":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T12:49:32Z","timestamp":1591706972000},"page":"2127-2135","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["K-means based method for overlapping document clustering"],"prefix":"10.1177","volume":"39","author":[{"given":"Beatriz","family":"Beltr\u00e1n","sequence":"first","affiliation":[{"name":"Language &amp; Knowledge Engineering Lab, Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, Puebla, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Darnes","family":"Vilari\u00f1o","sequence":"additional","affiliation":[{"name":"Language &amp; Knowledge Engineering Lab, Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, Puebla, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jos\u00e9 Fco.","family":"Mart\u00ednez-Trinidad","sequence":"additional","affiliation":[{"name":"Computer Science, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Puebla, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"J.A.","family":"Carrasco-Ochoa","sequence":"additional","affiliation":[{"name":"Computer Science, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Puebla, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Pinto","sequence":"additional","affiliation":[{"name":"Language &amp; Knowledge Engineering Lab, Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, Puebla, 
Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,6,6]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"AbualigahL. KhaderA.T. and HanandehE. A Novel Weighting Scheme Applied to Improve the Text Document Clustering Techniques (2018) 305\u2013320. 01.","DOI":"10.1007\/978-3-319-66984-7_18"},{"key":"e_1_3_2_3_2","unstructured":"AllahyariM. PouriyehS.A. AssefiM. SafaeiS. TrippeE.D. GutierrezJ.B. and KochutK. A brief survey of text mining: Classification clustering and extraction techniques. CoRR abs\/1707.02919 (2017)."},{"key":"e_1_3_2_4_2","unstructured":"AlonsoA.G. Su\u00e1rezA.P. and Medina-PagolaJ.E. ACONS: A new algorithm for clustering documents In Progress in Pattern Recognition Image Analysis and Applications 12th Iberoamericann Congress on Pattern Recognition CIARP 2007 Valparaiso Chile November 13-16 2007 Proceedings pages (2007) 664\u2013673."},{"issue":"2","key":"e_1_3_2_5_2","article-title":"Semclustdml: algoritmo para agrupar art\u00edculos cient\u00edficos basado en la informaci\u00f3n brindada por las referencias bibliogr\u00e1ficas","volume":"11","author":"Amador L.","year":"2017","unstructured":"AmadorL., Garc\u00edaM., L\u00edoD.G. and GuevaraD.M., Semclustdml: algoritmo para agrupar art\u00edculos cient\u00edficos basado en la informaci\u00f3n brindada por las referencias bibliogr\u00e1ficas, Revista Cubana de Ciencias Inform\u00e1ticas11(2) (2017).","journal-title":"Revista Cubana de Ciencias Inform\u00e1ticas"},{"key":"e_1_3_2_6_2","first-page":"93","article-title":"New Similarity Function for Scientific Articles Clustering based on the Bibliographic References","volume":"22","author":"Penichet L.A.","year":"2018","unstructured":"PenichetL.A., GuevaraD.M. 
and LorenzoM.M.G., New Similarity Function for Scientific Articles Clustering based on the Bibliographic References, Computaci\u00f3ny Sistemas22 (2018), 93\u2013102, 03.","journal-title":"Computaci\u00f3ny Sistemas"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"AmatoG. and SavinoP. Approximate Similarity Search in Metric Spaces Using Inverted Files In Proceedings of the 3rd International Conference on Scalable Information Systems pages 28:1\u201328:10 ICST Brussels Belgium Belgium (2008). ICST(Institute for Computer Sciences Social-Informatics and Telecommunications Engineering).","DOI":"10.4108\/ICST.INFOSCALE2008.3486"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-008-9066-8"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-014-1416-y"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","unstructured":"Aroche-VillarruelA.A. Carrasco-OchoaJ.A. Mart\u00ednez-TrinidadJ.F. Arturo Olvera-L\u00f3pezJ. and P\u00e9rez-Su\u00e1rezA. Study of overlapping clustering algorithms based on kmeans through fbcubed metric In J.F. Mart\u00ednez-Trinidad J.A. Carrasco-Ochoa J.A. Olvera-Lopez J. Salas-Rodr\u00edguez and C.Y. Suen editors Pattern Recognition pages 112\u2013121 Cham (2014). Springer International Publishing.","DOI":"10.1007\/978-3-319-07491-7_12"},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","unstructured":"AslamJ. PelekhovK. and RusD. Static and dynamic information organization with star clusters In Proceedings of the Seventh International Conference on Information and Knowledge Management CIKM \u201998 pages 208\u2013217 New York NY USA 1998. ACM.","DOI":"10.1145\/288627.288659"},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","unstructured":"BaadelS. ThabtahF. and LuJ. Overlapping clustering: A review. In 2016 SAI Computing Conference (SAI) (2016) pp. 
233\u2013237.","DOI":"10.1109\/SAI.2016.7555988"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2013.12.019"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-018-6005-6"},{"key":"e_1_3_2_15_2","doi-asserted-by":"crossref","unstructured":"CleuziouG. Two variants of the okm for overlapping clustering In F. Guillet G. Ritschard D.A. Zighed and H. Briand editors EGC (best of volume) volume 292 of Studies in Computational Intelligence pages 149\u2013166. Springer (2010).","DOI":"10.1007\/978-3-642-00580-0_9"},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"CoavouxM. ElsaharH. and Gall\u00e9M. Unsupervised aspect-basedmulti-document abstractive summarization. In Proceedings of the 2nd Workshop on New Frontiers in Summarization pages 42\u201347 Hong Kong China November (2019). Association for Computational Linguistics.","DOI":"10.18653\/v1\/D19-5405"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"DeyL. RanjanK. VermaI. and NaskarA. A semantic overlapping clustering algorithm for analyzing short-texts In V. Flores F. Gomide A. Janusz C. Meneses D. Miao G. Peters D. Slezak G. Wang R. Weber and Y. Yao editors Rough Sets pages 470\u2013479 Cham (2016). Springer International Publishing.","DOI":"10.1007\/978-3-319-47160-0_43"},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","unstructured":"EdlaD.R. TripathiD. KuppiliV. and CherukuR. Survey on clustering techniques In 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT) pages 696\u2013703 April (2018).","DOI":"10.1109\/ICICCT.2018.8473039"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TETC.2014.2330519"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2015.05.009"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","unstructured":"GoelR. GuptaP. and YadavR.K. 
Improved Kharmonic means in wireless sensor networks In 2017 IEEE 15th Student Conference on Research and Development (SCOReD) (2017) pp. 275\u2013279.","DOI":"10.1109\/SCORED.2017.8305379"},{"key":"e_1_3_2_22_2","doi-asserted-by":"crossref","unstructured":"GuillaumeC. An extended version of the k-means method for overlapping clustering. 2008 19th International Conference on Pattern Recognition pages 1\u20134 (2008).","DOI":"10.1109\/ICPR.2008.4761079"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2015.04.009"},{"key":"e_1_3_2_24_2","first-page":"1509","article-title":"An improved Kmeans clustering algorithm based on an adaptive initial parameter estimation procedure for image segmentation","volume":"13","author":"Khan Z.","year":"2017","unstructured":"KhanZ., NiJ., FanX. and ShiP., An improved Kmeans clustering algorithm based on an adaptive initial parameter estimation procedure for image segmentation, International Journal of Innovative Computing Information and Control13, 1509\u20131526. 10 (2017)","journal-title":"International Journal of Innovative Computing Information and Control"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2016.09.025"},{"key":"e_1_3_2_26_2","doi-asserted-by":"crossref","unstructured":"LekhaJ. MaheshwaranJ. TharaniK. RamP.K. SuryaM.K. and ManikandanA. Efficient detection of spam messages using obf and cbf blocking techniques In 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI) (2019) pp. 1175\u20131179.","DOI":"10.1109\/ICOEI.2019.8862542"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TFUZZ.2016.2604009"},{"key":"e_1_3_2_28_2","first-page":"11","article-title":"A novel technique for region-based features similarity for content-based image retrieval","volume":"37","author":"Memon I.","year":"2017","unstructured":"MemonI., AliQ. 
and PirzadaN., A novel technique for region-based features similarity for content-based image retrieval, Mehran University Research Journal of Engineering & Technology37 (2017), 11.","journal-title":"Mehran University Research Journal of Engineering & Technology"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkv468"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12626-017-0002-5"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-014-0808-1"},{"key":"e_1_3_2_32_2","volume-title":"A New Overlapping Clustering Algorithm Based on Graph Theory","author":"P\u00e9rez-Su\u00e1rez A.","year":"2013","unstructured":"P\u00e9rez-Su\u00e1rezA., Mart\u00ednez-TrinidadJ.F., Carrasco-OchoaJ.A. and Medina-PagolaJ.E., A New Overlapping Clustering Algorithm Based on Graph Theory, In Advances in Artificial Intelligence, pages 61\u201372, Berlin, Heidelberg, (2013). Springer Berlin Heidelberg."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2013.03.022"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2013.04.025"},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"P\u00e9rez-Su\u00e1rezA. Mart\u00ednez-TrinidadJ.F. Carrasco-OchoaJ.A. and Medina-PagolaJ.E. A new graph-based algorithmfor clustering documents In Workshops Proceedings of the 8th IEEE International Conference on Data Mining (ICDM2008) December 15\u201319 (2008) Pisa Italy pages 710\u2013719 (2008).","DOI":"10.1109\/ICDMW.2008.69"},{"key":"e_1_3_2_36_2","unstructured":"P\u00e9rez-Su\u00e1rezA. Mart\u00ednez-TrinidadJ.F. Carrasco-OchoaJ.A. and Medina-PagolaJ.E. A New Incremental Algorithm for Overlapped Clustering In Progress in Pattern Recognition Image Analysis Computer Vision and Applications 14th Iberoamerican Conference on Pattern Recognition CIARP 2009 Guadalajara Jalisco Mexico November 15-18 2009. 
Proceedings pages 497\u2013504 (2009)."},{"key":"e_1_3_2_37_2","doi-asserted-by":"crossref","unstructured":"P\u00e9rez-Su\u00e1rezA. and Medina-PagolaJ.E. Aclustering algorithm based on generalized Stars In Petra Perner editor Machine Learning and Data Mining in Pattern Recognition pages 248\u2013262 Berlin Heidelberg (2007). Springer Berlin Heidelberg.","DOI":"10.1007\/978-3-540-73499-4_19"},{"key":"e_1_3_2_38_2","first-page":"1","article-title":"Mining text in news channels: A case study from facebook","volume":"1","author":"Salloum S.","year":"2018","unstructured":"SalloumS., Al-EmranM. and ShaalanK., Mining text in news channels: A case study from facebook, International Journal of Information Technology and Language Studies (IJITLS)1 (2018), 1\u20139, 08.","journal-title":"International Journal of Information Technology and Language Studies (IJITLS)"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.5120\/ijais12-450691"},{"issue":"2","key":"e_1_3_2_40_2","article-title":"Static and incremental overlapping clustering algorithms for large collections processing in GPU","volume":"42","author":"Gonz\u00e1lez Soler L.J.","year":"2018","unstructured":"Gonz\u00e1lez SolerL.J., Su\u00e1rezA.P. and Fern\u00e1ndez-JambrinaL., Static and incremental overlapping clustering algorithms for large collections processing in GPU, Informatica (Slovenia)42(2) (2018).","journal-title":"Informatica (Slovenia)"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2017.12.005"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-017-9462-8"},{"key":"e_1_3_2_43_2","doi-asserted-by":"crossref","unstructured":"TkaczynskiA. 
Segmentation Using Two-Step Cluster Analysis SpringerSingapore Singapore (2017).","DOI":"10.1007\/978-981-10-1835-0_8"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.2991\/ijcis.10.1.82"},{"key":"e_1_3_2_45_2","first-page":"2411","article-title":"Mulan: A Java Library for Multi-Label Learning","volume":"12","author":"Tsoumakas G.","year":"2011","unstructured":"TsoumakasG., Spyromitros-XioufisE., VilcekJ. and VlahavasI., Mulan: A Java Library for Multi-Label Learning, Journal of Machine Learning Research12 (2011), 2411\u20132414.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.2307\/3001968"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40745-015-0040-1"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.19"},{"key":"e_1_3_2_49_2","doi-asserted-by":"crossref","unstructured":"ZhangB. HsuM. and DayalU. KHarmonic Means -A Spatial Clustering Algorithm with Boosting. In Proceedings of the First International Workshop on Temporal Spatial and Spatio-Temporal Data Mining-Revised Papers TSDM \u201900 pages 31\u201345 London UK UK (2000). Springer-Verlag.","DOI":"10.1007\/3-540-45244-3_4"},{"key":"e_1_3_2_50_2","article-title":"K-Harmonic Means - A Data Clustering Algorithm","volume":"12","author":"Zhang B.","year":"1999","unstructured":"ZhangB., HsuM., DayalU. 
and DataM., K-Harmonic Means - A Data Clustering Algorithm, Hewlett Packard Research Laboratory Technical Report12 ( (1999).","journal-title":"Hewlett Packard Research Laboratory Technical Report"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11859-018-1357-3"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179878","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-179878","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179878","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:42:08Z","timestamp":1777455728000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-179878"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,6]]},"references-count":50,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,8,31]]}},"alternative-id":["10.3233\/JIFS-179878"],"URL":"https:\/\/doi.org\/10.3233\/jifs-179878","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,6]]}}}