{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T22:04:39Z","timestamp":1766268279299},"reference-count":75,"publisher":"Cambridge University Press (CUP)","issue":"5","license":[{"start":{"date-parts":[[2015,8,14]],"date-time":"2015-08-14T00:00:00Z","timestamp":1439510400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2016,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This article presents silhouette\u2013attraction (<jats:italic>Sil\u2013Att<\/jats:italic>), a simple and effective method for text clustering, which is based on two main concepts: the <jats:italic>silhouette coefficient<\/jats:italic> and the idea of <jats:italic>attraction<\/jats:italic>. The combination of both principles allows us to obtain a general technique that can be used either as a boosting method, which improves results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that <jats:italic>Sil\u2013Att<\/jats:italic> is able to obtain high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the considered corpora is indicative that it is a very robust method. This is a very interesting positive aspect of <jats:italic>Sil\u2013Att<\/jats:italic> with respect to the other algorithms used in the experiments, whose performances heavily depend on specific characteristics of the corpora being considered.<\/jats:p>","DOI":"10.1017\/s1351324915000273","type":"journal-article","created":{"date-parts":[[2015,8,14]],"date-time":"2015-08-14T04:41:59Z","timestamp":1439527319000},"page":"687-726","source":"Crossref","is-referenced-by-count":5,"title":["Silhouette + attraction: A simple and effective method for text clustering"],"prefix":"10.1017","volume":"22","author":[{"given":"MARCELO L.","family":"ERRECALDE","sequence":"first","affiliation":[]},{"given":"LETICIA C.","family":"CAGNINA","sequence":"additional","affiliation":[]},{"given":"PAOLO","family":"ROSSO","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2015,8,14]]},"reference":[{"key":"S1351324915000273_ref074","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000027785.44527.d6"},{"key":"S1351324915000273_ref069","first-page":"267","volume-title":"Proceedings of the 26th Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR 2003)","author":"Xu","year":"2003"},{"key":"S1351324915000273_ref071","first-page":"1057","volume-title":"The Proceedings of the 2001 Neural Information Processing Systems (NIPS) Conference","author":"Zha","year":"2001"},{"key":"S1351324915000273_ref072","first-page":"873","volume-title":"Proceedings of the 34th International ACM Conference on Research and Development in Information Retrieval (SIGIR 2011)","author":"Zhang","year":"2011"},{"key":"S1351324915000273_ref066","first-page":"1","volume-title":"The Collected Works of John W. Tukey VIII. 
Multiple Comparisons: 1948\u20131983","author":"Tukey","year":"1953"},{"key":"S1351324915000273_ref073","first-page":"718","volume-title":"Proceedings of the VLDB Endowment","author":"Zhou","year":"2009"},{"key":"S1351324915000273_ref065","doi-asserted-by":"publisher","DOI":"10.1145\/1552303.1552305"},{"key":"S1351324915000273_ref070","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557120"},{"key":"S1351324915000273_ref075","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-005-0361-3"},{"key":"S1351324915000273_ref035","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316801"},{"key":"S1351324915000273_ref034","doi-asserted-by":"publisher","DOI":"10.1109\/2.781637"},{"key":"S1351324915000273_ref011","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2013.12.010"},{"key":"S1351324915000273_ref022","doi-asserted-by":"publisher","DOI":"10.1109\/SKG.2007.76"},{"key":"S1351324915000273_ref056","doi-asserted-by":"publisher","DOI":"10.2307\/1412159"},{"key":"S1351324915000273_ref045","first-page":"144","volume-title":"Proceedings of the 20th International Conference on Very Large Data Bases (VLDB \u201994)","author":"Ng","year":"1994"},{"key":"S1351324915000273_ref052","doi-asserted-by":"publisher","DOI":"10.1016\/0377-0427(87)90125-7"},{"key":"S1351324915000273_ref043","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809071"},{"key":"S1351324915000273_ref051","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.77"},{"key":"S1351324915000273_ref025","first-page":"3","volume-title":"Computational Intelligence and Bioengineering","author":"Ingaramo","year":"2009"},{"key":"S1351324915000273_ref038","first-page":"41","volume-title":"Proceedings of the 4th International Conference Practical Applications of Knowledge Discovery and Data Mining (PADD-2000)","author":"Neto","year":"2000"},{"key":"S1351324915000273_ref030","first-page":"19","article-title":"Clustering iterativo de textos cortos con representaciones basadas en 
concepto","volume":"46","author":"Ingaramo","year":"2011","journal-title":"Procesamiento del Lenguaje Natural"},{"key":"S1351324915000273_ref014","first-page":"1771","article-title":"Learning latent tree graphical models","volume":"12","author":"Choi","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324915000273_ref029","first-page":"555","volume-title":"Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2008","author":"Ingaramo","year":"2008"},{"key":"S1351324915000273_ref004","first-page":"787","volume-title":"Proceedings of the 30th Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR-2007)","author":"Banerjee","year":"2007"},{"key":"S1351324915000273_ref067","volume-title":"Exploratory Data Analysis","author":"Tukey","year":"1977"},{"key":"S1351324915000273_ref031","doi-asserted-by":"publisher","DOI":"10.1109\/BIBE.2003.1188978"},{"key":"S1351324915000273_ref027","first-page":"1","article-title":"A new AntTree-based algorithm for clustering short-text corpora","volume":"10","author":"Ingaramo","year":"2010","journal-title":"Journal of Computer Science and Technology"},{"key":"S1351324915000273_ref013","doi-asserted-by":"publisher","DOI":"10.1137\/S0097539702418498"},{"key":"S1351324915000273_ref062","first-page":"2142","volume-title":"Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006)","author":"Steinberger","year":"2006"},{"key":"S1351324915000273_ref026","first-page":"661","volume-title":"Proceedings of 11th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2010","author":"Ingaramo","year":"2010"},{"key":"S1351324915000273_ref006","volume-title":"Survey of Text Mining: Clustering, Classification, and Retrieval","author":"Berry","year":"2003"},{"key":"S1351324915000273_ref061","first-page":"109","volume-title":"Working Notes - 
Workshop on Text Mining","author":"Steinbach","year":"2000"},{"key":"S1351324915000273_ref055","first-page":"129","volume-title":"Proceedings of the 25th Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR 2002)","author":"Slonim","year":"2002"},{"key":"S1351324915000273_ref003","first-page":"2642","volume-title":"Proceedings of the IEEE Conference on Evolutionary Computation (CEC-2003)","author":"Azzag","year":"2003"},{"key":"S1351324915000273_ref068","volume-title":"Information Retrieval","author":"van Rijsbergen","year":"1979"},{"key":"S1351324915000273_ref053","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(91)90097-O"},{"key":"S1351324915000273_ref044","first-page":"849","volume-title":"The Proceedings of the 2001 Neural Information Processing Systems (NIPS) Conference","author":"Ng","year":"2001"},{"key":"S1351324915000273_ref028","first-page":"264","article-title":"Adaptive clustering with artificial ants","volume":"5","author":"Ingaramo","year":"2005","journal-title":"Journal of Computer Science and Technology"},{"key":"S1351324915000273_ref002","doi-asserted-by":"publisher","DOI":"10.1109\/ICCIMA.2007.328"},{"key":"S1351324915000273_ref039","first-page":"278","volume-title":"Contributions to Probability and Statistics","author":"Levene","year":"1960"},{"key":"S1351324915000273_ref042","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30120-2_17"},{"key":"S1351324915000273_ref032","unstructured":"Jing L. 2005. Survey of text clustering. Technical report. Department of Mathematics. 
The University of Hong Kong, Hong Kong, China."},{"key":"S1351324915000273_ref019","first-page":"550","volume-title":"Proceedings of the 23rd International Conference on Industrial Engineering and other Applications of Applied Intelligent Systems, IEA\/AIE 2010","author":"Errecalde","year":"2010"},{"key":"S1351324915000273_ref054","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"S1351324915000273_ref060","first-page":"216","volume-title":"3rd International Conference on Artificial Intelligence and Applications (AIA 2003)","author":"Stein","year":"2003"},{"key":"S1351324915000273_ref033","first-page":"39","article-title":"Hybrid algorithm for noise-free high density clusters with self-detection of best number of clusters","volume":"4","author":"Karthikeyan","year":"2011","journal-title":"International Journal of Hybrid Information Technology"},{"key":"S1351324915000273_ref010","first-page":"93","volume-title":"Proceedings of the International Conference on Bioinspired Optimization Methods and their Applications (BIOMA08)","author":"Cagnina","year":"2008"},{"key":"S1351324915000273_ref023","doi-asserted-by":"publisher","DOI":"10.1145\/1121949.1121983"},{"key":"S1351324915000273_ref018","first-page":"15","volume-title":"19th International Workshop on Database and Expert Systems Application","author":"Errecalde","year":"2008"},{"key":"S1351324915000273_ref057","first-page":"45","volume-title":"2nd International Workshop on Text-Based Information Retrieval (TIR 2005)","author":"Stein","year":"2005"},{"key":"S1351324915000273_ref058","first-page":"91","volume-title":"Proceedings Workshop on Issues in the Theory of Security (WITS 2002)","author":"Stein","year":"2002"},{"key":"S1351324915000273_ref059","first-page":"353","volume-title":"Proceedings of the International Conference on Knowledge Management (I-KNOW 
2004)","author":"Stein","year":"2004"},{"key":"S1351324915000273_ref001","doi-asserted-by":"publisher","DOI":"10.1007\/11428817_25"},{"key":"S1351324915000273_ref050","first-page":"49","volume-title":"Proceedings of the 5th International Conference on Advances in Semantic Processing (SEMAPRO 2011)","author":"Popova","year":"2011"},{"key":"S1351324915000273_ref008","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2010.05.029"},{"key":"S1351324915000273_ref037","first-page":"233","volume-title":"Tools in Artificial Intelligence","author":"Kyriakopoulou","year":"2008"},{"key":"S1351324915000273_ref015","first-page":"318","volume-title":"Proceedings of the 15th Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR 1992)","author":"Cutting","year":"1992"},{"key":"S1351324915000273_ref021","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2009.04.013"},{"key":"S1351324915000273_ref040","first-page":"186","volume-title":"Proceedings of the 27th annual International ACM Conference on Research and Development in Information Retrieval (SIGIR 2004)","author":"Liu","year":"2004"},{"key":"S1351324915000273_ref017","doi-asserted-by":"publisher","DOI":"10.1080\/01969727408546059"},{"key":"S1351324915000273_ref012","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2009.04.001"},{"key":"S1351324915000273_ref020","volume-title":"Statistical Methods for Research Workers","author":"Fisher","year":"1925"},{"key":"S1351324915000273_ref036","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1952.10483441"},{"key":"S1351324915000273_ref024","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1145\/1645953.1646071","volume-title":"Proceedings of the 18th ACM Conference on Information and Knowledge Management","author":"Hu","year":"2009"},{"key":"S1351324915000273_ref041","first-page":"281","volume-title":"Proceedings of 5th Berkeley Symposium on Mathematical Statistics and 
Probability","author":"MacQueen","year":"1967"},{"key":"S1351324915000273_ref007","doi-asserted-by":"publisher","DOI":"10.1109\/ANNES.1995.499469"},{"key":"S1351324915000273_ref009","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2006.06.026"},{"key":"S1351324915000273_ref016","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1979.4766909"},{"key":"S1351324915000273_ref063","first-page":"438","volume-title":"Proceedings of the 7th ACM\/IEEE CS Joint Conference on Digital Libraries","author":"Takeda","year":"2007"},{"key":"S1351324915000273_ref048","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74628-7_22"},{"key":"S1351324915000273_ref064","volume-title":"Introduction to Data Mining","author":"Tan","year":"2005"},{"key":"S1351324915000273_ref005","first-page":"2","volume-title":"Proceedings of 27 Annual Meeting of the Mid-South Educational Research Association","author":"Barnette","year":"1998"},{"key":"S1351324915000273_ref049","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2006.06.001"},{"key":"S1351324915000273_ref046","first-page":"221","article-title":"Using pivots to speed-up k-medoids clustering","volume":"2","author":"Paterlini","year":"2011","journal-title":"Journal of Information and Data Management"},{"key":"S1351324915000273_ref047","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-70939-8_54"}],"container-title":["Natural Language 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324915000273","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,20]],"date-time":"2022-05-20T12:06:04Z","timestamp":1653048364000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324915000273\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,8,14]]},"references-count":75,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2016,9]]}},"alternative-id":["S1351324915000273"],"URL":"https:\/\/doi.org\/10.1017\/s1351324915000273","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,8,14]]}}}