{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T23:40:51Z","timestamp":1764978051966,"version":"3.46.0"},"reference-count":22,"publisher":"Walter de Gruyter GmbH","issue":"1","license":[{"start":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T00:00:00Z","timestamp":1543881600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,12,18]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    This paper explores a multi-strategy technique that aims at enriching text documents for improving clustering quality. We use a combination of entity linking and document summarization in order to determine the identity of the most\n                    <jats:italic>salient entities<\/jats:italic>\n                    mentioned in texts. To effectively enrich documents without introducing noise, we limit ourselves to the text fragments mentioning the salient entities, in turn, belonging to a\n                    <jats:italic>knowledge base<\/jats:italic>\n                    like Wikipedia, while the actual enrichment of text fragments is carried out using WordNet. To feed clustering algorithms, we investigate different document representations obtained using several combinations of document enrichment and feature extraction. This allows us to exploit ensemble clustering, by combining multiple clustering results obtained using different document representations. 
Our experiments indicate that our novel enriching strategies, combined with ensemble clustering, can improve the quality of classical text clustering when applied to text corpora like The British Broadcasting Corporation (BBC) NEWS.\n                  <\/jats:p>","DOI":"10.1515\/jisys-2018-0098","type":"journal-article","created":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T03:49:30Z","timestamp":1543895370000},"page":"1109-1121","source":"Crossref","is-referenced-by-count":1,"title":["Enriching Documents by Linking Salient Entities and Lexical-Semantic Expansion"],"prefix":"10.1515","volume":"29","author":[{"given":"Mohsen","family":"Pourvali","sequence":"first","affiliation":[{"name":"Universit\u00e0 Ca\u2019 Foscari Venezia , Venice 30172, Italy"}]},{"given":"Salvatore","family":"Orlando","sequence":"additional","affiliation":[{"name":"Universit\u00e0 Ca\u2019 Foscari Venezia , Venice , Italy"}]}],"member":"374","published-online":{"date-parts":[[2018,12,4]]},"reference":[{"key":"2025120523362732724_j_jisys-2018-0098_ref_001","doi-asserted-by":"crossref","unstructured":"S. Banerjee and T. Pedersen, An adapted Lesk algorithm for word sense disambiguation using WordNet, in: Int. Conf. on Intel. Text Processing and Computational Linguistics, pp. 136\u2013145, Springer, Berlin, Heidelberg, 2002.","DOI":"10.1007\/3-540-45715-1_11"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_002","doi-asserted-by":"crossref","unstructured":"D. M. Blei, Probabilistic topic models, Comm. ACM 55 (2012), 77\u201384.","DOI":"10.1145\/2133806.2133826"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_003","doi-asserted-by":"crossref","unstructured":"T. H. Cao, T. M. Tang and C. K. Chau, Text clustering with named entities: a model, experimentation and realization, in: Data Mining: Foundations and Intelligent Paradigms, pp. 
267\u2013287, Springer, Berlin, Heidelberg, 2012.","DOI":"10.1007\/978-3-642-23166-7_10"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_004","doi-asserted-by":"crossref","unstructured":"D. Ceccarelli, C. Lucchese, S. Orlando, R. Perego and S. Trani, Learning relatedness measures for entity linking, in: Proc. of CIKM \u201913, pp. 139\u2013148, ACM, San Francisco, California, USA, 2013.","DOI":"10.1145\/2505515.2505711"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_005","unstructured":"P. Edmonds, SENSEVAL: the evaluation of word sense disambiguation systems, ELRA Newsletter 7 (2002), 5\u201314."},{"key":"2025120523362732724_j_jisys-2018-0098_ref_006","unstructured":"M. van Erp, P. N. Mendes, H. Paulheim, F. Ilievski, J. Plu, G. Rizzo and J. Waitelonis, Evaluating entity linking: an analysis of current benchmark datasets and a roadmap for doing a better job, in: Proc. of LREC\u201916, 2016."},{"key":"2025120523362732724_j_jisys-2018-0098_ref_007","doi-asserted-by":"crossref","unstructured":"C. Fellbaum, WordNet, Wiley Online Library, 1998.","DOI":"10.7551\/mitpress\/7287.001.0001"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_008","doi-asserted-by":"crossref","unstructured":"P. Ferragina and U. Scaiella, Tagme: on-the-fly annotation of short text fragments (by wikipedia entities), in: Proc. of CIKM\u201910, pp. 1625\u20131628, ACM, Toronto, ON, Canada, 2010.","DOI":"10.1145\/1871437.1871689"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_009","doi-asserted-by":"crossref","unstructured":"D. Greene and P. Cunningham, Practical solutions to the problem of diagonal dominance in kernel document clustering, in: Proc. of ICML\u201906, pp. 377\u2013384, ACM, Pittsburgh, Pennsylvania, USA, 2006.","DOI":"10.1145\/1143844.1143892"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_010","doi-asserted-by":"crossref","unstructured":"B. Hachey, W. Radford, J. Nothman, M. Honnibal and J. R Curran, Evaluating entity linking with Wikipedia, Artif. Intell. 
194 (2013), 130\u2013150.","DOI":"10.1016\/j.artint.2012.04.005"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_011","doi-asserted-by":"crossref","unstructured":"R. Kadlec, M. Schmid, O. Bajgar and J. Kleindienst, Text understanding with the attention sum reader network, in: Proc. of ACL\u201916, pp. 908\u2013918, Berlin, Germany, 2016.","DOI":"10.18653\/v1\/P16-1086"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_012","doi-asserted-by":"crossref","unstructured":"G. Karypis and V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput. 20 (1998), 359\u2013392.","DOI":"10.1137\/S1064827595287997"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_013","doi-asserted-by":"crossref","unstructured":"R. Mihalcea and A. Csomai, Wikify!: linking documents to encyclopedic knowledge, in: Proc. of CIKM \u201907, pp. 233\u2013242, ACM, Lisbon, Portugal, 2007.","DOI":"10.1145\/1321440.1321475"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_014","doi-asserted-by":"crossref","unstructured":"D. Milne and I. H. Witten, Learning to link with Wikipedia, in: Proc. of CIKM \u201908, pp. 509\u2013518, ACM, Napa Valley, California, USA, 2008.","DOI":"10.1145\/1458082.1458150"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_015","doi-asserted-by":"crossref","unstructured":"S. Montalvo, R. Mart\u00ednez, V. Fresno and A. Delgado, Exploiting named entities for bilingual news clustering, J. Assoc. Inf. Sci. Technol. 66 (2015), 363\u2013376.","DOI":"10.1002\/asi.23175"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_016","doi-asserted-by":"crossref","unstructured":"R. Navigli, Word sense disambiguation: a survey, ACM Comput. Surveys 41 (2009), 10.","DOI":"10.1145\/1459352.1459355"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_017","doi-asserted-by":"crossref","unstructured":"M. Pourvali, S. Orlando and M. Gharagozloo, Improving clustering quality by automatic text summarization, Inf. Retrieval Technology, pp. 
292\u2013303, Springer, Cham, 2015.","DOI":"10.1007\/978-3-319-28940-3_23"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_018","doi-asserted-by":"crossref","unstructured":"D. Reforgiato Recupero, A new unsupervised method for document clustering by using WordNet lexical and conceptual relations, Inf. Retrieval 10 (2007), 563\u2013579.","DOI":"10.1007\/s10791-007-9035-7"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_019","doi-asserted-by":"crossref","unstructured":"J. A. Silva, E. R. Faria, R. C. Barros, E. R. Hruschka, A. C. P. L. F. de Carvalho, and J. Gama, Data stream clustering: a survey, ACM Comput. Surveys 46 (2013), 13.","DOI":"10.1145\/2522968.2522981"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_020","unstructured":"A. Strehl and J. Ghosh, Cluster ensembles \u2013 a knowledge reuse framework for combining multiple partitions, J. Mach. Learn. Res. 3 (2003), 583\u2013617."},{"key":"2025120523362732724_j_jisys-2018-0098_ref_021","doi-asserted-by":"crossref","unstructured":"S. Vega-Pons and J. Ruiz-Shulcloper, A survey of clustering ensemble algorithms, J. Pattern Recognit. Artif. Intell. 25 (2011), 337\u2013372.","DOI":"10.1142\/S0218001411008683"},{"key":"2025120523362732724_j_jisys-2018-0098_ref_022","doi-asserted-by":"crossref","unstructured":"X. Zhang, L. Jing, X. Hu, M. Ng and X. Zhou, A comparative study of ontology based term similarity measures on PubMed document clustering, in: Advances in Databases: Concepts, Systems and Applications, pp. 
115\u2013126, Springer, Berlin, Heidelberg, 2007.","DOI":"10.1007\/978-3-540-71703-4_12"}],"container-title":["Journal of Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyter.com\/view\/journals\/jisys\/29\/1\/article-p1109.xml","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2018-0098\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2018-0098\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T23:37:09Z","timestamp":1764977829000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2018-0098\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,4]]},"references-count":22,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2018,4,25]]},"published-print":{"date-parts":[[2019,12,18]]}},"alternative-id":["10.1515\/jisys-2018-0098"],"URL":"https:\/\/doi.org\/10.1515\/jisys-2018-0098","relation":{},"ISSN":["2191-026X","0334-1860"],"issn-type":[{"type":"electronic","value":"2191-026X"},{"type":"print","value":"0334-1860"}],"subject":[],"published":{"date-parts":[[2018,12,4]]}}}