{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T16:06:58Z","timestamp":1771258018204,"version":"3.50.1"},"reference-count":41,"publisher":"MIT Press","issue":"2","license":[{"start":{"date-parts":[[2021,2,17]],"date-time":"2021-02-17T00:00:00Z","timestamp":1613520000000},"content-version":"vor","delay-in-days":413,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The effects of enhancing direct citations, with respect to publication\u2013publication relatedness measurement, by indirect citation relations (bibliographic coupling, cocitation, and extended direct citations) and text relations on clustering solution accuracy are analyzed. For comparison, we include each approach that is involved in the enhancement of direct citations. In total, we investigate the relative performance of seven approaches. To evaluate the approaches we use a methodology proposed by earlier research. However, the evaluation criterion used is based on MeSH, one of the most sophisticated publication-level classification schemes available. We also introduce an approach, based on interpolated accuracy values, by which overall relative clustering solution accuracy can be studied. The results show that the cocitation approach has the worst performance, and that the direct citations approach is outperformed by the other five investigated approaches. The extended direct citations approach has the best performance, followed by an approach in which direct citations are enhanced by the BM25 textual relatedness measure. 
An approach that combines direct citations with bibliographic coupling and cocitation performs slightly better than the bibliographic coupling approach, which in turn has a better performance than the BM25 approach.<\/jats:p>","DOI":"10.1162\/qss_a_00027","type":"journal-article","created":{"date-parts":[[2020,3,25]],"date-time":"2020-03-25T09:27:57Z","timestamp":1585128477000},"page":"714-729","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":18,"title":["Enhancing direct citations: A comparison of relatedness measures for community detection in a large set of PubMed publications"],"prefix":"10.1162","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0229-3073","authenticated-orcid":false,"given":"Per","family":"Ahlgren","sequence":"first","affiliation":[{"name":"Department of Statistics, Uppsala University, Uppsala (Sweden)"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6597-7416","authenticated-orcid":false,"given":"Yunwei","family":"Chen","sequence":"additional","affiliation":[{"name":"Scientometrics & Evaluation Research Center (SERC), Chengdu Library and Information Center of Chinese Academy of Sciences, Chengdu, 610041 (China)"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7653-4004","authenticated-orcid":false,"given":"Cristian","family":"Colliander","sequence":"additional","affiliation":[{"name":"Department of Sociology, Inforsk, Ume\u00e5 University, Ume\u00e5 (Sweden)"},{"name":"University Library, Ume\u00e5 University, Ume\u00e5 (Sweden)"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8448-4521","authenticated-orcid":false,"given":"Nees Jan","family":"van Eck","sequence":"additional","affiliation":[{"name":"Centre for Science and Technology Studies, Leiden University (The Netherlands)"}]}],"member":"281","published-online":{"date-parts":[[2020,6,1]]},"reference":[{"key":"2025073014020818400_bib1","doi-asserted-by":"crossref","unstructured":"Ahlgren,  P., & Colliander,  C. 
(2009). Document-document similarity approaches and science mapping: Experimental comparison of five approaches. Journal of Informetrics, 3(1), 49\u201363. https:\/\/doi.org\/10.1016\/j.joi.2008.11.003","DOI":"10.1016\/j.joi.2008.11.003"},{"key":"2025073014020818400_bib2","unstructured":"Ahlgren,  P., Chen,  Y. W., Colliander,  C., & Van Eck,  N. J. (2019). Community detection using citation relations and textual similarities in a large set of PubMed publications. Accepted for publication in Proceedings of the 17th International Conference on Scientometrics and Informetrics."},{"key":"2025073014020818400_bib4","doi-asserted-by":"crossref","unstructured":"Boyack,  K. W., & Klavans,  R. (2014). Including cited non-source items in a large-scale map of science: What difference does it make? Journal of Informetrics, 8(3), 569\u2013580. https:\/\/doi.org\/10.1016\/j.joi.2014.04.001","DOI":"10.1016\/j.joi.2014.04.001"},{"key":"2025073014020818400_bib5","unstructured":"Boyack,  K. W., & Klavans,  R. (2018). Accurately identifying topics using text: Mapping PubMed. Proceedings of the 23rd International Conference on Science and Technology Indicators\u2014STI 2018, 107\u2013115."},{"key":"2025073014020818400_bib3","doi-asserted-by":"crossref","unstructured":"Boyack,  K. W., Newman,  D., Duhon,  R. J., Klavans,  R., Patek,  M., Biberstine,  J. R., Schijvenaars,  B., Skupin,  A., Ma,  N., & B\u00f6rner,  K. (2011). Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches. PLOS ONE, 6(3), e18029. https:\/\/doi.org\/10.1371\/journal.pone.0018029","DOI":"10.1371\/journal.pone.0018029"},{"key":"2025073014020818400_bib6","doi-asserted-by":"crossref","unstructured":"Chen,  P., & Redner,  S. (2010). Community structure of the physical review citation network. Journal of Informetrics, 4(3), 278\u2013290. 
https:\/\/doi.org\/10.1016\/j.joi.2010.01.001","DOI":"10.1016\/j.joi.2010.01.001"},{"key":"2025073014020818400_bib7","doi-asserted-by":"crossref","unstructured":"Chen,  W., Fengxia,  Y., & Wang,  Y. (2013). Community discovery algorithm of citation semantic link network. 6th International Symposium on Computational Intelligence and Design (Vol. 2), 289\u2013292. https:\/\/doi.org\/10.1109\/ISCID.2013.186","DOI":"10.1109\/ISCID.2013.186"},{"key":"2025073014020818400_bib8","unstructured":"Chen,  Y. W., Xiao,  X., Deng,  Y., & Zhang,  Z. (2017). A weighted method for citation network community detection. Proceedings of the 16th International Conference on Scientometrics and Informetrics\u2014ISSI 2017, 58\u201367."},{"key":"2025073014020818400_bib9","unstructured":"Cohn,  D., & Hofmann,  T. (2001). The missing link\u2014A probabilistic model of document content and hypertext connectivity. In T. K. Leen et al. (Eds.), Advances in neural information processing systems 13 (pp. 430\u2013436). Cambridge, MA: MIT Press."},{"key":"2025073014020818400_bib10","doi-asserted-by":"crossref","unstructured":"Colliander,  C., & Ahlgren,  P. (2019). Comparison of publication-level approaches to ex-post citation normalization. Scientometrics, 120(1), 283\u2013300. https:\/\/doi.org\/10.1007\/s11192-019-03121-z","DOI":"10.1007\/s11192-019-03121-z"},{"key":"2025073014020818400_bib11","doi-asserted-by":"crossref","unstructured":"Fritsch,  F. N., & Butland,  J. (1984). A method for constructing local monotone piecewise cubic interpolants. SIAM Journal on Scientific and Statistical Computing, 5(2), 300\u2013304. https:\/\/doi.org\/10.1137\/0905021","DOI":"10.1137\/0905021"},{"key":"2025073014020818400_bib12","doi-asserted-by":"crossref","unstructured":"Fritsch,  F. N., & Carlson,  R. E. (1980). Monotone piecewise cubic interpolation. SIAM Journal on Numerical Analysis, 17(2), 238\u2013246. 
https:\/\/doi.org\/10.1137\/0717021","DOI":"10.1137\/0717021"},{"key":"2025073014020818400_bib13","doi-asserted-by":"crossref","unstructured":"Fujita,  K., Kajikawa,  Y., Mori,  J., & Sakata,  I. (2014). Detecting research fronts using different types of weighted citation networks. Journal of Engineering and Technology Management, 32, 129\u2013146. https:\/\/doi.org\/10.1016\/j.jengtecman.2013.07.002","DOI":"10.1016\/j.jengtecman.2013.07.002"},{"key":"2025073014020818400_bib14","doi-asserted-by":"crossref","unstructured":"Girvan,  M., & Newman,  M. E. J. (2002). Community structure in social and biological networks. PNAS, 99(12), 7821\u20137826. https:\/\/doi.org\/10.1073\/pnas.122653799","DOI":"10.1073\/pnas.122653799"},{"key":"2025073014020818400_bib15","doi-asserted-by":"crossref","unstructured":"Gl\u00e4nzel,  W., & Thijs,  B. (2017). Using hybrid methods and \u201ccore documents\u201d for the representation of clusters and topics: the astronomy dataset. Scientometrics, 111(2), 1071\u20131087. https:\/\/doi.org\/10.1007\/s11192-017-2301-6","DOI":"10.1007\/s11192-017-2301-6"},{"key":"2025073014020818400_bib16","doi-asserted-by":"crossref","unstructured":"Hamedani,  M. R., Kim,  S. W., & Kin,  D. J. (2016). SimCC: A novel method to consider both content and citations for computing similarity of scientific papers. Information Sciences, 334\u2013335, 273\u2013292. https:\/\/doi.org\/10.1016\/j.ins.2015.12.001","DOI":"10.1016\/j.ins.2015.12.001"},{"key":"2025073014020818400_bib17","doi-asserted-by":"crossref","unstructured":"Haunschild,  R., Schier,  H., Marx,  W., & Bornmann,  L. (2018). Algorithmically generated subject categories based on citation relations: An empirical micro study using papers on overall water splitting. Journal of Informetrics, 12(2), 436\u2013447. 
https:\/\/doi.org\/10.1016\/j.joi.2018.03.004","DOI":"10.1016\/j.joi.2018.03.004"},{"key":"2025073014020818400_bib18","doi-asserted-by":"crossref","unstructured":"Kajikawa,  Y., Yoshikawa,  J., Takeda,  Y., & Matsushima,  K. (2008). Tracking emerging technologies in energy research: Toward a roadmap for sustainable energy. Technological Forecasting and Social Change, 75(6), 771\u2013782. https:\/\/doi.org\/10.1016\/j.techfore.2007.05.005","DOI":"10.1016\/j.techfore.2007.05.005"},{"key":"2025073014020818400_bib19","doi-asserted-by":"crossref","unstructured":"Klavans,  R., & Boyack,  K. W. (2017). Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? Journal of the Association for Information Science and Technology, 68(4), 984\u2013998. https:\/\/doi.org\/10.1002\/asi.23734","DOI":"10.1002\/asi.23734"},{"key":"2025073014020818400_bib20","doi-asserted-by":"crossref","unstructured":"Kusumastuti,  S., Derks,  M. G., Tellier,  S., Di Nucci,  E., Lund,  R., Mortensen,  E. L., & Westendorp,  R. G. (2016). Successful ageing: A study of the literature using citation network analysis. Maturitas, 93, 4\u201312. https:\/\/doi.org\/10.1016\/j.maturitas.2016.04.010","DOI":"10.1016\/j.maturitas.2016.04.010"},{"key":"2025073014020818400_bib21","doi-asserted-by":"crossref","unstructured":"Meyer-Br\u00f6tz,  F., Schiebel,  E., & Brecht,  L. (2017). Experimental evaluation of parameter settings in calculation of hybrid similarities: effects of first- and second-order similarity, edge cutting, and weighting factors. Scientometrics, 111(3), 1307\u20131325. https:\/\/doi.org\/10.1007\/s11192-017-2366-2","DOI":"10.1007\/s11192-017-2366-2"},{"key":"2025073014020818400_bib22","doi-asserted-by":"crossref","unstructured":"Persson,  O. (2010). Identifying research themes with weighted direct citation links. Journal of Informetrics, 4(3), 415\u2013422. 
https:\/\/doi.org\/10.1016\/j.joi.2010.03.006","DOI":"10.1016\/j.joi.2010.03.006"},{"key":"2025073014020818400_bib23","doi-asserted-by":"crossref","unstructured":"Ruiz-Castillo,  J., & Waltman,  L. (2015). Field-normalized citation impact indicators using algorithmically constructed classification systems of science. Journal of Informetrics, 9(1), 102\u2013117. https:\/\/doi.org\/10.1016\/j.joi.2014.11.010","DOI":"10.1016\/j.joi.2014.11.010"},{"key":"2025073014020818400_bib24","unstructured":"Salton,  G., & McGill,  M. J. (1983). Introduction to modern information retrieval. New York: McGraw-Hill."},{"key":"2025073014020818400_bib25","doi-asserted-by":"crossref","unstructured":"Sj\u00f6g\u00e5rde,  P., & Ahlgren,  P. (2018). Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics. Journal of Informetrics, 12(1), 133\u2013152. https:\/\/doi.org\/10.1016\/j.joi.2017.12.006","DOI":"10.1016\/j.joi.2017.12.006"},{"key":"2025073014020818400_bib26","doi-asserted-by":"crossref","unstructured":"Sj\u00f6g\u00e5rde,  P., & Ahlgren,  P. (2020). Granularity of algorithmically constructed publication-level classifications of research publications: Identification of specialties. Quantitative Science Studies, 1(1), 207\u2013238. https:\/\/doi.org\/10.1162\/qss_a_00004","DOI":"10.1162\/qss_a_00004"},{"key":"2025073014020818400_bib27","doi-asserted-by":"crossref","unstructured":"Small,  H. (1997). Update on science mapping: Creating large document spaces. Scientometrics, 38(2), 275\u2013293. https:\/\/doi.org\/10.1007\/BF02457414","DOI":"10.1007\/BF02457414"},{"key":"2025073014020818400_bib28","doi-asserted-by":"crossref","unstructured":"Sparck Jones,  K., Walker,  S., & Robertson,  S. E. (2000a). A probabilistic model of information retrieval: Development and comparative experiments: Part 1. Information Processing and Management, 36(6), 779\u2013808. 
https:\/\/doi.org\/10.1016\/S0306-4573(00)00015-7","DOI":"10.1016\/S0306-4573(00)00015-7"},{"key":"2025073014020818400_bib29","doi-asserted-by":"crossref","unstructured":"Sparck Jones,  K., Walker,  S., & Robertson,  S. E. (2000b). A probabilistic model of information retrieval: Development and comparative experiments: Part 2. Information Processing and Management, 36(6), 809\u2013840. https:\/\/doi.org\/10.1016\/S0306-4573(00)00016-9","DOI":"10.1016\/S0306-4573(00)00016-9"},{"key":"2025073014020818400_bib30","doi-asserted-by":"crossref","unstructured":"Subelj,  L., Van Eck,  N. J., & Waltman,  L. (2016). Clustering scientific publications based on citation relations: A systematic comparison of different methods. PLOS ONE, 11(4), e0154404. https:\/\/doi.org\/10.1371\/journal.pone.0154404","DOI":"10.1371\/journal.pone.0154404"},{"key":"2025073014020818400_bib31","doi-asserted-by":"crossref","unstructured":"Traag,  V. A., Van Dooren,  P., & Nesterov,  Y. (2011). Narrow scope for resolution-limit-free community detection. Physical Review E, 84(1), 016114. https:\/\/doi.org\/10.1103\/PhysRevE.84.016114","DOI":"10.1103\/PhysRevE.84.016114"},{"key":"2025073014020818400_bib33","unstructured":"Traag,  V. A., Waltman,  L., & Van Eck,  N. J. (2018). CWTSLeiden\/networkanalysis [Source code]. Zenodo. https:\/\/doi.org\/10.5281\/zenodo.1466831"},{"key":"2025073014020818400_bib32","doi-asserted-by":"crossref","unstructured":"Traag,  V. A., Waltman,  L., & Van Eck,  N. J. (2019). From Louvain to Leiden: Guaranteeing well-connected communities. Scientific Reports, 9, 5233. https:\/\/doi.org\/10.1038\/s41598-019-41695-z","DOI":"10.1038\/s41598-019-41695-z"},{"key":"2025073014020818400_bib34","unstructured":"U.S. National Library of Medicine. (2019a). Introduction to MeSH. Retrieved from https:\/\/www.nlm.nih.gov\/mesh\/introduction.html."},{"key":"2025073014020818400_bib35","unstructured":"U.S. National Library of Medicine. (2019b). The Indexing Process. 
Retrieved from https:\/\/www.nlm.nih.gov\/bsd\/indexing\/training\/TIP_010.html."},{"key":"2025073014020818400_bib36","doi-asserted-by":"crossref","unstructured":"Waltman,  L., & Van Eck,  N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378\u20132392. https:\/\/doi.org\/10.1002\/asi.22748","DOI":"10.1002\/asi.22748"},{"key":"2025073014020818400_bib37","unstructured":"Waltman,  L., Boyack,  K. W., Colavizza,  G., & Van Eck,  N. J. (2017). A principled methodology for comparing relatedness measures for clustering publications. In Proceedings of the 16th International Conference on Scientometrics and Informetrics\u2014ISSI 2017, 691\u2013702."},{"key":"2025073014020818400_bib38","doi-asserted-by":"crossref","unstructured":"Waltman,  L., Boyack,  K. W., Colavizza,  G., & Van Eck,  N. J. (2019). A principled methodology for comparing relatedness measures for clustering publications. arXiv:1901.06815.","DOI":"10.1162\/qss_a_00035"},{"key":"2025073014020818400_bib39","doi-asserted-by":"crossref","unstructured":"Yu,  D. J., Wang,  W. R., Zhang,  S., Zhang,  W. Y., & Liu,  R. Y. (2017). Hybrid self-optimized clustering model based on citation links and textual features to detect research topics. PLOS ONE, 12(10), e0187164. https:\/\/doi.org\/10.1371\/journal.pone.0187164","DOI":"10.1371\/journal.pone.0187164"},{"key":"2025073014020818400_bib40","doi-asserted-by":"crossref","unstructured":"Yudhoatmojo,  S. B., & Samuar,  M. A. (2017). Community detection on citation network of DBLP data sample set using LinkRank Algorithm. Procedia Computer Science, 124, 29\u201337. https:\/\/doi.org\/10.1016\/j.procs.2017.12.126","DOI":"10.1016\/j.procs.2017.12.126"},{"key":"2025073014020818400_bib41","doi-asserted-by":"crossref","unstructured":"Zhu,  S., Zeng,  J., & Mamitsuka,  H. (2009). 
Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity. Bioinformatics, 25(15), 1944\u20131951. https:\/\/doi.org\/10.1093\/bioinformatics\/btp338","DOI":"10.1093\/bioinformatics\/btp338"}],"container-title":["Quantitative Science Studies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/1\/2\/714\/1885755\/qss_a_00027.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/1\/2\/714\/1885755\/qss_a_00027.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T18:02:20Z","timestamp":1753898540000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/qss\/article\/1\/2\/714\/96134\/Enhancing-direct-citations-A-comparison-of"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020]]},"references-count":41,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1162\/qss_a_00027","relation":{},"ISSN":["2641-3337"],"issn-type":[{"value":"2641-3337","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020]]},"published":{"date-parts":[[2020]]}}}