{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T07:11:16Z","timestamp":1775459476031,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2019,3,1]],"date-time":"2019-03-01T00:00:00Z","timestamp":1551398400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 51875503, No. 51475410"],"award-info":[{"award-number":["No. 51875503, No. 51475410"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Zhejiang Natural Science Foundation of China","award":["No. LY17E050010"],"award-info":[{"award-number":["No. LY17E050010"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Research front detection and topic evolution has for a long time been an important direction for research in the informetrics field. However, most previous studies either simply use a citation count for scientific document clustering or assume that each scientific document has the same importance in detecting the clustering theme in a cluster. In this study, utilizing the topological structure and the PageRank algorithm, we propose a new research front detection and topic evolution approach based on graph theory. 
This approach consists of three stages: (1) setting a time window of appropriate length according to the accuracy of the document clustering results and the delay before a scientific document is cited, dividing scientific documents into time windows by year of publication, calculating similarities between them from their topological structure, and clustering them within each time window using the fast greedy algorithm; (2) combining the PageRank algorithm with keyword frequency to detect the clustering theme, on the assumption that the more important a scientific document is within a cluster, the more likely it is to be cited by other documents in the same cluster; and (3) reconstructing the cluster graph, in which nodes represent clusters and edge strengths represent the similarities between clusters, then detecting research fronts and identifying topic evolution from the reconstructed graph. To evaluate the proposed approach, scientific documents related to data mining and indexed in the Science Citation Index Expanded (SCI-EXPANDED) or Social Science Citation Index (SSCI) in Web of Science were collected as a case study. 
The experiment\u2019s results show that the proposed approach can obtain reasonable clustering results, and it is effective for research front detection and topic evolution.<\/jats:p>","DOI":"10.3390\/sym11030310","type":"journal-article","created":{"date-parts":[[2019,3,4]],"date-time":"2019-03-04T05:45:36Z","timestamp":1551678336000},"page":"310","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Research Front Detection and Topic Evolution Based on Topological Structure and the PageRank Algorithm"],"prefix":"10.3390","volume":"11","author":[{"given":"Yangbing","family":"Xu","sequence":"first","affiliation":[{"name":"School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6405-584X","authenticated-orcid":false,"given":"Shuai","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China"}]},{"given":"Wenyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China"}]},{"given":"Shuiqing","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9857-443X","authenticated-orcid":false,"given":"Yue","family":"Shen","sequence":"additional","affiliation":[{"name":"School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1002\/asi.20317","article-title":"Citespace II: Detecting and visualizing emerging trends and transient patterns in scientific 
literature","volume":"57","author":"Chen","year":"2006","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"e7349","DOI":"10.1097\/MD.0000000000007349","article-title":"Evaluation of research topic evolution in psychiatry using co-word analysis","volume":"96","author":"Wu","year":"2017","journal-title":"Medicine"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1016\/j.joi.2013.01.003","article-title":"Collective dynamics in knowledge networks: Emerging trends analysis","volume":"7","author":"Liu","year":"2013","journal-title":"J. Informetrics"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.jengtecman.2013.07.002","article-title":"Detecting research fronts using different types of weighted citation networks","volume":"32","author":"Fujita","year":"2014","journal-title":"J. Eng. Technol. Manag."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1175","DOI":"10.1016\/j.joi.2017.10.003","article-title":"Understanding the topic evolution in a scientific domain: An exploratory study for the field of information retrieval","volume":"11","author":"Chen","year":"2017","journal-title":"J. Informetr."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2389","DOI":"10.1002\/asi.21419","article-title":"Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?","volume":"61","author":"Boyack","year":"2010","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1007\/s11192-011-0591-7","article-title":"Using \u2018core documents\u2019 for detecting and labelling new emerging topics","volume":"91","author":"Thijs","year":"2012","journal-title":"Scientometrics"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yu, D.J., Wang, W.R., Zhang, S., Zhang, W.Y., and Liu, R.Y. (2017). 
Hybrid self-optimized clustering model based on citation links and textual features to detect research topics. PLoS ONE, 12.","DOI":"10.1371\/journal.pone.0187164"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhang, W., Wang, X.G., Zhao, D.L., and Tang, X.O. (2012, January 7\u201313). Graph degree linkage: Agglomerative clustering on a directed graph. Proceedings of the 12th European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33718-5_31"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1002\/asi.4630310408","article-title":"The combined use of bibliographic coupling and cocitation for document retrieval","volume":"31","author":"Bichteler","year":"1980","journal-title":"J. Am. Soc. Inf. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Shubankar, K., Singh, A.P., and Pudi, V. (2011, January 28\u201329). A frequent keyword-set based algorithm for topic modeling and clustering of research papers. Proceedings of the 3rd Conference on Data Mining and Optimization, Putrajaya, Malaysia.","DOI":"10.1109\/DMO.2011.5976511"},{"key":"ref_12","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Kim, J., and Lee, E. (2018). Understanding review expertise of developers: A reviewer recommendation approach based on latent dirichlet allocation. Symmetry Basel, 10.","DOI":"10.3390\/sym10040114"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1016\/j.asoc.2017.09.028","article-title":"Crowdsourcing based scientific issue tracking with topic analysis","volume":"66","author":"Kim","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_15","unstructured":"Qiao, S., and Han, A. (2013, January 20\u201322). A way to construct evolution model of scientific papers based on the seed document and OLDA models. 
Proceedings of the 2013 International Conference on Mechatronic Science, Electric Engineering and Computer, Shenyang, China."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1002\/asi.10227","article-title":"Time line visualization of research fronts","volume":"54","author":"Morris","year":"2003","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"066111","DOI":"10.1103\/PhysRevE.70.066111","article-title":"Finding community structure in very large networks","volume":"70","author":"Clauset","year":"2004","journal-title":"Phys. Rev. E"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/S0169-7552(98)00110-X","article-title":"The anatomy of a large-scale hypertextual web search engine","volume":"30","author":"Brin","year":"1998","journal-title":"Comput. Netw. ISDN Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"7821","DOI":"10.1073\/pnas.122653799","article-title":"Community structure in social and biological networks","volume":"99","author":"Girvan","year":"2002","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"066133","DOI":"10.1103\/PhysRevE.69.066133","article-title":"Fast algorithm for detecting community structure in networks","volume":"69","author":"Newman","year":"2004","journal-title":"Phys. Rev. E"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"dos Santos, C.K., Evsukoff, A.G., and de Lima, B.S.L.P. (2008, January 26\u201328). Cluster analysis in document networks. Proceedings of the Conference on Data Mining Protection, Univ Cadiz, Cadiz, Spain.","DOI":"10.2495\/DATA080101"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.joi.2006.06.001","article-title":"Finding scientific gems with google\u2019s PageRank algorithm","volume":"1","author":"Chen","year":"2007","journal-title":"J. 
Informetr."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1016\/j.joi.2015.07.002","article-title":"Author ranking based on personalized PageRank","volume":"9","author":"Nykl","year":"2015","journal-title":"J. Informetr."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1007\/s11192-017-2262-9","article-title":"A multiple-link, mutually reinforced journal-ranking model to measure the prestige of journals","volume":"111","author":"Yu","year":"2017","journal-title":"Scientometrics"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1023\/A:1020458612014","article-title":"Co-citation, bibliographic coupling and a characterization of lattice citation networks","volume":"55","author":"Egghe","year":"2002","journal-title":"Scientometrics"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Boyack, K.W., Newman, D., Duhon, R.J., Klavans, R., Patek, M., and Biberstine, J.R. (2011). Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches. PLoS ONE, 6.","DOI":"10.1371\/journal.pone.0018029"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/s11192-014-1327-2","article-title":"Research trends in gender differences in higher education and science: A co-word analysis","volume":"101","author":"Dehdarirad","year":"2014","journal-title":"Scientometrics"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: A graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. 
Math."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1007\/s11192-007-2002-7","article-title":"A hybrid mapping of information science","volume":"75","author":"Janssens","year":"2008","journal-title":"Scientometrics"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Bafna, P., Pramod, D., and Vaidya, A. (2016, January 3\u20135). Document clustering: TF-IDF approach. Proceedings of the International Conference on Electrical, Electronics, and Optimization Techniques, Palnchur, India.","DOI":"10.1109\/ICEEOT.2016.7754750"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/3\/310\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:35:35Z","timestamp":1760186135000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/3\/310"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,1]]},"references-count":30,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2019,3]]}},"alternative-id":["sym11030310"],"URL":"https:\/\/doi.org\/10.3390\/sym11030310","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,1]]}}}