{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T17:32:40Z","timestamp":1754155960308,"version":"3.41.2"},"reference-count":47,"publisher":"Emerald","issue":"4","license":[{"start":{"date-parts":[[2023,2,6]],"date-time":"2023-02-06T00:00:00Z","timestamp":1675641600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["LHT"],"published-print":{"date-parts":[[2024,7,23]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>Predicting highly cited papers can enable an evaluation of the potential of papers and the early detection and determination of academic achievement value. However, most highly cited paper prediction studies consider early citation information, so predicting highly cited papers by publication is challenging. Therefore, the authors propose a method for predicting early highly cited papers based on their own features.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>This research analyzed academic papers published in the <jats:italic>Journal of the Association for Computing Machinery<\/jats:italic> (<jats:italic>ACM<\/jats:italic>) from 2000 to 2013. Five types of features were extracted: paper features, journal features, author features, reference features and semantic features. Subsequently, the authors applied a deep neural network (DNN), support vector machine (SVM), decision tree (DT) and logistic regression (LGR), and they predicted highly cited papers 1\u20133\u00a0years after publication.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>Experimental results showed that early highly cited academic papers are predictable when they are first published. The authors\u2019 prediction models showed considerable performance. This study further confirmed that the features of references and authors play an important role in predicting early highly cited papers. In addition, the proportion of high-quality journal references has a more significant impact on prediction.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>Based on the available information at the time of publication, this study proposed an effective early highly cited paper prediction model. This study facilitates the early discovery and realization of the value of scientific and technological achievements.<\/jats:p><\/jats:sec>","DOI":"10.1108\/lht-06-2022-0305","type":"journal-article","created":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T20:33:32Z","timestamp":1675370012000},"page":"1366-1384","source":"Crossref","is-referenced-by-count":3,"title":["Predictable by publication: discovery of early highly cited\u00a0academic papers based on\u00a0their own features"],"prefix":"10.1108","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5885-4509","authenticated-orcid":false,"given":"Xiaobo","family":"Tang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1133-2812","authenticated-orcid":false,"given":"Heshen","family":"Zhou","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1879-4895","authenticated-orcid":false,"given":"Shixuan","family":"Li","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2023,2,6]]},"reference":[{"issue":"1","key":"key2024072005471291200_ref001","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1016\/j.joi.2018.11.003","article-title":"Predicting publication long-term impact through a combination of early citations and journal impact factor","volume":"13","year":"2019","journal-title":"Journal of Informetrics"},{"issue":"2","key":"key2024072005471291200_ref002","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1016\/j.joi.2019.02.011","article-title":"Predicting citation counts based on deep neural network learning techniques","volume":"13","year":"2019","journal-title":"Journal of Informetrics"},{"issue":"2","key":"key2024072005471291200_ref003","doi-asserted-by":"publisher","DOI":"10.1016\/j.joi.2020.101128","article-title":"Early indicators of scientific impact: predicting citations with altmetrics","volume":"15","year":"2021","journal-title":"Journal of Informetrics"},{"issue":"3","key":"key2024072005471291200_ref004","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1108\/LHT-05-2021-0154","article-title":"Investigating the citation advantage of author-pays charges model in computer science research: a case study of Elsevier and Springer","volume":"40","year":"2022","journal-title":"Library Hi Tech"},{"issue":"ahead-of-print","key":"key2024072005471291200_ref005","doi-asserted-by":"publisher","DOI":"10.1108\/LHT-07-2021-0233","article-title":"Mapping the quantity, quality and structural indicators of Asian (48 countries and 3 territories) research productivity on cloud computing","volume":"ahead-of-print","year":"2022","journal-title":"Library Hi Tech"},{"issue":"3","key":"key2024072005471291200_ref006","doi-asserted-by":"publisher","DOI":"10.3390\/info8030073","article-title":"An overview on evaluating and predicting scholarly article impact","volume":"8","year":"2017","journal-title":"Information"},{"issue":"No. ahead-of-print","key":"key2024072005471291200_ref007","doi-asserted-by":"publisher","DOI":"10.1108\/LHT-09-2021-0305","article-title":"Does the venue of scientific conferences leverage their impact? A large scale study on Computer Science conferences","volume":"Vol. ahead-of-print","year":"2022","journal-title":"Library Hi Tech"},{"key":"key2024072005471291200_ref008","doi-asserted-by":"publisher","first-page":"3615","DOI":"10.48550\/arXiv.1903.10676","article-title":"Scibert: a pretrained language model for scientific text","year":"2019"},{"issue":"1","key":"key2024072005471291200_ref009","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1016\/j.joi.2013.11.005","article-title":"How to improve the prediction based on citation impact percentiles for years shortly after the publication date?","volume":"8","year":"2014","journal-title":"Journal of Informetrics"},{"issue":"2","key":"key2024072005471291200_ref010","doi-asserted-by":"publisher","first-page":"917","DOI":"10.1007\/s11192-016-1979-1","article-title":"Research assessment using early citation information","volume":"108","year":"2016","journal-title":"Scientometrics"},{"issue":"4","key":"key2024072005471291200_ref011","doi-asserted-by":"crossref","first-page":"1084","DOI":"10.1108\/LHT-01-2021-0038","article-title":"The effect of interdisciplinary components' citation intensity on scientific impact","volume":"39","year":"2021","journal-title":"Library Hi Tech"},{"issue":"1","key":"key2024072005471291200_ref012","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1007\/s11192-020-03479-5","article-title":"Predicting the future success of scientific publications through social network and semantic analysis","volume":"124","year":"2020","journal-title":"Scientometrics"},{"key":"key2024072005471291200_ref013","first-page":"4171","article-title":"Bert: pre-training of deep bidirectional transformers for language understanding","year":"2019","journal-title":"Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies"},{"issue":"4","key":"key2024072005471291200_ref014","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1016\/j.joi.2013.08.006","article-title":"Which factors help authors produce the highest impact research? Collaboration, journal and document properties","volume":"7","year":"2013","journal-title":"Journal of Informetrics"},{"issue":"21","key":"key2024072005471291200_ref015","doi-asserted-by":"publisher","first-page":"14871","DOI":"10.1007\/s11042-019-07856-y","article-title":"KA-Ensemble: towards imbalanced image classification ensembling under-sampling and over-sampling","volume":"79","year":"2020","journal-title":"Multimedia\u00a0Tools and Applications"},{"issue":"4","key":"key2024072005471291200_ref016","doi-asserted-by":"publisher","first-page":"915","DOI":"10.1108\/LHT-06-2020-0131","article-title":"A bibliometric analysis and science mapping of scientific publications of Alzahra University during 1986-2019","volume":"39","year":"2021","journal-title":"Library Hi Tech"},{"issue":"6","key":"key2024072005471291200_ref017","doi-asserted-by":"publisher","first-page":"759","DOI":"10.1592\/phco.26.6.759","article-title":"Scientific collaboration results in higher citation rates of published articles","volume":"26","year":"2006","journal-title":"Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy"},{"issue":"1","key":"key2024072005471291200_ref018","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1007\/s11192-010-0237-1","article-title":"Using content-based and bibliometric features for machine learning models to predict citation counts in the biomedical literature","volume":"85","year":"2010","journal-title":"Scientometrics"},{"issue":"2","key":"key2024072005471291200_ref019","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1002\/leap.1348","article-title":"Journal self\u2010citation trends in 1975-2017 and the effect on journal impact and article citations","volume":"34","year":"2021","journal-title":"Learned Publishing"},{"issue":"9","key":"key2024072005471291200_ref020","doi-asserted-by":"publisher","first-page":"7583","DOI":"10.1007\/s11192-021-04083-x","article-title":"Article length and citation outcomes","volume":"126","year":"2021","journal-title":"Scientometrics"},{"key":"key2024072005471291200_ref021","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105383","article-title":"Predicting literature's early impact with sentiment analysis in Twitter","volume":"192","year":"2020","journal-title":"Knowledge-Based Systems"},{"issue":"1","key":"key2024072005471291200_ref022","doi-asserted-by":"publisher","DOI":"10.1016\/j.joi.2019.101004","article-title":"Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity","volume":"14","year":"2020","journal-title":"Journal of Informetrics"},{"issue":"24","key":"key2024072005471291200_ref023","doi-asserted-by":"publisher","first-page":"3303","DOI":"10.1093\/bioinformatics\/btp585","article-title":"Predicting citation count of Bioinformatics papers within four years of publication","volume":"25","year":"2009","journal-title":"Bioinformatics"},{"first-page":"990","article-title":"ArnetMiner: extraction and mining of academic social networks","year":"2008","key":"key2024072005471291200_ref024"},{"issue":"3","key":"key2024072005471291200_ref025","doi-asserted-by":"publisher","first-page":"1395","DOI":"10.1007\/s11192-018-2703-0","article-title":"Predicting long-run citation counts for articles in top economics journals","volume":"115","year":"2018","journal-title":"Scientometrics"},{"issue":"3","key":"key2024072005471291200_ref026","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1108\/EL-10-2019-0253","article-title":"Author-related factors predicting citation counts of conference papers: focusing on computer and information science","volume":"38","year":"2020","journal-title":"The Electronic Library"},{"issue":"1","key":"key2024072005471291200_ref027","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1007\/s11192-007-1946-y","article-title":"Patterns of annual citation of highly cited articles and the prediction of their citation ranking: a comparison across subjects","volume":"77","year":"2008","journal-title":"Scientometrics"},{"first-page":"1172","article-title":"A deep learning methodology for citation count prediction with large-scale biblio-features","year":"2019","key":"key2024072005471291200_ref028"},{"issue":"5","key":"key2024072005471291200_ref029","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2021.102673","article-title":"A deep learning-based approach to constructing a domain sentiment lexicon: a case study in financial distress prediction","volume":"58","year":"2021","journal-title":"Information Processing and Management"},{"issue":"7645","key":"key2024072005471291200_ref030","doi-asserted-by":"publisher","first-page":"655","DOI":"10.1136\/bmj.39482.526713.BE","article-title":"Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study","volume":"336","year":"2008","journal-title":"British Medical Journal"},{"issue":"8","key":"key2024072005471291200_ref031","doi-asserted-by":"publisher","first-page":"6803","DOI":"10.1007\/s11192-021-04033-7","article-title":"A deep-learning based citation count prediction model with paper metadata semantic features","volume":"126","year":"2021","journal-title":"Scientometrics"},{"issue":"1","key":"key2024072005471291200_ref032","doi-asserted-by":"publisher","first-page":"785","DOI":"10.1007\/s11192-020-03759-0","article-title":"Impact of the reference list features on the number of citations","volume":"126","year":"2021","journal-title":"Scientometrics"},{"issue":"3","key":"key2024072005471291200_ref033","doi-asserted-by":"publisher","DOI":"10.1016\/j.joi.2020.101039","article-title":"Predicting the citation counts of individual papers via a BP neural network","volume":"14","year":"2020","journal-title":"Journal of Informetrics"},{"first-page":"411","article-title":"Citation semantic based approaches to identify article quality","year":"2013","key":"key2024072005471291200_ref034"},{"issue":"3","key":"key2024072005471291200_ref035","doi-asserted-by":"publisher","first-page":"704","DOI":"10.1108\/LHT-01-2021-0018","article-title":"Evolutions and trends of artificial intelligence (AI): research, output, influence and competition","volume":"40","year":"2022","journal-title":"Library Hi Tech"},{"issue":"1","key":"key2024072005471291200_ref036","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1007\/s11192-016-2161-5","article-title":"The effect of keyword repetition in abstract and keyword frequency per journal in predicting citation counts","volume":"110","year":"2017","journal-title":"Scientometrics"},{"issue":"1","key":"key2024072005471291200_ref037","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1016\/j.joi.2018.01.008","article-title":"Could scientists use Altmetric. com scores to predict longer term citation counts?","volume":"12","year":"2018","journal-title":"Journal of Informetrics"},{"issue":"2","key":"key2024072005471291200_ref038","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1016\/j.joi.2016.02.007","article-title":"A review of the literature on citation impact indicators","volume":"10","year":"2016","journal-title":"Journal of Informetrics"},{"issue":"3","key":"key2024072005471291200_ref039","doi-asserted-by":"publisher","first-page":"695","DOI":"10.1007\/s11192-011-0366-1","article-title":"Mining typical features for highly cited papers","volume":"87","year":"2011","journal-title":"Scientometrics"},{"issue":"6154","key":"key2024072005471291200_ref040","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1126\/science.1237825","article-title":"Quantifying long-term scientific impact","volume":"342","year":"2013","journal-title":"Science"},{"issue":"8","key":"key2024072005471291200_ref041","doi-asserted-by":"publisher","first-page":"6533","DOI":"10.1007\/s11192-021-04026-6","article-title":"Prediction and application of article potential citations based on nonlinear citation-forecasting combined model","volume":"126","year":"2021","journal-title":"Scientometrics"},{"issue":"6","key":"key2024072005471291200_ref042","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.115089","article-title":"A hybrid approach to classifying wikipedia article quality flaws with feature fusion framework","volume":"181","year":"2021","journal-title":"Expert Systems with Applications"},{"issue":"No. ahead-of-print","key":"key2024072005471291200_ref043","doi-asserted-by":"publisher","DOI":"10.1108\/LHT-06-2021-0198","article-title":"A bibliometric study on library and information science and information systems literature during 2010-2019","volume":"Vol. ahead-of-print","year":"2022","journal-title":"Library Hi Tech"},{"key":"key2024072005471291200_ref044","doi-asserted-by":"crossref","first-page":"92248","DOI":"10.1109\/ACCESS.2019.2927011","article-title":"Early prediction of scientific impact based on multi-bibliographic features and convolutional neural network","volume":"7","year":"2019","journal-title":"IEEE Access"},{"first-page":"51","article-title":"To better stand on the shoulder of giants","year":"2012","key":"key2024072005471291200_ref045"},{"issue":"1","key":"key2024072005471291200_ref046","doi-asserted-by":"publisher","first-page":"284","DOI":"10.1108\/LHT-12-2019-0239","article-title":"Do proceedings papers in science fields have higher impacts than those in the field of social science and humanities?","volume":"39","year":"2021","journal-title":"Library Hi Tech"},{"volume-title":"Machine Learning","year":"2021","key":"key2024072005471291200_ref047","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-15-1967-3"}],"container-title":["Library Hi Tech"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-06-2022-0305\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-06-2022-0305\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:14:44Z","timestamp":1753395284000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/lht\/article\/42\/4\/1366-1384\/1220256"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,6]]},"references-count":47,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2,6]]},"published-print":{"date-parts":[[2024,7,23]]}},"alternative-id":["10.1108\/LHT-06-2022-0305"],"URL":"https:\/\/doi.org\/10.1108\/lht-06-2022-0305","relation":{},"ISSN":["0737-8831"],"issn-type":[{"type":"print","value":"0737-8831"}],"subject":[],"published":{"date-parts":[[2023,2,6]]}}}