{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T22:41:36Z","timestamp":1776120096563,"version":"3.50.1"},"reference-count":57,"publisher":"Emerald","issue":"2","license":[{"start":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T00:00:00Z","timestamp":1652140800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["LHT"],"published-print":{"date-parts":[[2023,6,1]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>How to extract useful information from a very large volume of literature is a great challenge for librarians. Topic modeling technique, which is a machine learning algorithm to uncover latent thematic structures from large collections of documents, is a widespread approach in literature analysis, especially with the rapid growth of academic literature. In this paper, a comparison of topic modeling based literature analysis has been done using full texts and abstracts of articles.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>The authors conduct a comparison study of topic modeling on full-text paper and corresponding abstract to assess the influence of the different types of documents been used as input for topic modeling. In particular, the authors use the large volumes of COVID-19 research literature as a case study for topic modeling based literature analysis. The authors illustrate the research topics, research trends and topic similarity of COVID-19 research by using Latent Dirichlet\u00a0allocation (LDA) and topic visualization method.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The authors found 14 research topics for COVID-19 research. The authors also found that the topic similarity between using full-text paper and corresponding abstract is higher when more documents are analyzed.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>First, this study contributes to the literature analysis approach. The comparison study can help us understand the influence of the different types of documents on the results of topic modeling analysis. Second, the authors present an overview of COVID-19 research by summarizing 14 research topics for it. This automated literature analysis can help specialists in the health and medical domain or other people to quickly grasp the structured morphology of the current studies for COVID-19.<\/jats:p><\/jats:sec>","DOI":"10.1108\/lht-03-2022-0144","type":"journal-article","created":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T22:55:17Z","timestamp":1652050517000},"page":"543-569","source":"Crossref","is-referenced-by-count":29,"title":["A comparison study of topic modeling based literature analysis by using full texts and abstracts of\u00a0scientific articles: a case of\u00a0COVID-19 research"],"prefix":"10.1108","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9890-323X","authenticated-orcid":false,"given":"Qiang","family":"Cao","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4802-6299","authenticated-orcid":false,"given":"Xian","family":"Cheng","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3572-6253","authenticated-orcid":false,"given":"Shaoyi","family":"Liao","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2022,5,10]]},"reference":[{"key":"key2023053016404960400_ref001","first-page":"13","article-title":"Evaluating topic coherence using distributional semantics","year":"2013"},{"key":"key2023053016404960400_ref002","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1111\/hir.12307","article-title":"'The COVID-19 (Coronavirus) pandemic: reflections on the roles of librarians and information professionals","volume":"37","year":"2020","journal-title":"Health Information and Libraries Journal"},{"key":"key2023053016404960400_ref003","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1016\/j.techfore.2014.01.007","article-title":"R&D partnerships: an exploratory approach to the role of structural variables in joint project performance","volume":"90","year":"2015","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref004","first-page":"115","volume-title":"Extracting Scientific Trends by Mining Topics from Call for Papers","year":"2019"},{"key":"key2023053016404960400_ref005","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1145\/2133806.2133826","article-title":"Probabilistic topic models","volume":"55","year":"2012","journal-title":"Communications of the ACM"},{"key":"key2023053016404960400_ref006","first-page":"993","article-title":"Latent Dirichlet\u00a0allocation","volume":"3","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"key2023053016404960400_ref007","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/S2213-2600(20)30056-4","article-title":"Coronavirus in China","volume":"8","year":"2020","journal-title":"The Lancet. Respiratory Medicine"},{"key":"key2023053016404960400_ref008","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1126\/science.1171022","article-title":"Revisiting the foundations of network analysis","volume":"325","year":"2009","journal-title":"Science"},{"key":"key2023053016404960400_ref009","article-title":"Using social media for actionable disease surveillance and outbreak management: a systematic literature review","volume":"10","year":"2015","journal-title":"PloS One"},{"key":"key2023053016404960400_ref010","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1016\/S0140-6736(20)30211-7","article-title":"'Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study","volume":"395","year":"2020","journal-title":"The Lancet"},{"key":"key2023053016404960400_ref011","unstructured":"CORD-19 (2020), \u201cCOVID-19 open research dataset challenge (CORD-19)\u201d, available at: https:\/\/www.kaggle.com\/allen-institute-for-ai\/CORD-19-research-challenge."},{"key":"key2023053016404960400_ref012","doi-asserted-by":"crossref","first-page":"102034","DOI":"10.1016\/j.ipm.2019.04.002","article-title":"An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit","volume":"57","year":"2020","journal-title":"Information Processing and Management"},{"key":"key2023053016404960400_ref013","doi-asserted-by":"crossref","first-page":"1707","DOI":"10.1016\/j.eswa.2007.01.035","article-title":"'Seeding the survey and analysis of research literature with text mining","volume":"34","year":"2008","journal-title":"Expert Systems with Applications"},{"key":"key2023053016404960400_ref014","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1016\/j.wpi.2010.12.005","article-title":"Patent data as indicators of wind power technology development","volume":"33","year":"2011","journal-title":"World Patent Information"},{"key":"key2023053016404960400_ref015","first-page":"280","article-title":"Identifying the evolutionary process of emerging technologies: a chronological network analysis of World Wide Web conference sessions","volume-title":"Technological Forecasting and Social Change","year":"2015"},{"key":"key2023053016404960400_ref016","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1108\/JD-05-2017-0069","article-title":"Long-term community development within a researcher network","volume":"74","year":"2018","journal-title":"Journal of Documentation"},{"key":"key2023053016404960400_ref017","first-page":"65","article-title":"LIS research across 50 years: content analysis of journal articles","volume":"78","year":"2021","journal-title":"Journal of Documentation"},{"key":"key2023053016404960400_ref018","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1016\/j.techfore.2018.05.010","article-title":"Identifying emerging Research and Business Development (R&BD) areas based on topic modeling and visualization with intellectual property right data","volume":"146","year":"2019","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref019","first-page":"1","article-title":"Top 100 cited articles in cardiovascular magnetic resonance: a bibliometric analysis","volume":"18","year":"2017","journal-title":"Journal of Cardiovascular Magnetic Resonance"},{"key":"key2023053016404960400_ref020","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1016\/j.techfore.2017.02.035","article-title":"Using the data mining method to assess the innovation gap: a case of industrial robotics in a catching-up country","volume":"119","year":"2017","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref021","doi-asserted-by":"crossref","first-page":"1164","DOI":"10.1016\/j.techfore.2011.03.022","article-title":"Literature-related discovery: potential treatments and preventatives for SARS","volume":"78","year":"2011","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref022","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1177\/0165551509353251","article-title":"Domain analysis with text mining: analysis of digital library research trends using profiling methods","volume":"36","year":"2010","journal-title":"Journal of Information Science"},{"key":"key2023053016404960400_ref023","doi-asserted-by":"crossref","first-page":"1761","DOI":"10.1007\/s11192-016-2135-7","article-title":"Subject\u2013method topic network analysis in communication studies","volume":"109","year":"2016","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref024","article-title":"A bibliometric analysis of topic modelling studies (2000-2017)","volume":"0","year":"2019","journal-title":"Journal of Information Science"},{"key":"key2023053016404960400_ref025","doi-asserted-by":"crossref","first-page":"1753","DOI":"10.1007\/s11192-019-03239-0","article-title":"Visual topical analysis of library and information science","volume":"121","year":"2019","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref026","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1007\/s11192-019-03132-w","article-title":"Complex network analysis of keywords co-occurrence in the recent efficiency analysis literature","volume":"120","year":"2019","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref027","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1007\/s11192-019-03274-x","article-title":"Application of entity linking to identify research fronts and trends","volume":"122","year":"2020","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref028","doi-asserted-by":"crossref","first-page":"1314","DOI":"10.1016\/j.eswa.2014.09.024","article-title":"Business intelligence in banking: a literature analysis from 2002 to 2013 using text mining and latent Dirichlet\u00a0allocation","volume":"42","year":"2015","journal-title":"Expert Systems with Applications"},{"key":"key2023053016404960400_ref029","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/j.jbusres.2019.01.053","article-title":"A text mining and topic modelling perspective of ethnic marketing research","volume":"103","year":"2019","journal-title":"Journal of Business Research"},{"key":"key2023053016404960400_ref030","article-title":"Topic extraction to provide an overview of research activities: the case of the high-temperature superconductor and simulation and modelling","volume":"0","year":"2020","journal-title":"Journal of Information Science"},{"key":"key2023053016404960400_ref031","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1007\/BF02457439","article-title":"Mapping the social and behavioral sciences world-wide: use of maps in portfolio analysis of national research efforts","volume":"40","year":"1997","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref032","first-page":"275","article-title":"Can abstract screening workload be reduced using text mining? User experiences of the tool Rayyan","volume-title":"Research Synthesis Methods","year":"2017"},{"key":"key2023053016404960400_ref033","first-page":"1","article-title":"Text-mining analysis of mHealth research","volume":"3","year":"2017","journal-title":"MHealth"},{"key":"key2023053016404960400_ref034","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1007\/s11192-016-1978-2","article-title":"The normalization of co-authorship networks in the bibliometric evaluation: the government stimulation programs of China and Korea","volume":"109","year":"2016","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref035","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.chb.2017.09.001","article-title":"Examining thematic similarity, difference, and membership in three online mental health communities from reddit: a text mining and visualization approach","volume":"78","year":"2018","journal-title":"Computers in Human Behavior"},{"key":"key2023053016404960400_ref036","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1016\/j.techfore.2017.02.027","article-title":"'Science foresight using life-cycle analysis, text mining and clustering: a case study on natural ventilation","volume":"118","year":"2017","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref037","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1111\/j.1468-2958.1988.tb00184.x","article-title":"Citation networks of communication journals, 1977-1985 cliques and positions, citations made and citations received","volume":"15","year":"1988","journal-title":"Human Communication Research"},{"key":"key2023053016404960400_ref038","first-page":"399","article-title":"Exploring the space of topic coherence measures","year":"2015"},{"key":"key2023053016404960400_ref039","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1007\/s11192-019-03125-9","article-title":"Discovering related scientific literature beyond semantic similarity: a new co-citation approach","volume":"120","year":"2019","journal-title":"Scientometrics"},{"key":"key2023053016404960400_ref040","article-title":"An overview of systematic literature reviews in social media marketing","volume":"0","year":"2019","journal-title":"Journal of Information Science"},{"key":"key2023053016404960400_ref041","doi-asserted-by":"crossref","first-page":"1013","DOI":"10.1016\/j.techfore.2006.05.020","article-title":"Text mining as a valuable tool in foresight exercises: a study on nanotechnology","volume":"73","year":"2006","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref042","first-page":"421","volume-title":"Measuring the Funding Landscape of COVID-19 Research","year":"2021"},{"key":"key2023053016404960400_ref043","first-page":"952","article-title":"Exploring topic coherence over many models and many topics","year":"2012"},{"key":"key2023053016404960400_ref044","first-page":"673","article-title":"Research output, intellectual structures and contributors of digital humanities research: a longitudinal analysis 2005-2020","volume":"78","year":"2021","journal-title":"Journal of Documentation"},{"key":"key2023053016404960400_ref045","doi-asserted-by":"crossref","first-page":"10049","DOI":"10.1016\/j.eswa.2012.02.042","article-title":"Applying text-mining to personalization and customization research literature \u2013 who, what and where?","volume":"39","year":"2012","journal-title":"Expert Systems with Applications"},{"key":"key2023053016404960400_ref046","first-page":"165","article-title":"Full-text or abstract? examining topic coherence scores using latent Dirichlet\u00a0allocation","year":"2017"},{"key":"key2023053016404960400_ref047","volume-title":"Research Methods for Business Students","year":"2009"},{"key":"key2023053016404960400_ref048","first-page":"207","article-title":"Towards a methodology for developing evidence-informed management knowledge by means of systematic review","volume":"14","year":"2003","journal-title":"British Journal of Management"},{"key":"key2023053016404960400_ref049","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1852102.1852106","article-title":"A similarity measure for indefinite rankings","volume":"28","year":"2010","journal-title":"ACM Transactions on Information Systems (TOIS)"},{"key":"key2023053016404960400_ref050","article-title":"A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts","volume-title":"PLoS Computational Biology","year":"2018"},{"key":"key2023053016404960400_ref051","doi-asserted-by":"crossref","first-page":"1606","DOI":"10.1111\/cobi.12605","article-title":"Text analysis tools for identification of emerging topics and research gaps in conservation science","volume":"29","year":"2015","journal-title":"Conservation Biology"},{"key":"key2023053016404960400_ref052","unstructured":"WHO (2020), \u201cNovel coronavirus (COVID-19) situation [WWW Document]\u201d, available at: https:\/\/www.who.int\/emergencies\/diseases\/novel-coronavirus-2019 (accessed 3 Janurary 20)."},{"key":"key2023053016404960400_ref053","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1016\/S0140-6736(20)30260-9","article-title":"'Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study","volume":"395","year":"2020","journal-title":"The Lancet"},{"key":"key2023053016404960400_ref054","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.techfore.2013.12.019","article-title":"\u2018Term clumping\u2019 for technical intelligence: a case study on dye-sensitized solar cells","volume":"85","year":"2014","journal-title":"Technological Forecasting and Social Change"},{"key":"key2023053016404960400_ref055","doi-asserted-by":"crossref","first-page":"518","DOI":"10.1016\/j.jclepro.2018.11.028","article-title":"How do low-carbon policies promote green diffusion among alliance-based firms in China? An evolutionary-game model of complex networks","volume":"210","year":"2019","journal-title":"Journal of Cleaner Production"},{"key":"key2023053016404960400_ref056","first-page":"495","volume-title":"A Dependency-Based Machine Learning Approach to the Identification of Research Topics: A Case in COVID-19 Studies\u2019","year":"2021"},{"key":"key2023053016404960400_ref057","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1108\/LHT-10-2017-0211","article-title":"Text mining based theme logic structure identification: application in library journals","volume":"36","year":"2018","journal-title":"Library Hi Tech"}],"container-title":["Library Hi Tech"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-03-2022-0144\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-03-2022-0144\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:14:17Z","timestamp":1753395257000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/lht\/article\/41\/2\/543-569\/454310"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,10]]},"references-count":57,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2022,5,10]]},"published-print":{"date-parts":[[2023,6,1]]}},"alternative-id":["10.1108\/LHT-03-2022-0144"],"URL":"https:\/\/doi.org\/10.1108\/lht-03-2022-0144","relation":{},"ISSN":["0737-8831"],"issn-type":[{"value":"0737-8831","type":"print"}],"subject":[],"published":{"date-parts":[[2022,5,10]]}}}