{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:29:41Z","timestamp":1777854581288,"version":"3.51.4"},"reference-count":60,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2023,7,14]],"date-time":"2023-07-14T00:00:00Z","timestamp":1689292800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:p>\n                    Over the past two decades, databases and the tools to access them in a simple manner have become increasingly available, allowing historical and modern-day topics to be merged and studied. Throughout the recent COVID-19 pandemic, for example, many researchers have reflected on whether any lessons learned from the Spanish flu pandemic of 1918 could have been helpful in the present pandemic. Most studies using text-mining applications rarely use full-text journal articles. This article provides a methodology used to develop a full-text journal article corpus using the R\n                    <jats:italic toggle=\"yes\">fulltext<\/jats:italic>\n                    package. Using the proposed methodology, 2743 full-text journal articles were obtained. The aim of this article is to provide a methodology and supplementary codes for researchers to use the R\n                    <jats:italic toggle=\"yes\">fulltext<\/jats:italic>\n                    package to curate a full-text journal corpus.\n                  <\/jats:p>","DOI":"10.1177\/01655515231171362","type":"journal-article","created":{"date-parts":[[2023,7,15]],"date-time":"2023-07-15T01:20:35Z","timestamp":1689384035000},"page":"1457-1470","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Using R to develop a corpus of full-text journal articles"],"prefix":"10.1177","volume":"51","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1327-7004","authenticated-orcid":false,"given":"Billie","family":"Anderson","sequence":"first","affiliation":[{"name":"Henry W. Bloch School of Management, University of Missouri-Kansas City (UMKC), USA"}]},{"given":"Majid","family":"Bani-Yaghoub","sequence":"additional","affiliation":[{"name":"School of Science and Engineering, University of Missouri-Kansas City (UMKC), USA"}]},{"given":"Vagmi","family":"Kantheti","sequence":"additional","affiliation":[{"name":"University of Missouri-Kansas City (UMKC), USA"}]},{"given":"Scott","family":"Curtis","sequence":"additional","affiliation":[{"name":"University Libraries, University of Missouri - Kansas City (UMKC)"}]}],"member":"179","published-online":{"date-parts":[[2023,7,14]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1177\/1094428117719322"},{"key":"e_1_3_2_3_2","first-page":"329","article-title":"The application of text mining methods in innovation research: current state, evolution patterns, and development priorities","volume":"50","author":"Antons D","year":"2020","unstructured":"Antons D, Gr\u00fcnwald E, Cichy P, et al. The application of text mining methods in innovation research: current state, evolution patterns, and development priorities. RD Manage 2020; 50: 329\u2013351.","journal-title":"RD Manage"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-018-2103-8"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1038\/nj7612-457a"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.3000959"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbaa296"},{"key":"e_1_3_2_8_2","first-page":"1","article-title":"Application of text mining techniques on scholarly research articles: methods and tools","volume":"28","author":"Thakur K","year":"2022","unstructured":"Thakur K, Kumar V. Application of text mining techniques on scholarly research articles: methods and tools. New Rev Acad Libr 2022; 28: 1\u201325.","journal-title":"New Rev Acad Libr"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-67056-0_18"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1057\/jors.2009.137"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.enbuild.2021.110885"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1005962"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bav020"},{"key":"e_1_3_2_14_2","article-title":"Large-scale event extraction from literature with multi-level gene normalization","volume":"8","author":"Landeghem SV","year":"2013","unstructured":"Landeghem SV, Bj\u00f6rne J, Wei CH, et al. Large-scale event extraction from literature with multi-level gene normalization. PLoS ONE 2013; 8: e55814.","journal-title":"PLoS ONE"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth.4471"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gky355"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1186\/s13326-017-0113-5"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2009.11.001"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-10-46"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-11-492"},{"key":"e_1_3_2_21_2","article-title":"BEST: next-generation biomedical entity search tool for knowledge discovery from biomedical literature","volume":"11","author":"Lee S","year":"2016","unstructured":"Lee S, Kim D, Lee K, et al. BEST: next-generation biomedical entity search tool for knowledge discovery from biomedical literature. PLoS ONE 2016; 11: e0164680.","journal-title":"PLoS ONE"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.3390\/biochem1020007"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1080\/13614533.2020.1819352"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1009884"},{"key":"e_1_3_2_25_2","unstructured":"WHO COVID-19 research database: user guide and information https:\/\/www.who.int\/publications\/m\/item\/quick-search-guide-who-covid-19-database (accessed 9 February 2023)."},{"key":"e_1_3_2_26_2","unstructured":"WHO COVID-19 research database https:\/\/search.bvsalud.org\/global-literature-on-novel-coronavirus-2019-ncov\/ (accessed 9 February 2023)."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3166.3197"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.132.3434.1099"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.168.3929.335"},{"key":"e_1_3_2_30_2","first-page":"243","volume-title":"Proceedings of the 8th annual international ACM SIGIR conference on research and development in information retrieval \u2013 SIGIR \u201985","author":"Tong RM","unstructured":"Tong RM, Askman VN, Cunningham JF, et al. RUBRIC: an environment for full text information retrieval. In: Proceedings of the 8th annual international ACM SIGIR conference on research and development in information retrieval \u2013 SIGIR \u201985, Montreal, QC, Canada, 5\u20137 June 1985, pp. 243\u2013251. New York: ACM Press."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0011273"},{"key":"e_1_3_2_32_2","first-page":"214","article-title":"Towards electronic journals: realities for scientists, librarians & publishers","volume":"31","author":"Tenopir C","year":"2000","unstructured":"Tenopir C, King D, Campbell R. Towards electronic journals: realities for scientists, librarians & publishers. J Scholar Publ 2000; 31: 214.","journal-title":"J Scholar Publ"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1186\/1741-7015-10-124"},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","unstructured":"Lo K Wang LL Neumann M et al. S2ORC: the semantic scholar open research corpus http:\/\/arxiv.org\/abs\/1911.02782 (2020 accessed 26 April 2022).","DOI":"10.18653\/v1\/2020.acl-main.447"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-10-S2-S6"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btz070"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bas043"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bau074"},{"key":"e_1_3_2_39_2","unstructured":"Recology code and such: fulltext \u2013 a package to help you mine text https:\/\/recology.info\/2015\/08\/full-text\/ (2015 accessed 27 April 2022)."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.21105\/joss.00721"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12874-020-0897-3"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2021.685298"},{"key":"e_1_3_2_43_2","unstructured":"R-bloggers: fulltext v1: text-mining scholarly works https:\/\/www.r-bloggers.com\/2018\/01\/fulltext-v1-text-mining-scholarly-works\/ (2018 accessed 9 February 2023)."},{"key":"e_1_3_2_44_2","first-page":"77","article-title":"The openURL and SFX linking","volume":"44","author":"Lagace N","year":"2003","unstructured":"Lagace N. The openURL and SFX linking. Ser Libr 2003; 44: 77\u201389.","journal-title":"Ser Libr"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1080\/00987913.2020.1759361"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1001\/jama.1994.03510380059038"},{"key":"e_1_3_2_47_2","unstructured":"Chamberlain S. Chapter 1: fulltext manual https:\/\/books.ropensci.org\/fulltext\/ (2019 accessed 14 April 2022)."},{"key":"e_1_3_2_48_2","unstructured":"R-bloggers: fulltext: behind the scenes https:\/\/www.r-bloggers.com\/2020\/11\/fulltext-behind-the-scenes\/ (2020 accessed 28 April 2022)."},{"key":"e_1_3_2_49_2","unstructured":"DataCite. Pagination https:\/\/support.datacite.org\/docs\/pagination (accessed 15 April 2022)."},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.isci.2021.102155"},{"key":"e_1_3_2_51_2","first-page":"177","volume-title":"Proceedings of the 2013 ACM symposium on document engineering","author":"Constantin A","unstructured":"Constantin A, Pettifer S, Voronkov A. PDFX: fully-automated PDF-to-XML conversion of scientific literature. In: Proceedings of the 2013 ACM symposium on document engineering, Florence, 10\u201313 September 2013, pp. 177\u2013180. New York: ACM."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3012542"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10032-019-00317-0"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.5195\/jmla.2018.468"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1136\/bmjos-2020-100131"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1038\/d41586-019-02142-1"},{"key":"e_1_3_2_57_2","unstructured":"Hackaday. Malamud\u2019s general index: research gist no slap on the wrist https:\/\/hackaday.com\/2021\/11\/02\/malamuds-general-index-research-gist-no-slap-on-the-wrist\/ (2021 accessed 20 April 2022)."},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1108\/JD-06-2020-0090"},{"key":"e_1_3_2_59_2","first-page":"31","article-title":"Improving authentication and authorization: seamlessaccess and GetFTR","volume":"58","author":"Tay A","year":"2022","unstructured":"Tay A. Improving authentication and authorization: seamlessaccess and GetFTR. Libr Technol Rep 2022; 58: 31\u201339.","journal-title":"Libr Technol Rep"},{"key":"e_1_3_2_60_2","unstructured":"Geng Y Cao RM Han XP et al. Scientists are working overtime and at the weekends: comparison of publication downloading from copyrighted and pirated platforms http:\/\/arxiv.org\/abs\/2111.02664 (2021 accessed 28 April 2022)."},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.3145\/epi.2019.ene.12"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515231171362","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/01655515231171362","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515231171362","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:10:08Z","timestamp":1777504208000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/01655515231171362"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,14]]},"references-count":60,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["10.1177\/01655515231171362"],"URL":"https:\/\/doi.org\/10.1177\/01655515231171362","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,14]]}}}