{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:48:47Z","timestamp":1753876127415,"version":"3.41.2"},"reference-count":62,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2024,10,21]],"date-time":"2024-10-21T00:00:00Z","timestamp":1729468800000},"content-version":"vor","delay-in-days":294,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["72004012 72074014"],"award-info":[{"award-number":["72004012 72074014"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,21]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The ever-increasing volume of COVID-19-related articles presents a significant challenge for the manual curation and multilabel topic classification of LitCovid. For this purpose, a novel multilabel topic classification framework is developed in this study, which considers both the correlation and imbalance of topic labels, while empowering the pretrained model. With the help of this framework, this study devotes to answering the following question: Do full texts, MeSH (Medical Subject Heading), and biological entities of articles about COVID-19 encode more discriminative information than metadata (title, abstract, keyword, and journal name)? 
From extensive experiments on our enriched version of the BC7-LitCovid corpus and Hallmarks of Cancer corpus, the following conclusions can be drawn. Our framework demonstrates superior performance and robustness. The metadata of scientific publications about COVID-19 carries valuable information for multilabel topic classification. Compared to biological entities, full texts and MeSH can further enhance the performance of our framework for multilabel topic classification, but the improved performance is very limited.<\/jats:p>\n               <jats:p>Database URL: https:\/\/github.com\/pzczxs\/Enriched-BC7-LitCovid<\/jats:p>","DOI":"10.1093\/database\/baae106","type":"journal-article","created":{"date-parts":[[2024,10,21]],"date-time":"2024-10-21T14:02:33Z","timestamp":1729519353000},"source":"Crossref","is-referenced-by-count":0,"title":["Is metadata of articles about COVID-19 enough for multilabel topic classification task?"],"prefix":"10.1093","volume":"2024","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8602-1819","authenticated-orcid":false,"given":"Shuo","family":"Xu","sequence":"first","affiliation":[{"name":"College of Economics and Management, Beijing University of Technology , No. 100 PingLeYuan, Chaoyang District, Beijing 100124,","place":["P.R. China"]}]},{"given":"Yuefu","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Economics and Management, Beijing University of Technology , No. 100 PingLeYuan, Chaoyang District, Beijing 100124,","place":["P.R. China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3235-9806","authenticated-orcid":false,"given":"Liang","family":"Chen","sequence":"additional","affiliation":[{"name":"Institute of Scientific and Technical Information of China , No. 15 Fuxing Road, Haidian District, Beijing 100038,","place":["P.R. China"]}]},{"given":"Xin","family":"An","sequence":"additional","affiliation":[{"name":"School of Economics and Management, Beijing Forestry University , No. 
35 Qinghua East Road, Haidian District, Beijing 100083,","place":["P.R. China"]}]}],"member":"286","published-online":{"date-parts":[[2024,10,21]]},"reference":[{"key":"2025030713453387800_R1","doi-asserted-by":"publisher","first-page":"D1534","DOI":"10.1093\/nar\/gkaa952","article-title":"LitCovid: an open database of COVID-19 literature","volume":"49","author":"Chen","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2025030713453387800_R2","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1038\/d41586-020-00694-1","article-title":"Keep up with the latest coronavirus research","volume":"579","author":"Chen","year":"2020","journal-title":"Nature"},{"key":"2025030713453387800_R3","doi-asserted-by":"publisher","first-page":"D1512","DOI":"10.1093\/nar\/gkac1005","article-title":"LitCovid in 2022: an information resource for the COVID-19 literature","volume":"51","author":"Chen","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2025030713453387800_R4","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baac069","article-title":"Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations","author":"Chen","year":"2022","journal-title":"Database"},{"key":"2025030713453387800_R5","doi-asserted-by":"publisher","first-page":"2584","DOI":"10.1109\/TCBB.2022.3173562","article-title":"LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation","volume":"19","author":"Chen","year":"2022","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2025030713453387800_R6","article-title":"Bert: pre-training of deep bidirectional transformers for language understanding","volume-title":"arXiv preprint, 
arXiv:1810.04805","author":"Devlin","year":"2018"},{"key":"2025030713453387800_R7","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"2025030713453387800_R8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3458754","article-title":"Domain-specific language model pretraining for biomedical natural language processing","volume":"3","author":"Gu","year":"2021","journal-title":"ACM Trans Comput Healthc"},{"key":"2025030713453387800_R9","doi-asserted-by":"publisher","first-page":"1040","DOI":"10.1108\/EL-09-2019-0207","article-title":"ML2S-SVM: multi-label least-squares support vector machine classifiers","volume":"37","author":"Xu","year":"2019","journal-title":"Electron Libr"},{"key":"2025030713453387800_R10","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baac103","article-title":"LitCovid ensemble learning for COVID-19 multi-label classification","author":"Gu","year":"2022","journal-title":"Database"},{"key":"2025030713453387800_R11","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baac056","article-title":"A BERT-based ensemble learning approach for the BioCreative VII challenges: full-text chemical identification and multi-label classification in PubMed articles","author":"Lin","year":"2022","journal-title":"Database"},{"article-title":"Team DUT914 at BioCreative VII LitCovid Track: A BioBERT-based feature enhancement approach","year":"2021","author":"Tang","key":"2025030713453387800_R12"},{"key":"2025030713453387800_R13","doi-asserted-by":"publisher","first-page":"3084","DOI":"10.1016\/j.patcog.2012.03.004","article-title":"An extensive experimental comparison of methods for multi-label learning","volume":"45","author":"Madjarov","year":"2012","journal-title":"Pattern 
Recognit"},{"article-title":"Detecting emotion in music","year":"2003","author":"Li","key":"2025030713453387800_R14"},{"key":"2025030713453387800_R15","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1007\/s10994-011-5256-5","article-title":"Classifier chains for multi-label classification","volume":"85","author":"Read","year":"2011","journal-title":"Mach Learn"},{"key":"2025030713453387800_R16","first-page":"145","article-title":"Pairwise preference learning and ranking","author":"F\u00fcrnkranz","year":"2003"},{"key":"2025030713453387800_R17","first-page":"999","article-title":"Multi-label learning by exploiting label dependency","author":"Zhang","year":"2010"},{"key":"2025030713453387800_R18","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1007\/s10994-008-5064-8","article-title":"Multilabel classification via calibrated label ranking","volume":"73","author":"F\u00fcrnkranz","year":"2008","journal-title":"Mach Learn"},{"key":"2025030713453387800_R19","doi-asserted-by":"publisher","first-page":"2096","DOI":"10.1016\/j.patcog.2015.01.004","article-title":"Scalable multi-output label prediction: from classifier chains to classifier trellises","volume":"48","author":"Read","year":"2015","journal-title":"Pattern Recognit"},{"key":"2025030713453387800_R20","first-page":"42","article-title":"Knowledge discovery in multi-label phenotype data","author":"Clare","year":"2001"},{"key":"2025030713453387800_R21","first-page":"1401","article-title":"A brief introduction to boosting","author":"Schapire","year":"1999"},{"key":"2025030713453387800_R22","doi-asserted-by":"publisher","first-page":"2038","DOI":"10.1016\/j.patcog.2006.12.019","article-title":"ML-KNN: a lazy learning approach to multi-label learning","volume":"40","author":"Zhang","year":"2007","journal-title":"Pattern Recognit"},{"key":"2025030713453387800_R23","first-page":"681","article-title":"A kernel method for multi-labelled 
classification","author":"Elisseeff","year":"2001"},{"key":"2025030713453387800_R24","doi-asserted-by":"publisher","first-page":"1338","DOI":"10.1109\/TKDE.2006.162","article-title":"Multilabel neural networks with applications to functional genomics and text categorization","volume":"18","author":"Zhang","year":"2006","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2025030713453387800_R25","doi-asserted-by":"crossref","DOI":"10.3115\/v1\/D14-1181","article-title":"Convolutional neural networks for sentence classification","volume-title":"arXiv preprint arXiv:1408.5882","author":"Kim","year":"2014"},{"key":"2025030713453387800_R26","first-page":"207","article-title":"Attention-based bidirectional long short-term memory networks for relation classification","author":"Zhou","year":"2016"},{"key":"2025030713453387800_R27","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1007\/s11192-021-04179-4","article-title":"PatentNet: multi-label classification of patent documents using deep learning based language understanding","volume":"127","author":"Haghighian Roudsari","year":"2022","journal-title":"Scientometrics"},{"key":"2025030713453387800_R28","article-title":"Bioformer: an efficient transformer language model for biomedical text mining","volume-title":"arXiv preprint, arXiv:2302.01588","author":"Fang","year":"2023"},{"article-title":"Team BJUT-BJFU at BioCreative VII LitCovid Track: A deep learning based method for multi-label topic classification in COVID-19 literature","year":"2021","author":"Xu","key":"2025030713453387800_R29"},{"key":"2025030713453387800_R30","article-title":"Fasttext. 
zip: compressing text classification models","volume-title":"arXiv preprint, arXiv:1612.03651","author":"Joulin","year":"2016"},{"key":"2025030713453387800_R31","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/978-981-97-8749-4_6","article-title":"Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets","volume":"9","author":"Xu","year":"2024","journal-title":"J Data Inf Sci"},{"key":"2025030713453387800_R32","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1007\/s11265-016-1137-2","article-title":"A classifier chain algorithm with k-means for multi-label classification on clouds","volume":"86","author":"Yu","year":"2017","journal-title":"J Signal Process Syst"},{"key":"2025030713453387800_R33","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1007\/s10115-021-01647-4","article-title":"Ensemble of classifier chains and decision templates for multi-label classification","volume":"64","author":"Freitas Rocha","year":"2022","journal-title":"Knowl Inf Syst"},{"key":"2025030713453387800_R34","first-page":"995","article-title":"Multi-label classification using ensembles of pruned sets","author":"Read","year":"2008"},{"key":"2025030713453387800_R35","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1109\/TKDE.2010.164","article-title":"Random k-labelsets for multilabel classification","volume":"23","author":"Tsoumakas","year":"2011","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2025030713453387800_R36","doi-asserted-by":"publisher","first-page":"1819","DOI":"10.1109\/TKDE.2013.39","article-title":"A review on multi-label learning algorithms","volume":"26","author":"Zhang","year":"2013","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2025030713453387800_R37","first-page":"65","article-title":"Learn from the information contained in the false splice sites as well as in the true splice sites using SVM","author":"Xu","year":"2007"},{"article-title":"CLaC at 
BioCreative VII LitCovid Track: Independent modules for multi-label classification of Covid articles","year":"2021","author":"Bagherzadeh","key":"2025030713453387800_R38"},{"key":"2025030713453387800_R39","doi-asserted-by":"publisher","first-page":"1427","DOI":"10.1007\/s11192-019-03162-4","article-title":"Types of DOI errors of cited references in web of science with a cleaning method","volume":"120","author":"Xu","year":"2019","journal-title":"Scientometrics"},{"key":"2025030713453387800_R40","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bat064","article-title":"BioC: a minimalist approach to interoperability for biomedical text processing","author":"Comeau","year":"2013","journal-title":"Database"},{"key":"2025030713453387800_R41","article-title":"Longformer: the long-document transformer","volume-title":"arXiv preprint, arXiv:2004.05150","author":"Beltagy","year":"2020"},{"key":"2025030713453387800_R42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-7-S1-S11","article-title":"A CRF-based system for recognizing chemical entity mentions (CEMs) in biomedical literature","volume":"7","author":"Xu","year":"2015","journal-title":"J Cheminf"},{"key":"2025030713453387800_R43","doi-asserted-by":"publisher","first-page":"W587","DOI":"10.1093\/nar\/gkz389","article-title":"PubTator central: automated concept annotation for biomedical full text articles","volume":"47","author":"Wei","year":"2019","journal-title":"Nucleic Acids Res"},{"author":"Mork","key":"2025030713453387800_R44","article-title":"The NLM medical text indexer system for indexing biomedical literature"},{"key":"2025030713453387800_R45","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-020-03583-6","article-title":"pyMeSHSim: an integrative python package for biomedical named entity recognition, normalization, and comparison of MeSH terms","volume":"21","author":"Luo","year":"2020","journal-title":"BMC 
Bioinf"},{"key":"2025030713453387800_R46","doi-asserted-by":"publisher","first-page":"i70","DOI":"10.1093\/bioinformatics\/btw294","article-title":"DeepMeSH: deep semantic representation for improving large-scale MeSH indexing","volume":"32","author":"Peng","year":"2016","journal-title":"Bioinformatics"},{"key":"2025030713453387800_R47","first-page":"406","article-title":"Random k-labelsets: an ensemble method for multilabel classification","author":"Tsoumakas","year":"2007"},{"key":"2025030713453387800_R48","doi-asserted-by":"publisher","first-page":"11515","DOI":"10.1016\/j.eswa.2011.03.028","article-title":"Feature subset selection using differential evolution and a statistical repair mechanism","volume":"38","author":"Khushaba","year":"2011","journal-title":"Expert Syst Appl"},{"key":"2025030713453387800_R49","article-title":"Efficient estimation of word representations in vector space","volume-title":"arXiv preprint, arXiv:1301.3781","author":"Mikolov","year":"2013"},{"key":"2025030713453387800_R50","first-page":"1532","article-title":"Glove: global vectors for word representation","author":"Pennington","year":"2014"},{"key":"2025030713453387800_R51","first-page":"19","article-title":"Aligning books and movies: Towards story-like visual explanations by watching movies and reading books","author":"Zhu","year":"2015"},{"key":"2025030713453387800_R52","article-title":"Roberta: A robustly optimized Bert pretraining approach","volume-title":"arXiv preprint, arXiv:1907.11692","author":"Liu","year":"2019"},{"key":"2025030713453387800_R53","doi-asserted-by":"publisher","DOI":"10.4018\/jdwm.2007070101","article-title":"Multi-label classification: an overview","volume":"3","author":"Tsoumakas","year":"2007","journal-title":"Int J Data Warehous Min"},{"article-title":"A Practical Guide to Support Vector 
Classification","year":"2003","author":"Hsu","key":"2025030713453387800_R54"},{"key":"2025030713453387800_R55","doi-asserted-by":"publisher","first-page":"415","DOI":"10.1109\/72.991427","article-title":"A comparison of methods for multiclass support vector machines","volume":"13","author":"Hsu","year":"2002","journal-title":"IEEE Trans Neural Netw"},{"key":"2025030713453387800_R56","doi-asserted-by":"publisher","first-page":"1279","DOI":"10.1093\/jamia\/ocz085","article-title":"ML-net: multi-label classification of biomedical texts with deep neural networks","volume":"26","author":"Du","year":"2019","journal-title":"J Am Med Inform Assoc"},{"article-title":"Team Bioformer at BioCreative VII LitCovid Track: multic-label topic classification for COVID-19 literature with a compact BERT model","year":"2021","author":"Fang","key":"2025030713453387800_R57"},{"key":"2025030713453387800_R58","doi-asserted-by":"publisher","first-page":"432","DOI":"10.1093\/bioinformatics\/btv585","article-title":"Automatic semantic classification of scientific literature according to the hallmarks of cancer","volume":"32","author":"Baker","year":"2016","journal-title":"Bioinformatics"},{"key":"2025030713453387800_R59","doi-asserted-by":"publisher","first-page":"2792","DOI":"10.1093\/bioinformatics\/btab042","article-title":"HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition","volume":"37","author":"Weber","year":"2021","journal-title":"Bioinformatics"},{"article-title":"CORD-19: The COVID-19 open research dataset","year":"2020","author":"Wang","key":"2025030713453387800_R60"},{"key":"2025030713453387800_R61","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0273725","article-title":"An active learning based approach for screening scholarly articles about the origins of SARS-CoV-2","volume":"17","author":"An","year":"2022","journal-title":"PLoS One"},{"key":"2025030713453387800_R62","article-title":"Leave no context behind: efficient infinite 
context transformers with infini-attention","volume-title":"arXiv preprint, arXiv:2404.07143","author":"Munkhdalai","year":"2024"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae106\/60003107\/baae106.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae106\/60003107\/baae106.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T13:46:29Z","timestamp":1741355189000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baae106\/7828987"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":62,"URL":"https:\/\/doi.org\/10.1093\/database\/baae106","relation":{},"ISSN":["1758-0463"],"issn-type":[{"type":"electronic","value":"1758-0463"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]},"article-number":"baae106"}}