{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:10:24Z","timestamp":1767337824161,"version":"build-2065373602"},"reference-count":60,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,7,4]],"date-time":"2022-07-04T00:00:00Z","timestamp":1656892800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>The COVID-19 pandemic has impacted daily lives around the globe. Since 2019, the amount of literature focusing on COVID-19 has risen exponentially. However, it is almost impossible for humans to read all of the studies and classify them. This article proposes a method of making an unsupervised model called a zero-shot classification model, based on the pre-trained BERT model. We used the CORD-19 dataset in conjunction with the LitCovid database to construct new vocabulary and prepare the test dataset. For NLI downstream task, we used three corpora: SNLI, MultiNLI, and MedNLI. We significantly reduced the training time by 98.2639% to build a task-specific machine learning model, using only one Nvidia Tesla V100. The final model can run faster and use fewer resources than its comparators. It has an accuracy of 27.84%, which is lower than the best-achieved accuracy by 6.73%, but it is comparable. Finally, we identified that the tokenizer and vocabulary more specific to COVID-19 could not outperform the generalized ones. Additionally, it was found that BART architecture affects the classification results.<\/jats:p>","DOI":"10.3390\/make4030030","type":"journal-article","created":{"date-parts":[[2022,7,4]],"date-time":"2022-07-04T11:15:05Z","timestamp":1656933305000},"page":"641-664","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Do We Need a Specific Corpus and Multiple High-Performance GPUs for Training the BERT Model? An Experiment on COVID-19 Dataset"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0978-7810","authenticated-orcid":false,"given":"Nontakan","family":"Nuntachit","sequence":"first","affiliation":[{"name":"Data Science Consortium, Faculty of Engineering, Chiang Mai University, Chiang Mai 50200, Thailand"},{"name":"Department of Internal Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5805-0866","authenticated-orcid":false,"given":"Prompong","family":"Sugunnasil","sequence":"additional","affiliation":[{"name":"College of Art, Media, and Technology, Chiang Mai University, Chiang Mai 50200, Thailand"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,4]]},"reference":[{"key":"ref_1","unstructured":"Wang, L.L., Lo, K., Chandrasekhar, Y., Reas, R., Yang, J., Eide, D., Funk, K., Kinney, R., Liu, Z., and Merrill, W. (2020). CORD-19: The COVID-19 Open Research Dataset. arXiv."},{"key":"ref_2","unstructured":"Pushp, P.K., and Srivastava, M.M. (2021, August 10). Train Once, Test Anywhere: Zero-Shot Learning for Text Classification. Available online: http:\/\/arxiv.org\/abs\/1712.05972."},{"key":"ref_3","unstructured":"(2022, February 27). COVID-19 Open Research Dataset Challenge (CORD-19)|Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/allen-institute-for-ai\/CORD-19-research-challenge\/code?datasetId=551982&searchQuery=zero-shot."},{"key":"ref_4","first-page":"830","article-title":"Importance of semantic representation: Dataless classification","volume":"2","author":"Chang","year":"2008","journal-title":"Proc. Natl. Conf. Artif. Intell."},{"key":"ref_5","unstructured":"Xian, Y., Lampert, C.H., Schiele, B., and Akata, Z. (2021, August 10). Zero-Shot Learning\u2014A Comprehensive Evaluation of the Good, the Bad and the Ugly. Available online: http:\/\/arxiv.org\/abs\/1707.00600."},{"key":"ref_6","first-page":"5998","article-title":"Attention Is All You Need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Processing Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. (2018, January 1\u20136). Deep Contextualized word representations. Proceedings of the NAACL HLT 2018\u20142018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LO, USA.","DOI":"10.18653\/v1\/N18-1202"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_9","unstructured":"Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the NAACL HLT 2019\u20142019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA. Available online: http:\/\/arxiv.org\/abs\/1810.04805."},{"key":"ref_10","unstructured":"Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. (2019). ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv."},{"key":"ref_11","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv."},{"key":"ref_12","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3458754","article-title":"Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing","volume":"3","author":"Gu","year":"2022","journal-title":"ACM Trans. Comput. Healthc."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Alsentzer, E., Murphy, J.R., Boag, W., Weng, W.H., Jin, D., Naumann, T., and McDermott, M. (2019). Publicly Available Clinical BERT Embeddings. arXiv.","DOI":"10.18653\/v1\/W19-1909"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/s41746-021-00455-y","article-title":"Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction","volume":"4","author":"Rasmy","year":"2021","journal-title":"NPJ Digit. Med."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Beltagy, I., Lo, K., and Cohan, A. (2019). SciBERT: A Pretrained Language Model for Scientific Text. arXiv.","DOI":"10.18653\/v1\/D19-1371"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Kiros, R., Zemel, R., Salakhutdinov, R., Urtasun, R., Torralba, A., and Fidler, S. (2015). Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books. arXiv.","DOI":"10.1109\/ICCV.2015.11"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Barbieri, F., Camacho-Collados, J., Neves, L., and Espinosa-Anke, L. (2020). TWEETEVAL: Unified benchmark and comparative evaluation for tweet classification. Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020, The Association for Computational Linguistics. Available online: https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.148.","DOI":"10.18653\/v1\/2020.findings-emnlp.148"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Romera-Paredes, B., and Torr, P.H.S. (2015, January 6\u201311). An embarrassingly simple approach to zero-shot learning. Proceedings of the 32nd International Conference on Machine Learning, Lille, France.","DOI":"10.1007\/978-3-319-50077-5_2"},{"key":"ref_22","unstructured":"(2022, March 03). Zero-Shot Learning in Modern NLP|Joe Davison Blog. Available online: https:\/\/joeddav.github.io\/blog\/2020\/05\/29\/ZSL.html."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019, January 3\u20137). Sentence-BERT: Sentence embeddings using siamese BERT-networks. Proceedings of the EMNLP-IJCNLP 2019\u20142019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Hong Kong, China.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_24","unstructured":"(2022, March 07). Cognitive Computation Group. Available online: https:\/\/cogcomp.seas.upenn.edu\/page\/resource_view\/89."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Yin, W., Hay, J., and Roth, D. (2019, January 3\u20137). Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach. Proceedings of the EMNLP-IJCNLP 2019\u20142019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Hong Kong, China.","DOI":"10.18653\/v1\/D19-1404"},{"key":"ref_26","unstructured":"(2022, June 16). ScienceDirect Search Results\u2014Keywords (Zero Shot Classification). Available online: https:\/\/www.sciencedirect.com\/search?qs=zero%20shot%20classification&articleTypes=FLA&lastSelectedFacet=articleTypes."},{"key":"ref_27","unstructured":"(2022, June 16). ScienceDirect Search Results\u2014Keywords (Zero Shot Classification). Available online: https:\/\/www.sciencedirect.com\/search?qs=zero%20shot%20classification&articleTypes=FLA&lastSelectedFacet=years&years=2022%2C2021%2C2020."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Chalkidis, I., Fergadiotis, M., and Androutsopoulos, I. (2021, January 7\u201311). MultiEURLEX\u2014A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer. Proceedings of the EMNLP 2021\u20142021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic.","DOI":"10.18653\/v1\/2021.emnlp-main.559"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Mahapatra, D., Bozorgtabar, B., and Ge, Z. (2021, January 11\u201317). Medical Image Classification Using Generalized Zero Shot Learning. Proceedings of the IEEE International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00373"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Huang, S.C., Shen, L., Lungren, M.P., and Yeung, S. (2021, January 11\u201317). GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition. Proceedings of the IEEE International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00391"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Mahapatra, D., Ge, Z., and Reyes, M. (IEEE Trans. Med. Imaging, 2022). Self-Supervised Generalized Zero Shot Learning For Medical Image Classification Using Novel Interpretable Saliency Maps, IEEE Trans. Med. Imaging, online ahead of print.","DOI":"10.1109\/TMI.2022.3163232"},{"key":"ref_32","unstructured":"(2022, February 27). Models\u2014Hugging Face. Available online: https:\/\/huggingface.co\/models?search=bert."},{"key":"ref_33","unstructured":"(2022, February 27). Models\u2014Hugging Face. Available online: https:\/\/huggingface.co\/models?pipeline_tag=zero-shot-classification&sort=downloads."},{"key":"ref_34","unstructured":"Lupart, S., Favre, B., Nikoulina, V., and Ait-Mokhtar, S. (2021, August 10). Zero-Shot and Few-Shot Classification of Biomedical Articles in Context of the COVID-19 Pandemic. Available online: www.aaai.org."},{"key":"ref_35","unstructured":"(2022, March 01). COVID-19 Open Research Dataset Challenge (CORD-19)|Kaggle. Available online: https:\/\/www.kaggle.com\/dataset\/08dd9ead3afd4f61ef246bfd6aee098765a19d9f6dbf514f0142965748be859b\/version\/87."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Bowman, S.R., Angeli, G., Potts, C., and Manning, C.D. (2015, January 17\u201321). A large annotated corpus for learning natural language inference. Proceedings of the Conference Proceedings\u2014EMNLP 2015: Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal.","DOI":"10.18653\/v1\/D15-1075"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Williams, A., Nangia, N., and Bowman, S.R. (2018, January 1\u20136). A broad-coverage challenge corpus for sentence understanding through inference. Proceedings of the NAACL HLT 2018\u20142018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LO, USA.","DOI":"10.18653\/v1\/N18-1101"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Romanov, A., and Shivade, C. (November, January 31). Lessons from Natural Language Inference in the Clinical Domain. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-1187"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"D1534","DOI":"10.1093\/nar\/gkaa952","article-title":"LitCovid: An open database of COVID-19 literature","volume":"49","author":"Chen","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"ref_40","unstructured":"(2022, March 26). joeddav\/xlm-roberta-large-xnli Hugging Face. Available online: https:\/\/huggingface.co\/joeddav\/xlm-roberta-large-xnli."},{"key":"ref_41","unstructured":"(2022, March 26). joeddav\/bart-large-mnli-yahoo-answers Hugging Face. Available online: https:\/\/huggingface.co\/joeddav\/bart-large-mnli-yahoo-answers."},{"key":"ref_42","unstructured":"(2022, March 26). digitalepidemiologylab\/covid-twitter-bert-v2-mnli Hugging Face. Available online: https:\/\/huggingface.co\/digitalepidemiologylab\/covid-twitter-bert-v2-mnli."},{"key":"ref_43","unstructured":"(2022, April 05). Tesla P100 Data Center Accelerator|NVIDIA. Available online: https:\/\/www.nvidia.com\/en-us\/data-center\/tesla-p100\/."},{"key":"ref_44","unstructured":"(2022, April 05). Comparison between NVIDIA GeForce and Tesla GPUs. Available online: https:\/\/www.microway.com\/knowledge-center-articles\/comparison-of-nvidia-geforce-gpus-and-nvidia-tesla-gpus\/."},{"key":"ref_45","unstructured":"NVIDIA Tesla P100 16 GB, vs. (2022, April 05). Titan Xp Comparison. Available online: https:\/\/www.gpuzoo.com\/Compare\/NVIDIA_Tesla_P100_16_GB__vs__NVIDIA_Titan_Xp\/."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2019). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.703"},{"key":"ref_47","unstructured":"(2022, April 07). BART training time Issue #1525 pytorch\/fairseq GitHub. Available online: https:\/\/github.com\/pytorch\/fairseq\/issues\/1525."},{"key":"ref_48","unstructured":"M\u00fcller, M., Salath\u00e9, M., and Kummervold, P.E. (2020). COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter. arXiv."},{"key":"ref_49","unstructured":"You, Y., Li, J., Reddi, S., Hseu, J., Kumar, S., Bhojanapalli, S., Song, X., Demmel, J., Keutzer, K., and Hsieh, C.J. (2019). Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. arXiv."},{"key":"ref_50","unstructured":"Wu, Y., Schuster, M., Chen, Z., Le, Q., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., and Macherey, K. (2022). Google\u2019s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv."},{"key":"ref_51","unstructured":"Guti\u00e9rrez, B.J., Zeng, J., Zhang, D., Zhang, P., and Su, Y. (2020). Document Classification for COVID-19 Literature. arXiv."},{"key":"ref_52","unstructured":"Beltagy, I., Peters, M.E., and Cohan, A. (2020). Longformer: The Long-Document Transformer. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Wahle, J.P., Ashok, N., Ruas, T., Meuschke, N., Ghosal, T., and Gipp, B. (\u2013, January 28). Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection. Proceedings of the Information for a Better World: Shaping the Global Future: 17th International Conference, iConference 2022, Virtual.","DOI":"10.22541\/au.167528154.41917807\/v1"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"106401","DOI":"10.1016\/j.dib.2020.106401","article-title":"A stance data set on polarized conversations on Twitter about the efficacy of hydroxychloroquine as a treatment for COVID-19","volume":"33","author":"Mutlu","year":"2020","journal-title":"Data in brief."},{"key":"ref_55","unstructured":"Cui, L., and Lee, D. (2020). CoAID: COVID-19 Healthcare Misinformation Dataset. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Zhou, X., Mulay, A., Ferrara, E., and Zafarani, R. (2020, January 19\u201323). ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research. Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM \u201920). Association for Computing Machinery, Virtual.","DOI":"10.1145\/3340531.3412880"},{"key":"ref_57","unstructured":"Memon, S.A., and Carley, K.M. (2020). Characterizing COVID-19 Misinformation Communities Using a Novel Twitter Dataset. arXiv."},{"key":"ref_58","unstructured":"Agarwal, I. (2021, April 20). COVID19FN. Available online: https:\/\/data.mendeley.com\/datasets\/b96v5hmfv6\/2."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Eren, M.E., Solovyev, N., Raff, E., Nicholas, C., and Johnson, B. (October, January 29). COVID-19 Kaggle Literature Organization. Proceedings of the ACM Symposium on Document Engineering, DocEng 2020, Virtual.","DOI":"10.1145\/3395027.3419591"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzm\u00e1n, F., Grave, E., Ott, M., Zettlemoyer, L., and Stoyanov, V. (2019). Unsupervised Cross-lingual Representation Learning at Scale. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.747"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/4\/3\/30\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:42:25Z","timestamp":1760139745000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/4\/3\/30"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,4]]},"references-count":60,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["make4030030"],"URL":"https:\/\/doi.org\/10.3390\/make4030030","relation":{},"ISSN":["2504-4990"],"issn-type":[{"type":"electronic","value":"2504-4990"}],"subject":[],"published":{"date-parts":[[2022,7,4]]}}}