{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T12:41:20Z","timestamp":1784896880815,"version":"3.55.0"},"reference-count":54,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2022,9,24]],"date-time":"2022-09-24T00:00:00Z","timestamp":1663977600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,11,19]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models in the general language domain, i.e. BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, such as BioBERT and PubMedBERT. While they have achieved great success on a variety of discriminative downstream biomedical tasks, the lack of generation ability constrains their application scope. In this paper, we propose BioGPT, a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature. We evaluate BioGPT on six biomedical natural language processing tasks and demonstrate that our model outperforms previous models on most tasks. Especially, we get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA, creating a new record. Our case study on text generation further demonstrates the advantage of BioGPT on biomedical literature to generate fluent descriptions for biomedical terms.<\/jats:p>","DOI":"10.1093\/bib\/bbac409","type":"journal-article","created":{"date-parts":[[2022,9,26]],"date-time":"2022-09-26T08:48:38Z","timestamp":1664182118000},"source":"Crossref","is-referenced-by-count":1012,"title":["BioGPT: generative pre-trained transformer for biomedical text generation and mining"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9062-3484","authenticated-orcid":false,"given":"Renqian","family":"Luo","sequence":"first","affiliation":[{"name":"Microsoft Research Asia , Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Liai","family":"Sun","sequence":"additional","affiliation":[{"name":"Peking University , Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9823-9033","authenticated-orcid":false,"given":"Yingce","family":"Xia","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia , Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9095-0776","authenticated-orcid":false,"given":"Tao","family":"Qin","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia , Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3672-5436","authenticated-orcid":false,"given":"Sheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Microsoft Research , Redmond, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9067-0918","authenticated-orcid":false,"given":"Hoifung","family":"Poon","sequence":"additional","affiliation":[{"name":"Microsoft Research , Redmond, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tie-Yan","family":"Liu","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia , Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2022,9,24]]},"reference":[{"key":"2022112111112296600_ref1","volume-title":"International Conference on Learning Representations","author":"Wang","year":"2019"},{"key":"2022112111112296600_ref2","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2022112111112296600_ref3","article-title":"Roberta: A robustly optimized bert pretraining approach","author":"Liu","year":"2019"},{"key":"2022112111112296600_ref4","volume-title":"International Conference on Learning Representations","author":"Clark","year":"2019"},{"key":"2022112111112296600_ref5","article-title":"Improving language understanding by generative pre-training","author":"Radford","year":"2018"},{"issue":"8","key":"2022112111112296600_ref6","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI blog"},{"key":"2022112111112296600_ref7","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Advances in neural information processing systems"},{"key":"2022112111112296600_ref8","volume-title":"Proceedings of the 18th BioNLP Workshop and Shared Task","author":"Peng"},{"issue":"1","key":"2022112111112296600_ref9","first-page":"1","article-title":"Domain-specific language model pretraining for biomedical natural language processing","volume":"3","author":"Yu","year":"2021","journal-title":"ACM Transactions on Computing for Healthcare (HEALTH)"},{"issue":"4","key":"2022112111112296600_ref10","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2019","journal-title":"Bioinformatics"},{"key":"2022112111112296600_ref11","article-title":"Gpt-3 models are poor few-shot learners in the biomedical domain","author":"Moradi","year":"2021"},{"key":"2022112111112296600_ref12","article-title":"Thinking about gpt-3 in-context learning for biomedical ie? think again","author":"Guti\u00e9rrez","year":"2022"},{"key":"2022112111112296600_ref13","article-title":"BioCreative V CDR task corpus: a resource for chemical disease relation extraction","author":"Li","journal-title":"Database : the journal of biological databases and curation"},{"key":"2022112111112296600_ref14","article-title":"Discovering drug-target interaction knowledge from biomedical literature","author":"Hou","year":"2021"},{"issue":"5","key":"2022112111112296600_ref15","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1016\/j.jbi.2013.07.011","article-title":"The ddi corpus: An annotated corpus with pharmacological substances and drug\u2013drug interactions","volume":"46","author":"Herrero-Zazo","year":"2013","journal-title":"J Biomed Inform"},{"key":"2022112111112296600_ref16","doi-asserted-by":"crossref","first-page":"2567","DOI":"10.18653\/v1\/D19-1259","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Jin","year":"2019"},{"issue":"3","key":"2022112111112296600_ref17","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/bioinformatics\/btv585","article-title":"Automatic semantic classification of scientific literature according to the hallmarks of cancer","volume":"32","author":"Baker","year":"2016","journal-title":"Bioinformatics"},{"key":"2022112111112296600_ref18","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Beltagy"},{"issue":"1","key":"2022112111112296600_ref19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"Mimic-iii, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Scientific data"},{"key":"2022112111112296600_ref20","article-title":"Electramed: a new pre-trained language representation model for biomedical nlp","author":"Miolo","year":"2021"},{"key":"2022112111112296600_ref21","article-title":"Dare: Data augmented relation extraction with gpt-2","author":"Papanikolaou","year":"2020"},{"key":"2022112111112296600_ref22","article-title":"Large language models are zero-shot clinical information extractors","author":"Agrawal","year":"2022"},{"key":"2022112111112296600_ref23","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.emnlp-main.303","article-title":"Global-to-local neural networks for document-level relation extraction","author":"Wang","year":"2020"},{"key":"2022112111112296600_ref24","doi-asserted-by":"crossref","first-page":"2370","DOI":"10.18653\/v1\/2021.findings-emnlp.204","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Cabot","year":"2021"},{"key":"2022112111112296600_ref25","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2022.bionlp-1.2","article-title":"A sequence-to-sequence approach for document-level relation extraction","author":"Giorgi","year":"2022"},{"key":"2022112111112296600_ref26","article-title":"Qanet: Combining local convolution with global self-attention for reading comprehension","author":"Yu","year":"2018"},{"key":"2022112111112296600_ref27","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.emnlp-main.523","article-title":"Luke: deep contextualized entity representations with entity-aware self-attention","author":"Yamada","year":"2020"},{"key":"2022112111112296600_ref28","first-page":"143","volume-title":"Proceedings of the 20th Workshop on Biomedical Language Processing","author":"Kanakarajan"},{"key":"2022112111112296600_ref29","doi-asserted-by":"crossref","first-page":"8003","DOI":"10.18653\/v1\/2022.acl-long.551","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Yasunaga","year":"2022"},{"issue":"1","key":"2022112111112296600_ref30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-015-0564-6","article-title":"An overview of the bioasq large-scale biomedical semantic indexing and question answering competition","volume":"16","author":"Tsatsaronis","year":"2015","journal-title":"BMC bioinformatics"},{"key":"2022112111112296600_ref31","volume-title":"Joint European Conference on Machine Learning and Knowledge Discovery in Databases","author":"Nentidis"},{"key":"2022112111112296600_ref32","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.207","article-title":"Specter: Document-level representation learning using citation-informed transformers","author":"Cohan","year":"2020"},{"key":"2022112111112296600_ref33","first-page":"2335","volume-title":"Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers","author":"Zeng","year":"2014"},{"key":"2022112111112296600_ref34","doi-asserted-by":"crossref","first-page":"207","DOI":"10.18653\/v1\/P16-2034","volume-title":"Proceedings of the 54th annual meeting of the association for computational linguistics (volume 2: Short papers)","author":"Zhou","year":"2016"},{"key":"2022112111112296600_ref35","doi-asserted-by":"crossref","first-page":"1361","DOI":"10.18653\/v1\/P19-1131","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Sun","year":"2019"},{"key":"2022112111112296600_ref36","doi-asserted-by":"crossref","first-page":"4054","DOI":"10.24963\/ijcai.2020\/561","volume-title":"Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence","author":"Yuan","year":"2020"},{"key":"2022112111112296600_ref37","first-page":"3787","volume-title":"Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence","author":"Liu","year":"2020"},{"key":"2022112111112296600_ref38","doi-asserted-by":"crossref","first-page":"1476","DOI":"10.18653\/v1\/2020.acl-main.136","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Wei","year":"2020"},{"key":"2022112111112296600_ref39","first-page":"1409","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Tsu-Jui","year":"2019"},{"key":"2022112111112296600_ref40","first-page":"1572","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics","author":"Wang"},{"key":"2022112111112296600_ref41","first-page":"185","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Yan"},{"key":"2022112111112296600_ref42","doi-asserted-by":"crossref","first-page":"506","DOI":"10.18653\/v1\/P18-1047","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Zeng","year":"2018"},{"key":"2022112111112296600_ref43","doi-asserted-by":"crossref","first-page":"236","DOI":"10.18653\/v1\/2020.findings-emnlp.23","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Zhang","year":"2020"},{"key":"2022112111112296600_ref44","volume-title":"Joint entity and relation extraction with set prediction networks","author":"Sui","year":"2020"},{"key":"2022112111112296600_ref45","article-title":"Reinforced mnemonic reader for machine reading comprehension","author":"Hu","year":"2017"},{"key":"2022112111112296600_ref46","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.18653\/v1\/P16-1162","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sennrich","year":"2016"},{"key":"2022112111112296600_ref47","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Advances in neural information processing systems"},{"key":"2022112111112296600_ref48","article-title":"Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing","author":"Liu","year":"2021"},{"key":"2022112111112296600_ref49","first-page":"4582","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li"},{"key":"2022112111112296600_ref50","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)","author":"Ott"},{"key":"2022112111112296600_ref51","volume-title":"International Conference on Learning Representations","author":"Kingma","year":"2015"},{"key":"2022112111112296600_ref52","first-page":"38","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Wolf"},{"key":"2022112111112296600_ref53","first-page":"7871","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis"},{"key":"2022112111112296600_ref54","article-title":"Scifive: a text-to-text transformer model for biomedical literature","author":"Phan","year":"2021"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/6\/bbac409\/47144271\/bbac409.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/6\/bbac409\/47144271\/bbac409.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T22:52:22Z","timestamp":1676501542000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac409\/6713511"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,24]]},"references-count":54,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,11,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac409","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,11]]},"published":{"date-parts":[[2022,9,24]]},"article-number":"bbac409"}}