{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T10:26:39Z","timestamp":1784888799355,"version":"3.55.0"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T00:00:00Z","timestamp":1634256000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation","award":["CNS-1738645 and DRL-1837446"],"award-info":[{"award-number":["CNS-1738645 and DRL-1837446"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Healthcare"],"published-print":{"date-parts":[[2022,1,31]]},"abstract":"<jats:p>\n            Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. 
Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition. To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding &amp; Reasoning Benchmark) at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/aka.ms\/BLURB\">https:\/\/aka.ms\/BLURB<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3458754","type":"journal-article","created":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T01:39:53Z","timestamp":1634434793000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1284,"title":["Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing"],"prefix":"10.1145","volume":"3","author":[{"given":"Yu","family":"Gu","sequence":"first","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Robert","family":"Tinn","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hao","family":"Cheng","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Michael","family":"Lucas","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Naoto","family":"Usuyama","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, 
WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaodong","family":"Liu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2150-1747","authenticated-orcid":false,"given":"Tristan","family":"Naumann","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jianfeng","family":"Gao","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hoifung","family":"Poon","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2021,10,15]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-1909"},{"key":"e_1_3_2_3_2","volume-title":"Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2018, New Orleans, Louisiana, USA, June 5-6, 2018","author":"Apidianaki Marianna","year":"2018","unstructured":"Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, and Marine Carpuat (Eds.). 2018. Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2018, New Orleans, Louisiana, USA, June 5-6, 2018. Association for Computational Linguistics. https:\/\/www.aclweb.org\/anthology\/volumes\/S18-1\/."},{"issue":"8","key":"e_1_3_2_4_2","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-12-S8-S4","article-title":"BioCreative III interactive task: An overview","volume":"12","author":"Arighi Cecilia N.","year":"2011","unstructured":"Cecilia N. Arighi, Phoebe M. Roberts, Shashank Agarwal, Sanmitra Bhattacharya, Gianni Cesareni, Andrew Chatr-aryamontri, Simon Clematide, et\u00a0al. 2011. 
BioCreative III interactive task: An overview. BMC Bioinformatics 12, 8 (Oct. 2011), S4. https:\/\/doi.org\/10.1186\/1471-2105-12-S8-S4","journal-title":"BMC Bioinformatics"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.5555\/2145432.2145474"},{"issue":"24","key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"3973","DOI":"10.1093\/bioinformatics\/btx454","article-title":"Cancer hallmarks analytics tool (CHAT): A text mining approach to organize and evaluate scientific literature on cancer","volume":"33","author":"Baker Simon","year":"2017","unstructured":"Simon Baker, Imran Ali, Ilona Silins, Sampo Pyysalo, Yufan Guo, Johan H\u00f6gberg, Ulla Stenius, and Anna Korhonen. 2017. Cancer hallmarks analytics tool (CHAT): A text mining approach to organize and evaluate scientific literature on cancer. Bioinformatics 33, 24 (2017), 3973\u20133981.","journal-title":"Bioinformatics"},{"issue":"3","key":"e_1_3_2_7_2","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/bioinformatics\/btv585","article-title":"Automatic semantic classification of scientific literature according to the hallmarks of cancer","volume":"32","author":"Baker Simon","year":"2015","unstructured":"Simon Baker, Ilona Silins, Yufan Guo, Imran Ali, Johan H\u00f6gberg, Ulla Stenius, and Anna Korhonen. 2015. Automatic semantic classification of scientific literature according to the hallmarks of cancer. Bioinformatics 32, 3 (2015), 432\u2013440.","journal-title":"Bioinformatics"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1279"},{"key":"e_1_3_2_9_2","first-page":"3615","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing","author":"Beltagy Iz","year":"2019","unstructured":"Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. 
In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 3615\u20133620. https:\/\/doi.org\/10.18653\/v1\/D19-1371"},{"key":"e_1_3_2_10_2","volume-title":"Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval@ACL 2017, Vancouver, Canada, August 3-4, 2017","author":"Bethard Steven","year":"2017","unstructured":"Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel M. Cer, and David Jurgens (Eds.). 2017. Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval@ACL 2017, Vancouver, Canada, August 3-4, 2017. Association for Computational Linguistics. https:\/\/www.aclweb.org\/anthology\/volumes\/S17-2\/."},{"key":"e_1_3_2_11_2","volume-title":"Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2016, San Diego, CA, USA, June 16-17, 2016","author":"Bethard Steven","year":"2016","unstructured":"Steven Bethard, Daniel M. Cer, Marine Carpuat, David Jurgens, Preslav Nakov, and Torsten Zesch (Eds.). 2016. Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2016, San Diego, CA, USA, June 16-17, 2016. Association for Computational Linguistics. 
https:\/\/www.aclweb.org\/anthology\/volumes\/S16-1\/."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-015-0472-9"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3176258.3176316"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007379606734"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-017-1776-8"},{"key":"e_1_3_2_16_2","volume-title":"Proceedings of the 18th BioNLP Workshop and Shared Task, BioNLP@ACL 2019, Florence, Italy, August 1, 2019","author":"Demner-Fushman Dina","year":"2019","unstructured":"Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, and Junichi Tsujii (Eds.). 2019. Proceedings of the 18th BioNLP Workshop and Shared Task, BioNLP@ACL 2019, Florence, Italy, August 1, 2019. Association for Computational Linguistics. https:\/\/www.aclweb.org\/anthology\/volumes\/W19-50\/."},{"key":"e_1_3_2_17_2","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers)","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers). 4171\u20134186."},{"key":"e_1_3_2_18_2","volume-title":"Proceedings of the 7th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2013, Atlanta, Georgia, USA, June 14-15, 2013","author":"Diab Mona T.","year":"2013","unstructured":"Mona T. Diab, Timothy Baldwin, and Marco Baroni (Eds.). 2013. Proceedings of the 7th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2013, Atlanta, Georgia, USA, June 14-15, 2013. 
Association for Computational Linguistics. https:\/\/www.aclweb.org\/anthology\/volumes\/S13-2\/."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.5555\/2598938.2599127"},{"issue":"11","key":"e_1_3_2_20_2","doi-asserted-by":"crossref","first-page":"1279","DOI":"10.1093\/jamia\/ocz085","article-title":"ML-Net: Multi-label classification of biomedical texts with deep neural networks","volume":"26","author":"Du Jingcheng","year":"2019","unstructured":"Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, and Zhiyong Lu. 2019. ML-Net: Multi-label classification of biomedical texts with deep neural networks. Journal of the American Medical Informatics Association 26, 11 (2019), 1279\u20131285. https:\/\/doi.org\/10.1093\/jamia\/ocz085","journal-title":"Journal of the American Medical Informatics Association"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0092-8674(00)81683-9"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2013.07.011"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"e_1_3_2_25_2","volume-title":"Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Jia Robin","year":"2019","unstructured":"Robin Jia, Cliff Wong, and Hoifung Poon. 2019. Document-level N-ary relation extraction with multiscale representation learning. 
In Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT\u201919)."},{"key":"e_1_3_2_26_2","first-page":"2567","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing","author":"Jin Qiao","year":"2019","unstructured":"Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William Cohen, and Xinghua Lu. 2019. PubMedQA: A dataset for biomedical research question answering. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP\u201919). 2567\u20132577. https:\/\/doi.org\/10.18653\/v1\/D19-1259"},{"issue":"1","key":"e_1_3_2_27_2","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson Alistair E. W.","year":"2016","unstructured":"Alistair E. W. Johnson, Tom J. Pollard, Lu Shen, Li-Wei H. Lehman, Mengling Feng, Mohammad Ghassemi, Benjamin Moody, Peter Szolovits, Leo Anthony Celi, and Roger G. Mark. 2016. MIMIC-III, a freely accessible critical care database. Scientific Data 3, 1 (May 2016), 160035. https:\/\/doi.org\/10.1038\/sdata.2016.35","journal-title":"Scientific Data"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.5555\/1567594.1567610"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.5555\/2107691.2107693"},{"key":"e_1_3_2_30_2","first-page":"1","volume-title":"Proceedings of the Fifth BioCreative Challenge Evaluation Workshop, Sevilla, Spain","author":"Kim Sun","year":"2015","unstructured":"Sun Kim, Rezarta Islamaj Dogan, Andrew Chatr-aryamontri, Mike Tyers, W. John Wilbur, and Donald C. Comeau. 2015. Overview of BioCreative V BioC track. 
In Proceedings of the Fifth BioCreative Challenge Evaluation Workshop, Sevilla, Spain. 1\u20139."},{"key":"e_1_3_2_31_2","volume-title":"(ICLR\u201915)","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR\u201915). http:\/\/arxiv.org\/abs\/1412.6980."},{"key":"e_1_3_2_32_2","first-page":"141","volume-title":"Proceedings of the 6th BioCreative Challenge Evaluation Workshop","volume":"1","author":"Krallinger Martin","year":"2017","unstructured":"Martin Krallinger, Obdulia Rabal, Saber A. Akhondi, Mart\u0131n P\u00e9rez P\u00e9rez, Jes\u00fas Santamar\u00eda, G. P. Rodr\u00edguez, G. Tsatsaronis, et\u00a0al. 2017. Overview of the BioCreative VI Chemical-Protein Interaction Track. In Proceedings of the 6th BioCreative Challenge Evaluation Workshop, Vol. 1. 141\u2013146."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2012"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/645530.655813"},{"issue":"4","key":"e_1_3_2_35_2","first-page":"1234","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee Jinhyuk","year":"2019","unstructured":"Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2019. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 4 (2019), 1234\u20131240. https:\/\/doi.org\/10.1093\/bioinformatics\/btz682","journal-title":"Bioinformatics"},{"key":"e_1_3_2_36_2","article-title":"BioCreative V CDR task corpus: A resource for chemical disease relation extraction","author":"Li Jiao","year":"2016","unstructured":"Jiao Li, Yueping Sun, Robin J. Johnson, Daniela Sciaky, Chih-Hsuan Wei, Robert Leaman, Allan Peter Davis, Carolyn J. Mattingly, Thomas C. 
Wiegers, and Zhiyong Lu. 2016. BioCreative V CDR task corpus: A resource for chemical disease relation extraction. Database. Online, May 8, 2016.","journal-title":"Database."},{"key":"e_1_3_2_37_2","volume-title":"Semi-Supervised Learning for Natural Language","author":"Liang Percy","year":"2005","unstructured":"Percy Liang. 2005. Semi-Supervised Learning for Natural Language. Ph.D. Dissertation. Massachusetts Institute of Technology, Cambridge, MA."},{"key":"e_1_3_2_38_2","unstructured":"Xiaodong Liu Hao Cheng Pengcheng He Weizhu Chen Yu Wang Hoifung Poon and Jianfeng Gao. 2020. Adversarial training for large neural language models. arXiv:2004.08994."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1092"},{"key":"e_1_3_2_40_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692."},{"key":"e_1_3_2_41_2","article-title":"Overview of the gene ontology task at BioCreative IV","author":"Mao Yuqing","year":"2014","unstructured":"Yuqing Mao, Kimberly Van Auken, Donghui Li, Cecilia N. Arighi, Peter McQuilton, G. Thomas Hayman, Susan Tweedie, et\u00a0al. 2014. Overview of the gene ontology task at BioCreative IV. Database. Online, August 25, 2014.","journal-title":"Database."},{"key":"e_1_3_2_42_2","unstructured":"Tomas Mikolov Kai Chen Greg Corrado and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781."},{"key":"e_1_3_2_43_2","first-page":"553","volume-title":"Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases","author":"Nentidis Anastasios","year":"2019","unstructured":"Anastasios Nentidis, Konstantinos Bougiatiotis, Anastasia Krithara, and Georgios Paliouras. 2019. Results of the seventh edition of the BioASQ challenge. 
In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 553\u2013568."},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1006\/csla.1994.1001"},{"key":"e_1_3_2_45_2","first-page":"197","volume-title":"Proceedings of the Conference of the Association for Computational Linguistics,","volume":"2018","author":"Nye Benjamin","year":"2018","unstructured":"Benjamin Nye, Junyi Jessy Li, Roma Patel, Yinfei Yang, Iain J. Marshall, Ani Nenkova, and Byron C. Wallace. 2018. A corpus with multi-level annotations of patients, interventions and outcomes to support language processing for medical literature. In Proceedings of the Conference of the Association for Computational Linguistics, Vol. 2018. 197."},{"key":"e_1_3_2_46_2","doi-asserted-by":"crossref","first-page":"58","DOI":"10.18653\/v1\/W19-5006","volume-title":"Proceedings of the 18th BioNLP Workshop and Shared Task","author":"Peng Yifan","year":"2019","unstructured":"Yifan Peng, Shankai Yan, and Zhiyong Lu. 2019. Transfer learning in biomedical natural language processing: An evaluation of BERT and ELMo on ten benchmarking datasets. In Proceedings of the 18th BioNLP Workshop and Shared Task. 58\u201365. https:\/\/doi.org\/10.18653\/v1\/W19-5006"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_48_2","first-page":"2227","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)","author":"Peters Matthew","year":"2018","unstructured":"Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2227\u20132237. 
https:\/\/doi.org\/10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_49_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. https:\/\/www.cs.ubc.ca\/amuham01\/LING530\/papers\/radford2018improving.pdf."},{"issue":"8","key":"e_1_3_2_50_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"issue":"140","key":"e_1_3_2_51_2","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research 21, 140 (2020), 1\u201367. http:\/\/jmlr.org\/papers\/v21\/20-074.html.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1162"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocz096"},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/gb-2008-9-s2-s2","article-title":"Overview of BioCreative II gene mention recognition","author":"Smith Larry","year":"2008","unstructured":"Larry Smith, Lorraine K. Tanabe, Rie Johnson nee Ando, Cheng-Ju Kuo, I.-Fang Chung, Chun-Nan Hsu, Yu-Shi Lin, et\u00a0al. 2008. Overview of BioCreative II gene mention recognition. 
Genome Biology 9 (2008), S2.","journal-title":"Genome Biology"},{"issue":"14","key":"e_1_3_2_55_2","first-page":"i49\u2013i58","article-title":"BIOSSES: A semantic sentence similarity estimation system for the biomedical domain","volume":"33","author":"So\u011fanc\u0131o\u011flu Gizem","year":"2017","unstructured":"Gizem So\u011fanc\u0131o\u011flu, Hakime \u00d6zt\u00fcrk, and Arzucan \u00d6zg\u00fcr. 2017. BIOSSES: A semantic sentence similarity estimation system for the biomedical domain. Bioinformatics 33, 14 (2017), i49\u2013i58.","journal-title":"Bioinformatics"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3454581"},{"key":"e_1_3_2_58_2","volume-title":"(ICLR\u201919)","author":"Wang Alex","year":"2019","unstructured":"Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 2019 International Conference on Learning Representations (ICLR\u201919)."},{"key":"e_1_3_2_59_2","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Wang Hai","year":"2018","unstructured":"Hai Wang and Hoifung Poon. 2018. Deep probabilistic logic: A unifying framework for indirect supervision. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP\u201918)."},{"key":"e_1_3_2_60_2","first-page":"2644","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers)","author":"Xu Yichong","year":"2019","unstructured":"Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, and Jianfeng Gao. 2019. Multi-task learning with sample re-weighting for machine reading comprehension. 
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers). 2644\u20132655. https:\/\/doi.org\/10.18653\/v1\/N19-1271"},{"key":"e_1_3_2_61_2","first-page":"1819","volume-title":"IEEE Transactions on Knowledge and Data Engineering","volume":"26","author":"Zhang M.","year":"2014","unstructured":"M. Zhang and Z. Zhou. 2014. A review on multi-label learning algorithms. IEEE Transactions on Knowledge and Data Engineering 26, 8 (2014), 1819\u20131837. https:\/\/doi.org\/10.1109\/TKDE.2013.39"},{"issue":"5","key":"e_1_3_2_62_2","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1093\/bioinformatics\/btx659","article-title":"Drug\u2013drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths","volume":"34","author":"Zhang Yijia","year":"2018","unstructured":"Yijia Zhang, Wei Zheng, Hongfei Lin, Jian Wang, Zhihao Yang, and Michel Dumontier. 2018. Drug\u2013drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths. Bioinformatics 34, 5 (2018), 828\u2013835.","journal-title":"Bioinformatics"},{"key":"e_1_3_2_63_2","volume-title":"Proceedings of the 2015 IEEE International Conference on Computer Vision","author":"Zhu Yukun","year":"2015","unstructured":"Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. 
In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV\u201915)."}],"container-title":["ACM Transactions on Computing for Healthcare"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3458754","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3458754","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3458754","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:49:06Z","timestamp":1750268946000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3458754"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,15]]},"references-count":62,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,1,31]]}},"alternative-id":["10.1145\/3458754"],"URL":"https:\/\/doi.org\/10.1145\/3458754","relation":{},"ISSN":["2691-1957","2637-8051"],"issn-type":[{"value":"2691-1957","type":"print"},{"value":"2637-8051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,15]]},"assertion":[{"value":"2020-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}