{"status":"ok","message-type":"work","message-version":"1.0.0","message":{
  "indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:48:13Z","timestamp":1753876093526,"version":"3.41.2"},
  "reference-count":22,
  "publisher":"Oxford University Press (OUP)",
  "license":[{"start":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T00:00:00Z","timestamp":1661385600000},"content-version":"vor","delay-in-days":236,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],
  "funder":[{"name":"Labex DigiCosme","award":["ANR-11-LABEX-0045-DIGICOSME"],"award-info":[{"award-number":["ANR-11-LABEX-0045-DIGICOSME"]}]}],
  "content-domain":{"domain":[],"crossmark-restriction":false},
  "short-container-title":[],
  "published-print":{"date-parts":[[2022,8,25]]},
  "abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Collecting relations between chemicals and drugs is crucial in biomedical research. The pre-trained transformer model, e.g. Bidirectional Encoder Representations from Transformers (BERT), is shown to have limitations on biomedical texts; more specifically, the lack of annotated data makes relation extraction (RE) from biomedical texts very challenging. In this paper, we hypothesize that enriching a pre-trained transformer model with syntactic information may help improve its performance on chemical\u2013drug RE tasks. For this purpose, we propose three syntax-enhanced models based on the domain-specific BioBERT model: Chunking-Enhanced-BioBERT and Constituency-Tree-BioBERT in which constituency information is integrated and a Multi-Task-Learning framework Multi-Task-Syntactic (MTS)-BioBERT in which syntactic information is injected implicitly by adding syntax-related tasks as training objectives. Besides, we test an existing model Late-Fusion which is enhanced by syntactic dependency information and build ensemble systems combining syntax-enhanced models and non-syntax-enhanced models. Experiments are conducted on the BioCreative VII DrugProt corpus, a manually annotated corpus for the development and evaluation of RE systems. Our results reveal that syntax-enhanced models in general degrade the performance of BioBERT in the scenario of biomedical RE but improve the performance when the subject\u2013object distance of candidate semantic relation is long. We also explore the impact of quality of dependency parses. [Our code is available at: https:\/\/github.com\/Maple177\/syntax-enhanced-RE\/tree\/drugprot (for only MTS-BioBERT); https:\/\/github.com\/Maple177\/drugprot-relation-extraction (for the rest of experiments)]<\/jats:p>\n               <jats:p>Database URL https:\/\/github.com\/Maple177\/drugprot-relation-extraction<\/jats:p>",
  "DOI":"10.1093\/database\/baac070",
  "type":"journal-article",
  "created":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T16:26:45Z","timestamp":1661444805000},
  "source":"Crossref",
  "is-referenced-by-count":2,
  "title":["Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical\u2013drug relation extraction?"],
  "prefix":"10.1093",
  "volume":"2022",
  "author":[
    {"given":"Anfu","family":"Tang","sequence":"first","affiliation":[{"name":"INRAE, MaIAGE, Universit\u00e9 Paris-Saclay , Domaine de Vilvert, Jouy-en-Josas 78352, France"},{"name":"CNRS, Laboratoire interdisciplinaire des sciences du num\u00e9rique, Universit\u00e9 Paris-Saclay , Campus universitaire b\u00e2t 507, Rue du Belved\u00e8re, Orsay 91405, France"}]},
    {"given":"Louise","family":"Del\u00e9ger","sequence":"additional","affiliation":[{"name":"INRAE, MaIAGE, Universit\u00e9 Paris-Saclay , Domaine de Vilvert, Jouy-en-Josas 78352, France"}]},
    {"given":"Robert","family":"Bossy","sequence":"additional","affiliation":[{"name":"INRAE, MaIAGE, Universit\u00e9 Paris-Saclay , Domaine de Vilvert, Jouy-en-Josas 78352, France"}]},
    {"given":"Pierre","family":"Zweigenbaum","sequence":"additional","affiliation":[{"name":"CNRS, Laboratoire interdisciplinaire des sciences du num\u00e9rique, Universit\u00e9 Paris-Saclay , Campus universitaire b\u00e2t 507, Rue du Belved\u00e8re, Orsay 91405, France"}]},
    {"given":"Claire","family":"N\u00e9dellec","sequence":"additional","affiliation":[{"name":"INRAE, MaIAGE, Universit\u00e9 Paris-Saclay , Domaine de Vilvert, Jouy-en-Josas 78352, France"}]}
  ],
  "member":"286",
  "published-online":{"date-parts":[[2022,8,25]]},
  "reference":[
    {"key":"2022082516263451700_R1","first-page":"6000","article-title":"Attention is all you need","author":"Vaswani","year":"2017","journal-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems"},
    {"key":"2022082516263451700_R2","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2019"},
    {"key":"2022082516263451700_R3","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},
    {"key":"2022082516263451700_R4","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/D19-1371","article-title":"SciBERT: pretrained language model for scientific text","author":"Beltagy","year":"2019"},
    {"article-title":"Tree-structured attention with hierarchical accumulation","year":"2020","author":"Nguyen","key":"2022082516263451700_R5"},
    {"key":"2022082516263451700_R6","doi-asserted-by":"publisher","first-page":"5027","DOI":"10.18653\/v1\/D18-1548","article-title":"Linguistically-informed self-attention for semantic role labeling","author":"Strubell","year":"2018"},
    {"key":"2022082516263451700_R7","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-main.228","article-title":"Do syntax trees help pre-trained transformers extract information?","author":"Sachan","year":"2021"},
    {"key":"2022082516263451700_R8","doi-asserted-by":"publisher","first-page":"4129","DOI":"10.18653\/v1\/N19-1419","article-title":"A structural probe for finding syntax in word representations","author":"Hewitt","year":"2019"},
    {"article-title":"Visualizing and measuring the geometry of BERT","year":"2019","author":"Coenen","key":"2022082516263451700_R9"},
    {"key":"2022082516263451700_R10","doi-asserted-by":"publisher","first-page":"35","DOI":"10.18653\/v1\/D17-1004","article-title":"Position-aware attention and supervised data improve slot filling","author":"Zhang","year":"2017"},
    {"key":"2022082516263451700_R11","first-page":"3842","article-title":"Learning to prune dependency trees with rethinking for neural relation extraction","author":"Bowen","year":"2020"},
    {"key":"2022082516263451700_R12","doi-asserted-by":"publisher","first-page":"241","DOI":"10.18653\/v1\/P19-1024","article-title":"Attention guided graph convolutional networks for relation extraction","author":"Guo","year":"2019"},
    {"key":"2022082516263451700_R13","doi-asserted-by":"publisher","first-page":"5412","DOI":"10.18653\/v1\/2021.acl-long.420","article-title":"Syntax-enhanced pre-trained model","author":"Zenan","year":"2021"},
    {"key":"2022082516263451700_R14","article-title":"Google\u2019s neural machine translation system: bridging the gap between human and machine translation","volume-title":"CoRR, Abs\/1609.08144","author":"Wu","year":"2016"},
    {"key":"2022082516263451700_R15","first-page":"2377","article-title":"Training very deep networks","author":"Srivastava","year":"2015","journal-title":"Proceedings of the 28th International Conference on Neural Information Processing Systems"},
    {"article-title":"Overview of DrugProt BioCreative VII track: quality evaluation and large scale text mining of drug-gene\/protein relations","year":"2021","author":"Miranda","key":"2022082516263451700_R16"},
    {"key":"2022082516263451700_R17","doi-asserted-by":"publisher","first-page":"1892","DOI":"10.1093\/jamia\/ocab090","article-title":"Biomedical and clinical English model packages for the Stanza Python NLP library","volume":"28","author":"Zhang","year":"2021","journal-title":"J. Am. Med. Informat. Assoc."},
    {"key":"2022082516263451700_R18","doi-asserted-by":"publisher","first-page":"i180","DOI":"10.1093\/bioinformatics\/btg1023","article-title":"Genia corpus - a semantically annotated corpus for bio-textmining","volume":"19","author":"Kim","year":"2003","journal-title":"Bioinformatics"},
    {"key":"2022082516263451700_R19","first-page":"2676","article-title":"Constituency parsing with a self-attentive encoder","author":"Kitaev","year":"2018"},
    {"key":"2022082516263451700_R20","doi-asserted-by":"publisher","first-page":"38","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"Transformers: state-of-the-art natural language processing","author":"Wolf","year":"2020"},
    {"article-title":"Pytorch: an imperative style, high-performance deep learning library","year":"2019","author":"Paszke","key":"2022082516263451700_R21"},
    {"key":"2022082516263451700_R22","article-title":"Adam: a method for stochastic optimization","volume-title":"CoRR, Abs\/1412.6980","author":"Kingma","year":"2015"}
  ],
  "container-title":["Database"],
  "original-title":[],
  "language":"en",
  "link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac070\/45535391\/baac070.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac070\/45535391\/baac070.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],
  "deposited":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T16:26:55Z","timestamp":1661444815000},
  "score":1,
  "resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baac070\/6675625"}},
  "subtitle":[],
  "short-title":[],
  "issued":{"date-parts":[[2022,1,1]]},
  "references-count":22,
  "URL":"https:\/\/doi.org\/10.1093\/database\/baac070",
  "relation":{},
  "ISSN":["1758-0463"],
  "issn-type":[{"type":"electronic","value":"1758-0463"}],
  "subject":[],
  "published-other":{"date-parts":[[2022,1,1]]},
  "published":{"date-parts":[[2022,1,1]]},
  "article-number":"baac070"}}
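The record above follows the Crossref "work" message layout: the payload sits under `message`, the title is wrapped in a single-element list, and authors are objects with `given`/`family` name parts. A minimal sketch of reading those fields with Python's standard `json` module, using a trimmed excerpt of this record (only `message.DOI`, `message.title`, and the first two `message.author` entries are kept here for brevity):

```python
import json

# Trimmed excerpt of the Crossref record above; real responses carry many
# more keys (abstract, reference, license, ...), all nested under "message".
raw = """
{"status": "ok",
 "message-type": "work",
 "message": {"DOI": "10.1093/database/baac070",
             "title": ["Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical-drug relation extraction?"],
             "author": [{"given": "Anfu", "family": "Tang"},
                        {"given": "Louise", "family": "Del\\u00e9ger"}]}}
"""

record = json.loads(raw)
msg = record["message"]

title = msg["title"][0]  # Crossref wraps the title in a list
doi = msg["DOI"]
# Authors are objects with "given"/"family" name parts.
authors = [f'{a["given"]} {a["family"]}' for a in msg["author"]]

print(doi)      # 10.1093/database/baac070
print(authors)  # ['Anfu Tang', 'Louise Deléger']
```

The same access pattern applies when the full record is fetched live (e.g. from `https://api.crossref.org/works/10.1093/database/baac070`); list-valued fields such as `title` and `author` should be checked for presence, since not every Crossref work carries them.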