{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:49:21Z","timestamp":1753876161806,"version":"3.41.2"},"reference-count":70,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2019,10,17]],"date-time":"2019-10-17T00:00:00Z","timestamp":1571270400000},"content-version":"vor","delay-in-days":289,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Foundation for Science and Technology","doi-asserted-by":"publisher","award":["SFRH\/BD\/137000\/2018","UID\/CEC\/00127\/2019"],"award-info":[{"award-number":["SFRH\/BD\/137000\/2018","UID\/CEC\/00127\/2019"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Integrated Programme of SR&TD \u2018SOCA\u2019","award":["CENTRO-01-0145-FEDER-000010"],"award-info":[{"award-number":["CENTRO-01-0145-FEDER-000010"]}]},{"DOI":"10.13039\/501100008530","name":"European Regional Development Fund","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100008530","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The scientific literature contains large amounts of information on genes, proteins, chemicals and their interactions. Extraction and integration of this information in curated knowledge bases help researchers support their experimental results, leading to new hypotheses and discoveries. This is especially relevant for precision medicine, which aims to understand the individual variability across patient groups in order to select the most appropriate treatments. 
Methods for improved retrieval and automatic relation extraction from biomedical literature are therefore required for collecting structured information from the growing number of published works. In this paper, we follow a deep learning approach for extracting mentions of chemical\u2013protein interactions from biomedical articles, based on various enhancements over our participation in the BioCreative VI CHEMPROT task. A significant aspect of our best method is the use of a simple deep learning model together with a very narrow representation of the relation instances, using only up to 10 words from the shortest dependency path and the respective dependency edges. Bidirectional long short-term memory recurrent networks or convolutional neural networks are used to build the deep learning models. We report the results of several experiments and show that our best model is competitive with more complex sentence representations or network structures, achieving an F1-score of 0.6306 on the test set. 
The source code of our work, along with detailed statistics, is publicly available.<\/jats:p>","DOI":"10.1093\/database\/baz095","type":"journal-article","created":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T11:59:42Z","timestamp":1584446382000},"source":"Crossref","is-referenced-by-count":6,"title":["Extraction of chemical\u2013protein interactions from the literature using neural networks and narrow instance representation"],"prefix":"10.1093","volume":"2019","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3533-8872","authenticated-orcid":false,"given":"Rui","family":"Antunes","sequence":"first","affiliation":[{"name":"Department of Electronics, Telecommunications and Informatics (DETI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1941-3983","authenticated-orcid":false,"given":"S\u00e9rgio","family":"Matos","sequence":"additional","affiliation":[{"name":"Department of Electronics, Telecommunications and Informatics (DETI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal"}]}],"member":"286","published-online":{"date-parts":[[2019,10,17]]},"reference":[{"key":"2020071623552563300_ref1","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1109\/TBME.2016.2573285","article-title":"Omic and electronic health record big data analytics for precision medicine","volume":"64","author":"Wu","year":"2017","journal-title":"IEEE Trans. Biomed. 
Eng."},{"key":"2020071623552563300_ref2","doi-asserted-by":"crossref","first-page":"baw119","DOI":"10.1093\/database\/baw119","article-title":"Overview of the interactive task in BioCreative V","volume":"2016","author":"Wang","year":"2016","journal-title":"Database"},{"key":"2020071623552563300_ref3","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1186\/1471-2105-14-281","article-title":"A modular framework for biomedical concept recognition","volume":"14","author":"Campos","year":"2013","journal-title":"BMC Bioinform."},{"key":"2020071623552563300_ref4","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.1093\/bioinformatics\/btt317","article-title":"BeCAS: biomedical concept recognition services and visualization","volume":"29","author":"Nunes","year":"2013","journal-title":"Bioinformatics"},{"key":"2020071623552563300_ref5","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1093\/bfgp\/elu015","article-title":"Event-based text mining for biology and functional genomics","volume":"14","author":"Ananiadou","year":"2015","journal-title":"Brief. Funct. Genomics"},{"key":"2020071623552563300_ref6","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1471-2105-12-S8-S3","article-title":"The protein\u2013protein interaction tasks of BioCreative III: classification\/ranking of articles and linking bio-ontology concepts to full text","volume":"12","author":"Krallinger","year":"2011","journal-title":"BMC Bioinform."},{"key":"2020071623552563300_ref7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1005017","article-title":"Text mining genotype\u2013phenotype relationships from biomedical literature for database curation and precision medicine","volume":"12","author":"Singhal","year":"2016","journal-title":"PLOS Comput. 
Biol."},{"key":"2020071623552563300_ref8","doi-asserted-by":"crossref","first-page":"7673","DOI":"10.1021\/acs.chemrev.6b00851","article-title":"Information retrieval and text mining technologies for chemistry","volume":"117","author":"Krallinger","year":"2017","journal-title":"Chem. Rev."},{"key":"2020071623552563300_ref9","first-page":"141","article-title":"Overview of the BioCreative VI chemical\u2013protein interaction Track","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Krallinger","year":"2017"},{"key":"2020071623552563300_ref10","first-page":"151","article-title":"Extracting chemical\u2013protein interactions using long short-term memory networks","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Matos","year":"2017"},{"key":"2020071623552563300_ref11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1000943","article-title":"Literature mining for the discovery of hidden connections between drugs, genes and diseases","volume":"6","author":"Frijters","year":"2010","journal-title":"PLOS Comput. Biol."},{"key":"2020071623552563300_ref12","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2020071623552563300_ref13","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.jbi.2017.08.001","article-title":"Word embeddings and recurrent neural networks based on long-short term memory nodes in supervised biomedical word sense disambiguation","volume":"73","author":"Jimeno-Yepes","year":"2017","journal-title":"J. Biomed. 
Inform."},{"key":"2020071623552563300_ref14","doi-asserted-by":"crossref","first-page":"1746","DOI":"10.3115\/v1\/D14-1181","article-title":"Convolutional neural networks for sentence classification","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Kim","year":"2014"},{"key":"2020071623552563300_ref15","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1109\/ICMLA.2017.0-134","article-title":"HDLTex: hierarchical deep learning for text classification","volume-title":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","author":"Kowsari","year":"2017"},{"key":"2020071623552563300_ref16","doi-asserted-by":"crossref","first-page":"i37","DOI":"10.1093\/bioinformatics\/btx228","article-title":"Deep learning with word embeddings improves biomedical named entity recognition","volume":"33","author":"Habibi","year":"2017","journal-title":"Bioinformatics"},{"key":"2020071623552563300_ref17","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1186\/s12859-017-1868-5","article-title":"Long short-term memory RNN for biomedical named entity recognition","volume":"18","author":"Lyu","year":"2017","journal-title":"BMC Bioinform."},{"key":"2020071623552563300_ref18","doi-asserted-by":"crossref","first-page":"39","DOI":"10.3115\/v1\/W15-1506","article-title":"Relation extraction: perspective from convolutional neural networks","volume-title":"Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing","author":"Nguyen","year":"2015"},{"key":"2020071623552563300_ref19","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1186\/s12859-017-1962-8","article-title":"Dependency-based long short term memory network for drug\u2013drug interaction extraction","volume":"18","author":"Wang","year":"2017","journal-title":"BMC Bioinform."},{"key":"2020071623552563300_ref20","first-page":"btx659","article-title":"Drug\u2013drug interaction 
extraction via hierarchical RNNs on sequence and shortest dependency paths","volume":"34","author":"Zhang","year":"2017","journal-title":"Bioinformatics"},{"key":"2020071623552563300_ref21","first-page":"baw032","article-title":"Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task.","author":"Wei","year":"2016"},{"key":"2020071623552563300_ref22","doi-asserted-by":"crossref","first-page":"baw048","DOI":"10.1093\/database\/baw048","article-title":"Exploiting syntactic and semantics information for chemical-disease relation extraction","volume":"2016","author":"Zhou","year":"2016","journal-title":"Database"},{"key":"2020071623552563300_ref23","first-page":"bax024","article-title":"Chemical-induced disease relation extraction via convolutional neural network","volume-title":"Database","author":"Jinghang","year":"2017"},{"key":"2020071623552563300_ref24","first-page":"147","article-title":"Chemical\u2013protein relation extraction with ensembles of SVM, CNN, and RNN models","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Peng","year":"2017"},{"key":"2020071623552563300_ref25","first-page":"bay073","article-title":"Extracting chemical\u2013protein relations with ensembles of SVM and deep learning models.","author":"Peng","year":"2018"},{"key":"2020071623552563300_ref26","first-page":"180","article-title":"Improving the learning of chemical\u2013protein interactions from literature using transfer learning and word embeddings","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Corbett","year":"2017"},{"key":"2020071623552563300_ref27","first-page":"61","article-title":"Chemlistem\u2014chemical named entity recognition using recurrent neural networks","volume-title":"Proceedings of the BioCreative V.5 Challenge Evaluation Workshop","author":"Corbett","year":"2017"},{"key":"2020071623552563300_ref28","first-page":"bay066","article-title":"Improving the 
learning of chemical\u2013protein interactions from literature using transfer learning and specialized word embeddings.","author":"Corbett","year":"2018"},{"key":"2020071623552563300_ref29","first-page":"175","article-title":"Combining support vector machines and LSTM networks for chemical\u2013protein relation extraction","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Mehryary","year":"2017"},{"key":"2020071623552563300_ref30","first-page":"bay120","article-title":"Potent pairing: ensemble of long short-term memory networks and support vector machine for chemical\u2013protein relation extraction.","author":"Mehryary","year":"2018"},{"key":"2020071623552563300_ref31","first-page":"190","article-title":"Chemical\u2013gene relation extraction using recursive neural network","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Lim","year":"2017"},{"key":"2020071623552563300_ref32","first-page":"bay060","article-title":"Chemical-gene relation extraction using recursive neural network","volume-title":"Database","author":"Lim","year":"2018"},{"key":"2020071623552563300_ref33","first-page":"159","article-title":"Extracting chemical\u2013protein interactions from literature","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Lung","year":"2017"},{"key":"2020071623552563300_ref34","first-page":"bay138","article-title":"Extracting chemical\u2013protein interactions from literature using sentence structure analysis and feature engineering","author":"Lung","year":"2019"},{"key":"2020071623552563300_ref35","first-page":"155","article-title":"Attention-based neural networks for chemical protein relation extraction","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Liu","year":"2017"},{"key":"2020071623552563300_ref36","first-page":"bay102","article-title":"Extracting chemical-protein relations using attention-based neural 
networks","author":"Liu","year":"2018"},{"key":"2020071623552563300_ref37","first-page":"187","article-title":"Predicting chemical protein relations with biaffine relation attention networks","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Verga","year":"2017"},{"article-title":"Neural machine translation by jointly learning to align and translate","year":"2014","author":"Bahdanau","key":"2020071623552563300_ref38"},{"key":"2020071623552563300_ref39","first-page":"5998","article-title":"Attention is all you need","volume-title":"31st Conference on Neural Information Processing Systems (NIPS 2017)","author":"Vaswani","year":"2017"},{"key":"2020071623552563300_ref40","first-page":"1480","article-title":"Hierarchical attention networks for document classification","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Yang","year":"2016"},{"key":"2020071623552563300_ref41","first-page":"2526","article-title":"Attention-based convolutional neural network for semantic relation extraction","volume-title":"Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers","author":"Shen","year":"2016"},{"key":"2020071623552563300_ref42","doi-asserted-by":"crossref","DOI":"10.1016\/j.ymeth.2019.02.021","article-title":"Exploring semi-supervised variational autoencoders for biomedical relation extraction","author":"Zhang","year":"2019","journal-title":"Methods"},{"key":"2020071623552563300_ref43","doi-asserted-by":"crossref","DOI":"10.1093\/database\/baz054","article-title":"Chemical\u2013protein interaction extraction via contextualized word representations and multihead attention","volume":"2019","author":"Zhang","year":"2019","journal-title":"Database"},{"key":"2020071623552563300_ref44","first-page":"bay108","article-title":"LPTK: a linguistic pattern-aware dependency tree kernel approach for 
the BioCreative VI CHEMPROT task.","author":"Warikoo","year":"2018"},{"key":"2020071623552563300_ref45","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-16-S16-S4","article-title":"TEES 2.2: biomedical event extraction for diverse corpora","volume":"16","author":"Bj\u00f6rne","year":"2015","journal-title":"BMC Bioinform."},{"key":"2020071623552563300_ref46","first-page":"209","article-title":"AKANE system: protein\u2013protein interaction pairs in the BioCreAtIvE2 challenge, PPI-IPS subtask","volume-title":"Proceedings of the Second BioCreative Challenge Evaluation Workshop","author":"S\u00e6tre","year":"2007"},{"key":"2020071623552563300_ref47","first-page":"173","article-title":"Coarse-to-fine n-best parsing and MaxEnt discriminative reranking","volume-title":"Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics","author":"Charniak","year":"2005"},{"key":"2020071623552563300_ref48","first-page":"101","article-title":"Self-training for biomedical parsing","volume-title":"Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers","author":"McClosky","year":"2008"},{"key":"2020071623552563300_ref49","doi-asserted-by":"crossref","first-page":"740","DOI":"10.3115\/v1\/D14-1082","article-title":"A fast and accurate dependency parser using neural networks","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Chen","year":"2014"},{"key":"2020071623552563300_ref50","first-page":"724","article-title":"A shortest path dependency kernel for relation extraction","volume-title":"Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing","author":"Bunescu","year":"2005"},{"key":"2020071623552563300_ref51","first-page":"1137","article-title":"A neural probabilistic language 
model","volume":"3","author":"Bengio","year":"2003","journal-title":"J. Mach. Learn. Res."},{"article-title":"Efficient estimation of word representations in vector space","year":"2013","author":"Mikolov","key":"2020071623552563300_ref52"},{"key":"2020071623552563300_ref53","first-page":"45","article-title":"Software framework for topic modelling with large corpora","volume-title":"Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks","author":"\u0158ehu\u0159ek","year":"2010"},{"key":"2020071623552563300_ref54","doi-asserted-by":"crossref","DOI":"10.1515\/jib-2017-0055","article-title":"Protein\u2013protein interaction article classification using a convolutional recurrent neural network with pre-trained word embeddings","volume":"14","author":"Matos","year":"2017","journal-title":"J. Integr. Bioinform."},{"key":"2020071623552563300_ref55","doi-asserted-by":"crossref","DOI":"10.1515\/jib-2017-0051","article-title":"Supervised learning and knowledge-based approaches applied to biomedical word sense disambiguation","volume":"14","author":"Antunes","year":"2017","journal-title":"J. Integr. Bioinform."},{"key":"2020071623552563300_ref56","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1162\/tacl_a_00051","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans. Assoc. Comput. 
Linguist."},{"article-title":"BioSentVec: creating sentence embeddings for biomedical texts","year":"2018","author":"Chen","key":"2020071623552563300_ref57"},{"key":"2020071623552563300_ref58","first-page":"160035","article-title":"MIMIC-III, a freely accessible critical care database.","author":"Johnson","year":"2016"},{"article-title":"Keras","year":"2015","author":"Chollet","key":"2020071623552563300_ref59"},{"key":"2020071623552563300_ref60","first-page":"265","article-title":"TensorFlow: a system for large-scale machine learning","volume-title":"12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi","year":"2016"},{"volume-title":"Deep Learning With Python","year":"2017","author":"Chollet","key":"2020071623552563300_ref61"},{"issue":"D1","key":"2020071623552563300_ref62","doi-asserted-by":"crossref","first-page":"D369","DOI":"10.1093\/nar\/gkw1102","article-title":"The BioGRID interaction database: 2017 update","volume":"45","author":"Chatr-aryamontri","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2020071623552563300_ref63","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0171929","article-title":"Extracting microRNA-gene relations from biomedical literature using distant supervision","volume":"12","author":"Lamurias","year":"2017","journal-title":"PLoS One"},{"key":"2020071623552563300_ref64","doi-asserted-by":"crossref","first-page":"496","DOI":"10.18653\/v1\/P18-1046","article-title":"DSGAN: generative adversarial training for distant supervision relation extraction","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Qin","year":"2018"},{"key":"2020071623552563300_ref65","article-title":"Manning, and Christopher Potts","volume-title":"A fast unified model for parsing and sentence understanding","author":"Bowman","year":"2016"},{"volume-title":"Neural networks for machine 
learning\u2014Lecture 6a\u2014Overview of mini-batch gradient descent","year":"2012","author":"Hinton","key":"2020071623552563300_ref66"},{"key":"2020071623552563300_ref67","first-page":"171","article-title":"Extracting chemical\u2013protein interactions via bidirectional long short-term memory network","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Wang","year":"2017"},{"key":"2020071623552563300_ref68","first-page":"163","article-title":"Knowledge-base-enriched relation extraction","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Tripodi","year":"2017"},{"key":"2020071623552563300_ref69","first-page":"167","article-title":"CTCPI\u2014convolution tree kernel-based chemical-protein interaction detection","volume-title":"Proceedings of the BioCreative VI Workshop","author":"Warikoo","year":"2017"},{"key":"2020071623552563300_ref70","first-page":"184","article-title":"CNN-based chemical\u2013protein interactions classification","volume-title":"Proceedings of the BioCreative VI 
Workshop","author":"Y\u00fcksel","year":"2017"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baz095\/32921657\/baz095.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baz095\/32921657\/baz095.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,4]],"date-time":"2021-03-04T17:48:26Z","timestamp":1614880106000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baz095\/5587825"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,1]]},"references-count":70,"URL":"https:\/\/doi.org\/10.1093\/database\/baz095","relation":{},"ISSN":["1758-0463"],"issn-type":[{"type":"electronic","value":"1758-0463"}],"subject":[],"published-other":{"date-parts":[[2019]]},"published":{"date-parts":[[2019,1,1]]},"article-number":"baz095"}}