{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T16:26:38Z","timestamp":1770049598038,"version":"3.49.0"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2020,7,29]],"date-time":"2020-07-29T00:00:00Z","timestamp":1595980800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,4,19]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The automatic extraction of published relationships between molecular entities has important applications in many biomedical fields, ranging from Systems Biology to Personalized Medicine. Existing works focused on extracting relationships described in single articles or in single sentences. However, a single record is rarely sufficient to judge upon the biological correctness of a relation, as experimental evidence might be weak or only valid in a certain context. Furthermore, statements may be more speculative than confirmative, and different articles often contradict each other. Experts therefore always take the complete literature into account to take a reliable decision upon a relationship. It is an open research question how to do this effectively in an automatic manner.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We propose two novel relation extraction approaches which use recent representation learning techniques to create comprehensive models of biomedical entities or entity-pairs, respectively. These representations are learned by considering all publications from PubMed mentioning an entity or a pair. They are used as input for a neural network for classifying relations globally, i.e. the derived predictions are corpus-based, not sentence- or article based as in prior art. Experiments on the extraction of mutation\u2013disease, drug\u2013disease and drug\u2013drug relationships show that the learned embeddings indeed capture semantic information of the entities under study and outperform traditional methods by 4\u201329% regarding F1 score.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Source codes are available at: https:\/\/github.com\/mariosaenger\/bio-re-with-entity-embeddings.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa674","type":"journal-article","created":{"date-parts":[[2020,7,22]],"date-time":"2020-07-22T19:24:53Z","timestamp":1595445893000},"page":"236-242","source":"Crossref","is-referenced-by-count":10,"title":["Large-scale entity representation learning for biomedical relationship extraction"],"prefix":"10.1093","volume":"37","author":[{"given":"Mario","family":"S\u00e4nger","sequence":"first","affiliation":[{"name":"Computer Science Department, Knowledge Management in Bioinformatics, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]},{"given":"Ulf","family":"Leser","sequence":"additional","affiliation":[{"name":"Computer Science Department, Knowledge Management in Bioinformatics, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]}],"member":"286","published-online":{"date-parts":[[2020,7,29]]},"reference":[{"key":"2023051510593340500_btaa674-B1","doi-asserted-by":"crossref","first-page":"806","DOI":"10.1038\/nmeth.4000","article-title":"DoCM: a database of curated mutations in cancer","volume":"13","author":"Ainscough","year":"2016","journal-title":"Nat. Methods"},{"key":"2023051510593340500_btaa674-B2","doi-asserted-by":"crossref","first-page":"e0193094","DOI":"10.1371\/journal.pone.0193094","article-title":"Jointly learning word embeddings using a corpus and a knowledge base","volume":"13","author":"Alsuhaibani","year":"2018","journal-title":"PLoS One"},{"key":"2023051510593340500_btaa674-B3","doi-asserted-by":"crossref","first-page":"D948","DOI":"10.1093\/nar\/gky868","article-title":"The comparative toxicogenomics database: update 2019","volume":"47","author":"Davis","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023051510593340500_btaa674-B4","author":"Giuliano","year":"2006"},{"key":"2023051510593340500_btaa674-B5","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1038\/ng.3774","article-title":"CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer","volume":"49","author":"Griffith","year":"2017","journal-title":"Nat. Genet"},{"key":"2023051510593340500_btaa674-B6","doi-asserted-by":"crossref","first-page":"3604","DOI":"10.1093\/bioinformatics\/bth451","article-title":"Discovering patterns to extract protein-protein interactions from full texts","volume":"20","author":"Huang","year":"2004","journal-title":"Bioinformatics"},{"key":"2023051510593340500_btaa674-B7","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1186\/s12859-018-2200-8","article-title":"Relation extraction for biological pathway construction using node2vec","volume":"19","author":"Kim","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051510593340500_btaa674-B8","first-page":"1188","volume-title":"Proceedings of the 31st International Conference on Machine Learning, Volume 32 of Proceedings of Machine Learning Research","author":"Le","year":"2014"},{"key":"2023051510593340500_btaa674-B9","doi-asserted-by":"crossref","first-page":"2909","DOI":"10.1093\/bioinformatics\/btt474","article-title":"DNorm: disease name normalization with pairwise learning to rank","volume":"29","author":"Leaman","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051510593340500_btaa674-B10","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051510593340500_btaa674-B11","first-page":"3111","volume-title":"Advances in Neural Information Processing Systems","author":"Mikolov","year":"2013"},{"key":"2023051510593340500_btaa674-B12","doi-asserted-by":"crossref","first-page":"686","DOI":"10.1016\/j.sapharm.2014.11.004","article-title":"Quality of pharmacy-specific medical subject headings (MeSH) assignment in pharmacy journals indexed in MEDLINE","volume":"11","author":"Minguet","year":"2015","journal-title":"Res. Soc. Adm. Pharm"},{"key":"2023051510593340500_btaa674-B13","first-page":"195","author":"Newman-Griffis","year":"2018"},{"key":"2023051510593340500_btaa674-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1200\/PO.18.00371","article-title":"Comparative analysis of public knowledge bases for precision oncology","volume":"3","author":"Pallarz","year":"2019","journal-title":"JCO Precis. Oncol"},{"key":"2023051510593340500_btaa674-B15","first-page":"39","article-title":"Distributional semantics resources for biomedical text processing","author":"et","year":"2013","journal-title":"Proceedings of the 5th International Symposium on Languages in Biology and Medicine"},{"key":"2023051510593340500_btaa674-B16","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1186\/s12859-019-2958-3","article-title":"VIST \u2013 a Variant-Information search tool for precision oncology","volume":"20","author":"\u0160eva","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023051510593340500_btaa674-B17","doi-asserted-by":"crossref","first-page":"W585","DOI":"10.1093\/nar\/gks563","article-title":"GeneView: a comprehensive semantic search engine for PubMed","volume":"40","author":"Thomas","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023051510593340500_btaa674-B18","doi-asserted-by":"crossref","first-page":"1258","DOI":"10.1093\/bioinformatics\/btu795","article-title":"Computer-assisted curation of a human regulatory core network from the biological literature","volume":"31","author":"Thomas","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051510593340500_btaa674-B19","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1007\/978-1-62703-435-7_20","article-title":"PharmGKB: the pharmacogenomics knowledge base","volume":"1015","author":"Thorn","year":"2013","journal-title":"Methods Mol. Biol"},{"key":"2023051510593340500_btaa674-B20","doi-asserted-by":"crossref","first-page":"e1000837","DOI":"10.1371\/journal.pcbi.1000837","article-title":"A comprehensive benchmark of kernel methods to extract protein\u2013protein interactions from literature","volume":"6","author":"Tikk","year":"2010","journal-title":"PLoS Comput. Biol"},{"key":"2023051510593340500_btaa674-B21","doi-asserted-by":"crossref","first-page":"W518","DOI":"10.1093\/nar\/gkt441","article-title":"PubTator: a web-based text mining tool for assisting biocuration","volume":"41","author":"Wei","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023051510593340500_btaa674-B22","doi-asserted-by":"crossref","first-page":"D1074","DOI":"10.1093\/nar\/gkx1037","article-title":"DrugBank 5.0: a major update to the DrugBank database for 2018","volume":"46","author":"Wishart","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023051510593340500_btaa674-B23","doi-asserted-by":"crossref","first-page":"3444","DOI":"10.1093\/bioinformatics\/btw486","article-title":"Drug\u2013drug interaction extraction from biomedical literature using syntax convolutional neural network","volume":"32","author":"Zhao","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051510593340500_btaa674-B24","first-page":"1","article-title":"Biomedical relation extraction: from binary to complex","volume":"2014","author":"Zhou","year":"2014","journal-title":"Comput. Math. Methods Med"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa674\/33821961\/btaa674.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/2\/236\/50321346\/btaa674.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/2\/236\/50321346\/btaa674.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,10]],"date-time":"2024-08-10T15:30:10Z","timestamp":1723303810000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/2\/236\/5877941"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,7,29]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa674","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,1,15]]},"published":{"date-parts":[[2020,7,29]]}}}