{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T08:19:24Z","timestamp":1772525964080,"version":"3.50.1"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,9,21]],"date-time":"2020-09-21T00:00:00Z","timestamp":1600646400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,21]],"date-time":"2020-09-21T00:00:00Z","timestamp":1600646400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["PTDC\/CCI-BIO\/28685\/2017"],"award-info":[{"award-number":["PTDC\/CCI-BIO\/28685\/2017"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["UIDB\/00408\/2020"],"award-info":[{"award-number":["UIDB\/00408\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>Named Entity Linking systems are a powerful aid to the manual curation of digital libraries, which is getting increasingly costly and inefficient due to the information overload. 
Models based on the Personalized PageRank (PPR) algorithm are among the state-of-the-art approaches, but they perform poorly when the disambiguation graphs are sparse.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Findings<\/jats:title>\n<jats:p>This work proposes a Named Entity Linking framework, Relation Extraction for Entity Linking (REEL), that uses automatically extracted relations to overcome this limitation. Our method builds a disambiguation graph in which the nodes are the ontology candidates for the entities and the edges are added according to the relations established in the text, which the method extracts automatically. The PPR algorithm and the information content of each ontology are then applied to choose the candidate for each entity that maximises the coherence of the disambiguation graph. We evaluated the method on three gold standards: the subset of the CRAFT corpus with ChEBI annotations (CRAFT-ChEBI), the subset of the BC5CDR corpus with disease annotations from the MEDIC vocabulary (BC5CDR-Diseases) and the subset with chemical annotations from the CTD-Chemical vocabulary (BC5CDR-Chemicals). REEL achieved F1-scores of 85.8%, 80.9% and 90.3% on these gold standards, respectively, outperforming baseline approaches.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusions<\/jats:title>\n<jats:p>We demonstrated that Relation Extraction (RE) tools can improve Named Entity Linking by capturing semantic information that is expressed in text but missing from knowledge bases and using it to improve the disambiguation graph of Named Entity Linking models. 
REEL can be adapted to any text mining pipeline and potentially to any domain, as long as there is an ontology or other knowledge base available.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s13321-020-00461-4","type":"journal-article","created":{"date-parts":[[2020,9,21]],"date-time":"2020-09-21T13:03:52Z","timestamp":1600693432000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Linking chemical and disease entities to ontologies by integrating PageRank with extracted relations from literature"],"prefix":"10.1186","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1293-4199","authenticated-orcid":false,"given":"Pedro","family":"Ruas","sequence":"first","affiliation":[]},{"given":"Andre","family":"Lamurias","sequence":"additional","affiliation":[]},{"given":"Francisco M.","family":"Couto","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,21]]},"reference":[{"key":"461_CR1","unstructured":"MEDLINE: MEDLINE PubMed production statistics; 2019. https:\/\/www.nlm.nih.gov\/bsd\/medline_pubmed_production_stats.html. Accessed 15 Jan 2020"},{"key":"461_CR2","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/9780262527811.001.0001","volume-title":"Building ontologies with basic formal ontology","author":"R Arp","year":"2015","unstructured":"Arp R, Smith B, Spear AD (2015) Building ontologies with basic formal ontology. MIT Press, Cambridge"},{"key":"461_CR3","doi-asserted-by":"publisher","unstructured":"Rao D, McNamee P, Dredze M (2013) Entity linking: finding extracted entities in a knowledge base. In: Multi-source, multilingual information extraction and summarization. Theory and applications of natural language processing. pp 93\u2013115. 
https:\/\/doi.org\/10.1007\/978-3-642-28569-1_5","DOI":"10.1007\/978-3-642-28569-1_5"},{"issue":"1","key":"461_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1472-6947-15-S1-S4","volume":"15","author":"JG Zheng","year":"2015","unstructured":"Zheng JG, Howsmon D, Zhang B, Hahn J, McGuinness D, Hendler J, Ji H (2015) Entity linking for biomedical literature. BMC Med Inf Decis Making 15(1):1\u20139. https:\/\/doi.org\/10.1186\/1472-6947-15-S1-S4","journal-title":"BMC Med Inf Decis Making"},{"issue":"1","key":"461_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-019-3157-y","volume":"20","author":"A Lamurias","year":"2019","unstructured":"Lamurias A, Ruas P, Couto FM (2019) PPR-SSM: personalized PageRank and semantic similarity measures for entity linking. BMC Bioinform 20(1):1\u201312. https:\/\/doi.org\/10.1186\/s12859-019-3157-y","journal-title":"BMC Bioinform"},{"key":"461_CR6","unstructured":"Bunescu R, Pasca M (2006) Using encyclopedic knowledge for named entity disambiguation. In: Proceedings of the 11th conference of the European chapter of the association for. pp 9\u201316"},{"key":"461_CR7","unstructured":"Page L, Brin S, Motwani R, Winograd T (1998) The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab"},{"key":"461_CR8","doi-asserted-by":"publisher","DOI":"10.1186\/2047-217X-2-11","author":"MK Ganapathiraju","year":"2013","unstructured":"Ganapathiraju MK, Orii N (2013) Research prioritization through prediction of future impact on biomedical science: a position paper on inference-analytics. GigaScience. https:\/\/doi.org\/10.1186\/2047-217X-2-11","journal-title":"GigaScience"},{"key":"461_CR9","doi-asserted-by":"crossref","unstructured":"Alhelbawy A, Gaizauskas R (2014) Graph ranking for collective Named Entity Disambiguation. In: 52nd annual meeting of the association for computational linguistics, ACL 2014\u2014proceedings of the conference, vol. 
2, pp 75\u201380","DOI":"10.3115\/v1\/P14-2013"},{"issue":"4","key":"461_CR10","doi-asserted-by":"publisher","first-page":"459","DOI":"10.3233\/SW-170273","volume":"9","author":"Z Guo","year":"2018","unstructured":"Guo Z, Barbosa D (2018) Robust named entity disambiguation with random walks. Seman Web 9(4):459\u2013479. https:\/\/doi.org\/10.3233\/SW-170273","journal-title":"Seman Web"},{"key":"461_CR11","doi-asserted-by":"crossref","unstructured":"Pershina M, He Y, Grishman R (2015) Personalized page rank for named entity disambiguation. In: Human language technologies: the 2015 annual conference of the north american chapter of the ACL. pp 238\u2013243","DOI":"10.3115\/v1\/N15-1026"},{"key":"461_CR12","doi-asserted-by":"publisher","unstructured":"Ganea O-E, Hofmann T (2017) Deep joint entity disambiguation with local neural attention. In: Proceedings of the 2017 conference on empirical methods in natural language processing, Copenhagen, Denmark, September 7\u201311, 2017, pp 2619\u20132629. https:\/\/doi.org\/10.18653\/v1\/d17-1277","DOI":"10.18653\/v1\/d17-1277"},{"key":"461_CR13","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arxiv:1810.04805"},{"key":"461_CR14","doi-asserted-by":"publisher","first-page":"169434","DOI":"10.1109\/ACCESS.2019.2955498","volume":"7","author":"X Yin","year":"2019","unstructured":"Yin X, Huang Y, Zhou B, Li A, Lan L, Jia Y (2019) Deep entity linking via eliminating semantic ambiguity with BERT. IEEE Access 7:169434\u2013169445. https:\/\/doi.org\/10.1109\/ACCESS.2019.2955498","journal-title":"IEEE Access"},{"key":"461_CR15","unstructured":"Yamada I, Shindo H (2019) Pre-training of deep contextualized embeddings of words and entities for named entity disambiguation. arxiv:1909.00426"},{"key":"461_CR16","unstructured":"Arighi C, Hirschman L, Lemberger T, Bayer S, Liechti R, Comeau D, Wu C (2017) Bio-ID track overview. 
In: Proceedings of the BioCreative VI challenge evaluation workshop. pp 14\u201319"},{"issue":"18","key":"461_CR17","doi-asserted-by":"publisher","first-page":"2839","DOI":"10.1093\/bioinformatics\/btw343","volume":"32","author":"R Leaman","year":"2016","unstructured":"Leaman R, Lu Z (2016) TaggerOne: joint named entity recognition and normalization with semi-Markov Models. Bioinformatics 32(18):2839\u20132846. https:\/\/doi.org\/10.1093\/bioinformatics\/btw343","journal-title":"Bioinformatics"},{"issue":"22","key":"461_CR18","doi-asserted-by":"publisher","first-page":"2909","DOI":"10.1093\/bioinformatics\/btt474","volume":"29","author":"Z Lu","year":"2013","unstructured":"Lu Z, Leaman R, Dog RI (2013) DNorm: disease name normalization with pairwise learning to rank. Bioinformatics 29(22):2909\u20132917. https:\/\/doi.org\/10.1093\/bioinformatics\/btt474","journal-title":"Bioinformatics"},{"key":"461_CR19","doi-asserted-by":"publisher","unstructured":"D\u2019Souza J, Ng V (2015) Sieve-based entity linking for the biomedical domain. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (short papers). pp 297\u2013302. https:\/\/doi.org\/10.3115\/V1\/P15-2049","DOI":"10.3115\/V1\/P15-2049"},{"key":"461_CR20","unstructured":"Ji Z, Wei Q, Xu H (2019) BERT-based ranking for biomedical entity normalization. arxiv:1908.03548"},{"key":"461_CR21","doi-asserted-by":"publisher","unstructured":"Nguyen DB, Theobald M, Weikum G (2017) J-REED: joint relation extraction and entity disambiguation. In: Proceedings of the 2017 ACM on conference on information and knowledge management\u2014CIKM \u201917. pp 2227\u20132230. https:\/\/doi.org\/10.1145\/3132847.3133090","DOI":"10.1145\/3132847.3133090"},{"key":"461_CR22","doi-asserted-by":"publisher","unstructured":"Couto FM, Lamurias A (2018) Semantic similarity definition. 
Reference module in life sciences (January) 0\u201316: https:\/\/doi.org\/10.1016\/B978-0-12-809633-8.20401-9","DOI":"10.1016\/B978-0-12-809633-8.20401-9"},{"key":"461_CR23","doi-asserted-by":"publisher","unstructured":"Cohen KB, Verspoor K, Funk C, Bada M, Palmer M, Hunter LE (2017) The Colorado Richly Annotated Full Text (CRAFT) corpus: multi-model annotation in the biomedical domain. In: The handbook of linguistic annotation. https:\/\/doi.org\/10.1007\/978-94-024-0881-2","DOI":"10.1007\/978-94-024-0881-2"},{"key":"461_CR24","unstructured":"Corpus C (2018) CRAFT Corpus. https:\/\/github.com\/UCDenver-ccp\/CRAFT\/releases\/download\/3.0\/craft-3.0.zip. Accessed 1 Oct 2019"},{"key":"461_CR25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/database\/baw068","volume":"2016","author":"J Li","year":"2016","unstructured":"Li J, Sun Y, Johnson RJ, Sciaky D, Wei CH, Leaman R, Davis AP, Mattingly CJ, Wiegers TC, Lu Z (2016) BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database 2016:1\u201310. https:\/\/doi.org\/10.1093\/database\/baw068","journal-title":"Database"},{"key":"461_CR26","unstructured":"corpus BVC (2018) BioCreative V CDR Corpus. https:\/\/github.com\/JHnlp\/BioCreative-V-CDR-Corpus\/blob\/master\/CDR_Data.zip. Accessed 5 Jan 2020"},{"key":"461_CR27","doi-asserted-by":"publisher","first-page":"1214","DOI":"10.1093\/nar\/gkv1031","volume":"44","author":"J Hastings","year":"2016","unstructured":"Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, Turner S, Swainston N, Mendes P, Steinbeck C (2016) ChEBI in 2016: improved services and an expanding collection of metabolites. Nucleic Acids Res 44:1214\u20131219. 
https:\/\/doi.org\/10.1093\/nar\/gkv1031","journal-title":"Nucleic Acids Res"},{"key":"461_CR28","unstructured":"ChEBI: ChEBI Statistics (2019) https:\/\/www.ebi.ac.uk\/chebi\/statisticsForward.do. Accessed 1 Oct 2019"},{"key":"461_CR29","unstructured":"ChEBI: ChEBI ontology files, release 179 (2019) ftp:\/\/ftp.ebi.ac.uk\/pub\/databases\/chebi\/archive\/rel179\/ontology\/. Accessed 1 Oct 2019"},{"issue":"D1","key":"461_CR30","doi-asserted-by":"publisher","first-page":"948","DOI":"10.1093\/nar\/gky868","volume":"47","author":"AP Davis","year":"2019","unstructured":"Davis AP, Grondin CJ, Johnson RJ, Sciaky D, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ (2019) The comparative toxicogenomics database: update 2019. Nucleic Acids Res 47(D1):948\u2013954. https:\/\/doi.org\/10.1093\/nar\/gky868","journal-title":"Nucleic Acids Res"},{"key":"461_CR31","unstructured":"CTD: Comparative toxicogenomics database. Data Status: May 2020. (2020) http:\/\/www.ctdbase.org\/about\/dataStatus.go. Accessed 7 May 2020"},{"key":"461_CR32","unstructured":"CTD: CTD\u2019s MEDIC Disease vocabulary ontology file. (2020) http:\/\/www.ctdbase.org\/reports\/CTD_diseases.obo.gz. Accessed 2 May 2020"},{"key":"461_CR33","unstructured":"CTD: CTD\u2019s Chemical vocabulary ontology file. (2020) http:\/\/www.ctdbase.org\/reports\/CTD_chemicals.tsv.gz. Accessed 2 May 2020"},{"key":"461_CR34","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-018-2584-5","author":"A Lamurias","year":"2019","unstructured":"Lamurias A, Sousa D, Clarke LA, Couto FM (2019) BO-LSTM: classifying relations via long short-term memory networks along biomedical ontologies. BMC Bioinform. 
https:\/\/doi.org\/10.1186\/s12859-018-2584-5","journal-title":"BMC Bioinform"},{"issue":"5","key":"461_CR35","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1016\/j.jbi.2013.07.011","volume":"46","author":"M Herrero-Zazo","year":"2013","unstructured":"Herrero-Zazo M, Segura-Bedmar I, Mart\u00ednez P, Declerck T (2013) The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. J Biomed Inf 46(5):914\u2013920. https:\/\/doi.org\/10.1016\/j.jbi.2013.07.011","journal-title":"J Biomed Inf"},{"key":"461_CR36","doi-asserted-by":"crossref","unstructured":"Fogaras D, R\u00e1cz B (2004) Towards scaling fully personalized PageRank. In: Algorithms and models for the web-graph, vol 3243","DOI":"10.1007\/978-3-540-30216-2_9"},{"key":"461_CR37","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btz682","author":"J Lee","year":"2019","unstructured":"Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J (2019) BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 
https:\/\/doi.org\/10.1093\/bioinformatics\/btz682","journal-title":"Bioinformatics"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00461-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-020-00461-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00461-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,20]],"date-time":"2021-09-20T23:39:38Z","timestamp":1632181178000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-020-00461-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,21]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["461"],"URL":"https:\/\/doi.org\/10.1186\/s13321-020-00461-4","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,21]]},"assertion":[{"value":"18 June 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 September 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"57"}}