{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T19:52:03Z","timestamp":1775245923545,"version":"3.50.1"},"reference-count":51,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2018,10,9]],"date-time":"2018-10-09T00:00:00Z","timestamp":1539043200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000690","name":"Research Councils UK","doi-asserted-by":"publisher","award":["MR\/M013049\/1"],"award-info":[{"award-number":["MR\/M013049\/1"]}],"id":[{"id":"10.13039\/501100000690","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Cancer Research UK Cambridge Institute Core","award":["C14303\/A17197"],"award-info":[{"award-number":["C14303\/A17197"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The overwhelming size and rapid growth of the biomedical literature make it impossible for scientists to read all studies related to their work, potentially leading to missed connections and wasted time and resources. Literature-based discovery (LBD) aims to alleviate these issues by identifying implicit links between disjoint parts of the literature. While LBD has been studied in depth since its introduction three decades ago, there has been limited work making use of recent advances in biomedical text processing methods in LBD.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present LION LBD, a literature-based discovery system that enables researchers to navigate published information and supports hypothesis generation and testing. The system is built with a particular focus on the molecular biology of cancer using state-of-the-art machine learning and natural language processing methods, including named entity recognition and grounding to domain ontologies covering a wide range of entity types and a novel approach to detecting references to the hallmarks of cancer in text. LION LBD implements a broad selection of co-occurrence based metrics for analyzing the strength of entity associations, and its design allows real-time search to discover indirect associations between entities in a database of tens of millions of publications while preserving the ability of users to explore each mention in its original context in the literature. Evaluations of the system demonstrate its ability to identify undiscovered links and rank relevant concepts highly among potential connections.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The LION LBD system is available via a web-based user interface and a programmable API, and all components of the system are made available under open licenses from the project home page http:\/\/lbd.lionproject.net.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty845","type":"journal-article","created":{"date-parts":[[2018,10,8]],"date-time":"2018-10-08T07:10:36Z","timestamp":1538982636000},"page":"1553-1561","source":"Crossref","is-referenced-by-count":56,"title":["LION LBD: a literature-based discovery system for cancer biology"],"prefix":"10.1093","volume":"35","author":[{"given":"Sampo","family":"Pyysalo","sequence":"first","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]},{"given":"Simon","family":"Baker","sequence":"additional","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]},{"given":"Imran","family":"Ali","sequence":"additional","affiliation":[{"name":"Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Stefan","family":"Haselwimmer","sequence":"additional","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]},{"given":"Tejas","family":"Shah","sequence":"additional","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]},{"given":"Andrew","family":"Young","sequence":"additional","affiliation":[{"name":"Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, UK"}]},{"given":"Yufan","family":"Guo","sequence":"additional","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]},{"given":"Johan","family":"H\u00f6gberg","sequence":"additional","affiliation":[{"name":"Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Ulla","family":"Stenius","sequence":"additional","affiliation":[{"name":"Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Masashi","family":"Narita","sequence":"additional","affiliation":[{"name":"Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, UK"}]},{"given":"Anna","family":"Korhonen","sequence":"additional","affiliation":[{"name":"Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,10,9]]},"reference":[{"key":"2023012806495642500_bty845-B1","first-page":"17","article-title":"Effective mapping of biomedical text to the UMLS metathesaurus: the MetaMap program","volume-title":"Proceedings of the AMIA Symposium","author":"Aronson","year":"2001"},{"key":"2023012806495642500_bty845-B2","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/W17-2339","article-title":"Initializing neural networks for hierarchical multi-label text classification","author":"Baker","year":"2017"},{"key":"2023012806495642500_bty845-B3","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/bioinformatics\/btv585","article-title":"Automatic semantic classification of scientific literature according to the hallmarks of cancer","volume":"32","author":"Baker","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012806495642500_bty845-B4","doi-asserted-by":"crossref","first-page":"3973","DOI":"10.1093\/bioinformatics\/btx454","article-title":"Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer","volume":"33","author":"Baker","year":"2017","journal-title":"Bioinformatics"},{"key":"2023012806495642500_bty845-B5","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B6","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gkm791","article-title":"ChEBI: a database and ontology for chemical entities of biological interest","volume":"36","author":"Degtyarenko","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B7","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1038\/nature10189","article-title":"Oncogene-induced Nrf2 transcription promotes ROS detoxification and tumorigenesis","volume":"475","author":"DeNicola","year":"2011","journal-title":"Nature"},{"key":"2023012806495642500_bty845-B8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jbi.2013.12.006","article-title":"NCBI disease corpus: a resource for disease name recognition and concept normalization","volume":"47","author":"Do\u011fan","year":"2014","journal-title":"J. Biomed. Inf."},{"key":"2023012806495642500_bty845-B9","doi-asserted-by":"crossref","first-page":"D136","DOI":"10.1093\/nar\/gkr1178","article-title":"The NCBI taxonomy database","volume":"40","author":"Federhen","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B10","volume-title":"Architectural styles and the design of network-based software architectures","author":"Fielding","year":"2000"},{"key":"2023012806495642500_bty845-B11","doi-asserted-by":"crossref","first-page":"fs5","DOI":"10.1126\/scisignal.aaa8398","article-title":"Integrating p38\u03b1-MAPK immune signals in non-immune cells","volume":"8","author":"Gaffen","year":"2015","journal-title":"Sci. Signal."},{"key":"2023012806495642500_bty845-B12","first-page":"116","article-title":"Toward discovery support systems: a replication, re-examination, and extension of swanson\u2019s work on literature-based discovery of a connection between raynaud\u2019s and fish oil","volume":"47","author":"Gordon","year":"1996","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"2023012806495642500_bty845-B13","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/S0092-8674(00)81683-9","article-title":"The hallmarks of cancer","volume":"100","author":"Hanahan","year":"2000","journal-title":"Cell"},{"key":"2023012806495642500_bty845-B14","doi-asserted-by":"crossref","first-page":"646","DOI":"10.1016\/j.cell.2011.02.013","article-title":"Hallmarks of cancer: the next generation","volume":"144","author":"Hanahan","year":"2011","journal-title":"Cell"},{"key":"2023012806495642500_bty845-B15","first-page":"1","article-title":"Bcl-2 is a critical mediator of intestinal transformation","volume":"7","author":"Heijden","year":"2016","journal-title":"Nat. Commun."},{"key":"2023012806495642500_bty845-B16","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1038\/ncb3397","article-title":"NOTCH1 mediates a switch between two distinct secretomes during senescence","volume":"18","author":"Hoare","year":"2016","journal-title":"Nat. Cell Biol."},{"key":"2023012806495642500_bty845-B17","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1016\/j.ijmedinf.2004.04.024","article-title":"Using literature-based discovery to identify disease candidate genes","volume":"74","author":"Hristovski","year":"2005","journal-title":"Int. J. Med. Inf."},{"key":"2023012806495642500_bty845-B18","doi-asserted-by":"crossref","first-page":"1190","DOI":"10.1093\/bioinformatics\/btr101","article-title":"A comprehensive protein-centric ID mapping service for molecular data integration","volume":"27","author":"Huang","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012806495642500_bty845-B19","first-page":"70","article-title":"Introduction to the bio-entity recognition task at JNLPBA","volume-title":"Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications","author":"Kim","year":"2004"},{"key":"2023012806495642500_bty845-B20","first-page":"1","article-title":"Senescent tumor cells lead the collective invasion in thyroid cancer","volume":"8","author":"Kim","year":"2017","journal-title":"Nat. Commun."},{"key":"2023012806495642500_bty845-B21","doi-asserted-by":"crossref","first-page":"S1","DOI":"10.1186\/1758-2946-7-S1-S1","article-title":"CHEMDNER: the drugs and chemical names extraction challenge","volume":"7","author":"Krallinger","year":"2015","journal-title":"J. Cheminf."},{"key":"2023012806495642500_bty845-B22","first-page":"574","article-title":"Literature-based discovery by lexical statistics","volume":"50","author":"Lindsay","year":"1999","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"2023012806495642500_bty845-B23","first-page":"265","article-title":"Medical subject headings (MeSH)","volume":"88","author":"Lipscomb","year":"2000","journal-title":"Bull. Med. Library Assoc."},{"key":"2023012806495642500_bty845-B24","doi-asserted-by":"crossref","first-page":"D54","DOI":"10.1093\/nar\/gki031","article-title":"Entrez Gene: gene-centered information at NCBI","volume":"33","author":"Maglott","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B25","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/gb-2008-9-s2-s3","article-title":"Overview of BioCreative II gene normalization","volume":"9","author":"Morgan","year":"2008","journal-title":"Genome Biol."},{"key":"2023012806495642500_bty845-B26","doi-asserted-by":"crossref","first-page":"D415","DOI":"10.1093\/nar\/gkt1173","article-title":"Protein Ontology: a controlled structured network of protein entities","volume":"42","author":"Natale","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B27","first-page":"7","article-title":"Towards semantic literature based discovery","volume-title":"2012 AAAI Fall Symposium Series: Information Retrieval and Knowledge Discovery in Biomedical Text","author":"Preiss","year":"2012"},{"key":"2023012806495642500_bty845-B28","article-title":"Web annotation data model","author":"Sanderson","year":"2017","journal-title":"W3C Recommendation"},{"key":"2023012806495642500_bty845-B29","doi-asserted-by":"crossref","first-page":"2498","DOI":"10.1101\/gr.1239303","article-title":"Cytoscape: a software environment for integrated models of biomolecular interaction networks","volume":"13","author":"Shannon","year":"2003","journal-title":"Genome Res."},{"key":"2023012806495642500_bty845-B30","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1007\/978-1-4614-3223-4_14","article-title":"Biomedical text mining: a survey of recent progress","volume-title":"Mining Text Data","author":"Simpson","year":"2012"},{"key":"2023012806495642500_bty845-B31","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1212\/WNL.46.2.583","article-title":"Indomethacin and Alzheimer\u2019s disease","volume":"46","author":"Smalheiser","year":"1996","journal-title":"Neurology"},{"key":"2023012806495642500_bty845-B32","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1212\/WNL.47.3.809","article-title":"Linking estrogen to Alzheimer\u2019s disease an informatics approach","volume":"47","author":"Smalheiser","year":"1996","journal-title":"Neurology"},{"key":"2023012806495642500_bty845-B33","doi-asserted-by":"crossref","first-page":"752","DOI":"10.1001\/archpsyc.55.8.752-a","article-title":"Calcium-independent phospholipase a2 and schizophrenia","volume":"55","author":"Smalheiser","year":"1998","journal-title":"Arch. Gen. Psychiatry"},{"key":"2023012806495642500_bty845-B34","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/gb-2008-9-s2-s2","article-title":"Overview of BioCreative II gene mention recognition","volume":"9","author":"Smith","year":"2008","journal-title":"Genome Biol."},{"key":"2023012806495642500_bty845-B35","doi-asserted-by":"crossref","first-page":"396","DOI":"10.1002\/asi.10389","article-title":"Text mining: generating hypotheses from MEDLINE","volume":"55","author":"Srinivasan","year":"2004","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"2023012806495642500_bty845-B36","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1353\/pbm.1986.0087","article-title":"Fish oil, raynaud\u2019s syndrome, and undiscovered public knowledge","volume":"30","author":"Swanson","year":"1986","journal-title":"Perspect. Biol. Med."},{"key":"2023012806495642500_bty845-B37","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1086\/601720","article-title":"Undiscovered public knowledge","volume":"56","author":"Swanson","year":"1986","journal-title":"Library Q."},{"key":"2023012806495642500_bty845-B38","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1002\/(SICI)1097-4571(198707)38:4<228::AID-ASI2>3.0.CO;2-G","article-title":"Two medical literatures that are logically but not bibliographically connected","volume":"38","author":"Swanson","year":"1987","journal-title":"J. Am. Soc. Inf. Sci."},{"key":"2023012806495642500_bty845-B39","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1353\/pbm.1988.0009","article-title":"Migraine and magnesium: eleven neglected connections","volume":"31","author":"Swanson","year":"1988","journal-title":"Perspect. Biol. Med."},{"key":"2023012806495642500_bty845-B40","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1353\/pbm.1990.0031","article-title":"Somatomedin c and arginine: implicit connections between mutually isolated literatures","volume":"33","author":"Swanson","year":"1990","journal-title":"Perspect. Biol. Med."},{"key":"2023012806495642500_bty845-B41","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/S0004-3702(97)00008-8","article-title":"An interactive system for finding complementary literatures: a stimulus to scientific discovery","volume":"91","author":"Swanson","year":"1997","journal-title":"Artif. Intell."},{"key":"2023012806495642500_bty845-B42","doi-asserted-by":"crossref","first-page":"2559","DOI":"10.1093\/bioinformatics\/btn469","article-title":"FACTA: a text search engine for finding associated biomedical concepts","volume":"24","author":"Tsuruoka","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012806495642500_bty845-B43","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1096\/fj.14-262659","article-title":"Lysophosphatidate signaling stabilizes nrf2 and increases the expression of genes involved in drug resistance and oxidative stress responses: implications for cancer treatment","volume":"29","author":"Venkatraman","year":"2015","journal-title":"FASEB J."},{"key":"2023012806495642500_bty845-B44","doi-asserted-by":"crossref","first-page":"548","DOI":"10.1002\/asi.1104","article-title":"Using concepts in literature-based discovery: simulating swanson\u2019s raynaud\u2013fish oil and migraine\u2013magnesium discoveries","volume":"52","author":"Weeber","year":"2001","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"2023012806495642500_bty845-B45","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1093\/bib\/6.3.277","article-title":"Online tools to support literature-based discovery in the life sciences","volume":"6","author":"Weeber","year":"2005","journal-title":"Brief. Bioinf."},{"key":"2023012806495642500_bty845-B46","doi-asserted-by":"crossref","first-page":"W518","DOI":"10.1093\/nar\/gkt441","article-title":"PubTator: a web-based text mining tool for assisting biocuration","volume":"41","author":"Wei","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B47","doi-asserted-by":"crossref","first-page":"baw032","DOI":"10.1093\/database\/baw032","article-title":"Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task","volume":"2016","author":"Wei","year":"2016","journal-title":"Database"},{"key":"2023012806495642500_bty845-B48","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1093\/nar\/gkg033","article-title":"Database resources of the national center for biotechnology","volume":"31","author":"Wheeler","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012806495642500_bty845-B49","unstructured":"WWW Consortium (2014). JSON-LD 1.0: a JSON-based serialization for linked data."},{"key":"2023012806495642500_bty845-B50","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1007\/978-3-540-68690-3_7","article-title":"Evaluation of literature-based discovery systems","volume-title":"Literature-Based Discovery","author":"Yetisgen-Yildiz","year":"2008"},{"key":"2023012806495642500_bty845-B51","doi-asserted-by":"crossref","first-page":"633","DOI":"10.1016\/j.jbi.2008.12.001","article-title":"A new evaluation methodology for literature-based discovery systems","volume":"42","author":"Yetisgen-Yildiz","year":"2009","journal-title":"J. Biomed. Inf."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/9\/1553\/48942041\/bioinformatics_35_9_1553.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/9\/1553\/48942041\/bioinformatics_35_9_1553.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T18:55:03Z","timestamp":1775242503000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/9\/1553\/5124276"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,10,9]]},"references-count":51,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2019,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty845","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,5,1]]},"published":{"date-parts":[[2018,10,9]]}}}