{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T05:15:19Z","timestamp":1774415719355,"version":"3.50.1"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"15","license":[{"start":{"date-parts":[[2018,2,27]],"date-time":"2018-02-27T00:00:00Z","timestamp":1519689600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["LM05652"],"award-info":[{"award-number":["LM05652"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["GM061374"],"award-info":[{"award-number":["GM061374"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["GM102365"],"award-info":[{"award-number":["GM102365"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Morgridge Family Stanford Interdisciplinary Graduate Fellowship"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The biomedical community\u2019s collective understanding of how chemicals, genes and phenotypes interact is distributed across the text of over 24 million research articles. These interactions offer insights into the mechanisms behind higher order biochemical phenomena, such as drug-drug interactions and variations in drug response across individuals. 
To assist their curation at scale, we must understand what relationship types are possible and map unstructured natural language descriptions onto these structured classes. We used NCBI\u2019s PubTator annotations to identify instances of chemical, gene and disease names in Medline abstracts and applied the Stanford dependency parser to find connecting dependency paths between pairs of entities in single sentences. We combined a published ensemble biclustering algorithm (EBC) with hierarchical clustering to group the dependency paths into semantically-related categories, which we annotated with labels, or \u2018themes\u2019 (\u2018inhibition\u2019 and \u2018activation\u2019, for example). We evaluated our theme assignments against six human-curated databases: DrugBank, Reactome, SIDER, the Therapeutic Target Database, OMIM and PharmGKB.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Clustering revealed 10 broad themes for chemical-gene relationships, 7 for chemical-disease, 10 for gene-disease and 9 for gene\u2013gene. In most cases, enriched themes corresponded directly to known database relationships. Our final dataset, represented as a network, contained 37\u2009491 thematically-labeled chemical-gene edges, 2\u2009021\u2009192 chemical-disease edges, 136\u2009206 gene-disease edges and 41\u2009418 gene\u2013gene edges, each representing a single-sentence description of an interaction from somewhere in the literature.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The complete network is available on Zenodo (https:\/\/zenodo.org\/record\/1035500). 
We have also provided the full set of dependency paths connecting biomedical entities in Medline abstracts, with associated sentences, for future use by the biomedical research community.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty114","type":"journal-article","created":{"date-parts":[[2018,2,27]],"date-time":"2018-02-27T04:10:30Z","timestamp":1519704630000},"page":"2614-2624","source":"Crossref","is-referenced-by-count":112,"title":["A global network of biomedical relationships derived from text"],"prefix":"10.1093","volume":"34","author":[{"given":"Bethany","family":"Percha","sequence":"first","affiliation":[{"name":"Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA"},{"name":"Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY, USA"}]},{"given":"Russ B","family":"Altman","sequence":"additional","affiliation":[{"name":"Department of Bioengineering, Stanford University, Stanford, CA, USA"},{"name":"Department of Genetics, Stanford University, Stanford, CA, USA"},{"name":"Department of Medicine, Stanford University, Stanford, CA, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,2,27]]},"reference":[{"key":"2023012713064429500_bty114-B1","first-page":"556","author":"Alex","year":"2008"},{"key":"2023012713064429500_bty114-B2","author":"Baker","year":"1998"},{"key":"2023012713064429500_bty114-B3","doi-asserted-by":"crossref","first-page":"1075","DOI":"10.1198\/jasa.2011.tm10183","article-title":"Hierarchical clustering with prototypes via minimax linkage","volume":"106","author":"Bien","year":"2011","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023012713064429500_bty114-B4","author":"Bollegala","year":"2010"},{"key":"2023012713064429500_bty114-B5","first-page":"376","author":"Buyko","year":"2012"},{"key":"2023012713064429500_bty114-B6","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1097\/00008571-200409000-00002","article-title":"Extracting and characterizing gene-drug relationships from the literature","volume":"14","author":"Chang","year":"2004","journal-title":"Pharmacogenetics"},{"key":"2023012713064429500_bty114-B7","author":"Cho","year":"2014"},{"key":"2023012713064429500_bty114-B8","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1093\/bib\/6.1.57","article-title":"A survey of current work in biomedical text mining","volume":"6","author":"Cohen","year":"2005","journal-title":"Brief. Bioinformatics"},{"key":"2023012713064429500_bty114-B9","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1016\/j.jbi.2009.02.002","article-title":"Empirical distributional semantics: methods and biomedical applications","volume":"42","author":"Cohen","year":"2009","journal-title":"J. Biomed. Inform"},{"key":"2023012713064429500_bty114-B10","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1016\/j.jbi.2010.08.005","article-title":"Using text to build semantic networks for pharmacogenomics","volume":"43","author":"Coulet","year":"2010","journal-title":"J. Biomed. 
Informatics"},{"key":"2023012713064429500_bty114-B11","first-page":"D691","article-title":"Reactome: a database of reactions, pathways and biological processes","volume":"39(suppl_1)","author":"Croft","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023012713064429500_bty114-B12","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-02151-0","volume-title":"Recognizing textual entailment: models and applications","author":"Dagan","year":"2013"},{"key":"2023012713064429500_bty114-B13","author":"De Marneffe","year":"2008"},{"key":"2023012713064429500_bty114-B14","author":"De Marneffe","year":"2008"},{"key":"2023012713064429500_bty114-B15","doi-asserted-by":"crossref","first-page":"391.","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9","article-title":"Indexing by latent semantic analysis","volume":"41","author":"Deerwester","year":"1990","journal-title":"J. Am. Soc. Inform. Sci"},{"key":"2023012713064429500_bty114-B16","author":"Dhillon","year":"2003"},{"key":"2023012713064429500_bty114-B17","first-page":"401","article-title":"Exploiting shallow linguistic information for relation extraction from biomedical literature","volume":"18","author":"Giuliano","year":"2006","journal-title":"EACL"},{"key":"2023012713064429500_bty114-B18","first-page":"D514","article-title":"Online mendelian inheritance in man (OMIM): a knowledge base of human genes and genetic disorders","volume":"33(Suppl. 
1)","author":"Hamosh","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023012713064429500_bty114-B19","author":"Hasegawa","year":"2004"},{"key":"2023012713064429500_bty114-B20","author":"Jonnalagadda","year":"2010"},{"key":"2023012713064429500_bty114-B21","author":"Kim","year":"2014"},{"key":"2023012713064429500_bty114-B22","author":"Kok","year":"2008"},{"key":"2023012713064429500_bty114-B23","doi-asserted-by":"crossref","first-page":"D1075","DOI":"10.1093\/nar\/gkv1075","article-title":"The SIDER database of drugs and side effects","volume":"44","author":"Kuhn","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012713064429500_bty114-B24","author":"Le","year":"2014"},{"key":"2023012713064429500_bty114-B25","author":"Levy","year":"2015"},{"key":"2023012713064429500_bty114-B26","author":"Lin","year":"2001"},{"key":"2023012713064429500_bty114-B27","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1016\/j.jbi.2010.07.006","article-title":"Natural language processing methods and systems for biomedical ontology learning","volume":"44","author":"Liu","year":"2011","journal-title":"J. Biomed. 
Informatics"},{"key":"2023012713064429500_bty114-B28","author":"Liu","year":"2016"},{"key":"2023012713064429500_bty114-B29","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1093\/bioinformatics\/btv476","article-title":"Large-scale extraction of gene interactions from full-text literature using DeepDive","volume":"32","author":"Mallory","year":"2015","journal-title":"Bioinformatics"},{"key":"2023012713064429500_bty114-B30","author":"Mikolov","year":"2013"},{"key":"2023012713064429500_bty114-B31","author":"Mikolov","year":"2013"},{"key":"2023012713064429500_bty114-B32","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1126\/science.298.5594.824","article-title":"Network motifs: simple building blocks of complex networks","volume":"298","author":"Milo","year":"2002","journal-title":"Science"},{"key":"2023012713064429500_bty114-B33","author":"Passos","year":"2014"},{"key":"2023012713064429500_bty114-B34","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1016\/j.tips.2013.01.006","article-title":"Informatics confronts drug\u2013drug interactions","volume":"34","author":"Percha","year":"2013","journal-title":"Trends Pharm. Sci"},{"key":"2023012713064429500_bty114-B35","doi-asserted-by":"crossref","first-page":"e1004216.","DOI":"10.1371\/journal.pcbi.1004216","article-title":"Learning the structure of biomedical relationships from unstructured text","volume":"11","author":"Percha","year":"2015","journal-title":"PLoS Comput. Biol"},{"key":"2023012713064429500_bty114-B37","author":"Riedel","year":"2013"},{"key":"2023012713064429500_bty114-B38","author":"Rosenfeld","year":"2007"},{"key":"2023012713064429500_bty114-B39","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1016\/j.jbi.2011.04.005","article-title":"Using a shallow linguistic kernel for drug-drug interaction extraction","volume":"44","author":"Segura-Bedmar","year":"2011","journal-title":"J. Biomed. 
Informatics"},{"key":"2023012713064429500_bty114-B40","author":"Shinyama","year":"2006"},{"key":"2023012713064429500_bty114-B41","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1007\/978-1-4614-3223-4_14","volume-title":"Mining Text Data","author":"Simpson","year":"2012"},{"key":"2023012713064429500_bty114-B42","doi-asserted-by":"crossref","first-page":"e1005017","DOI":"10.1371\/journal.pcbi.1005017","article-title":"Text mining genotype-phenotype relationships from biomedical literature for database curation and precision medicine","volume":"12","author":"Singhal","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023012713064429500_bty114-B43","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1086\/601720","article-title":"Undiscovered public knowledge","volume":"56","author":"Swanson","year":"1986","journal-title":"Library Quarterly"},{"key":"2023012713064429500_bty114-B44","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1353\/pbm.1986.0087","article-title":"Fish oil, Raynaud\u2019s syndrome, and undiscovered public knowledge","volume":"30","author":"Swanson","year":"1986","journal-title":"Perspectives Biol. Med"},{"key":"2023012713064429500_bty114-B45","author":"Turney","year":"2005"},{"key":"2023012713064429500_bty114-B46","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1613\/jair.2934","article-title":"From frequency to meaning: vector space models of semantics","volume":"37","author":"Turney","year":"2010","journal-title":"J. Artif. Intel. 
Res"},{"key":"2023012713064429500_bty114-B47","doi-asserted-by":"crossref","first-page":"W518","DOI":"10.1093\/nar\/gkt441","article-title":"PubTator: a web-based text mining tool for assisting biocuration","volume":"41","author":"Wei","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023012713064429500_bty114-B48","doi-asserted-by":"crossref","first-page":"D668","DOI":"10.1093\/nar\/gkj067","article-title":"DrugBank: a comprehensive resource for in silico drug discovery and exploration","volume":"34","author":"Wishart","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023012713064429500_bty114-B49","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/clpt.2012.96","article-title":"Pharmacogenomics knowledge for personalized medicine","volume":"92","author":"Whirl-Carrillo","year":"2012","journal-title":"Clin. Pharmacol. Ther"},{"key":"2023012713064429500_bty114-B50","author":"Yao","year":"2011"},{"key":"2023012713064429500_bty114-B51","doi-asserted-by":"crossref","first-page":"i331","DOI":"10.1093\/bioinformatics\/btg1046","article-title":"Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup","volume":"19(suppl_1)","author":"Yeh","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012713064429500_bty114-B52","doi-asserted-by":"crossref","first-page":"358","DOI":"10.1093\/bib\/bbm045","article-title":"Frontiers of biomedical text mining: current progress","volume":"8","author":"Zweigenbaum","year":"2007","journal-title":"Brief. 
Bioinformatics"},{"key":"2023012713064429500_bty114-B53","author":"Zhang","year":"2005"},{"key":"2023012713064429500_bty114-B54","doi-asserted-by":"crossref","first-page":"D1128","DOI":"10.1093\/nar\/gkr797","article-title":"Therapeutic target database update 2012: a resource for facilitating target-oriented drug discovery","volume":"40","author":"Zhu","year":"2011","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/15\/2614\/48935544\/bioinformatics_34_15_2614.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/15\/2614\/48935544\/bioinformatics_34_15_2614.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,1]],"date-time":"2023-09-01T09:09:23Z","timestamp":1693559363000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/15\/2614\/4911883"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,2,27]]},"references-count":53,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2018,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty114","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,8,1]]},"published":{"date-parts":[[2018,2,27]]}}}