{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T16:01:08Z","timestamp":1780761668884,"version":"3.54.1"},"reference-count":114,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2022,9,23]],"date-time":"2022-09-23T00:00:00Z","timestamp":1663891200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,11,19]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene\u2013disease prioritization. In a drug discovery KG, crucial elements including genes, diseases and drugs are represented as entities, while relationships between them indicate an interaction. However, to construct high-quality KGs, suitable data are required. In this review, we detail publicly available sources suitable for use in constructing drug discovery focused KGs. We aim to help guide machine learning and KG practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. The datasets are selected via strict criteria, categorized according to the primary type of information contained within and are considered based upon what information could be extracted to build a KG. 
We then present a comparative analysis of existing public drug discovery KGs and an evaluation of selected motivating case studies from the literature. Additionally, we raise numerous and unique challenges and issues associated with the domain and its datasets, while also highlighting key future research directions. We hope this review will motivate KGs use in solving key and emerging questions in the drug discovery domain.<\/jats:p>","DOI":"10.1093\/bib\/bbac404","type":"journal-article","created":{"date-parts":[[2022,9,24]],"date-time":"2022-09-24T05:58:18Z","timestamp":1663999098000},"source":"Crossref","is-referenced-by-count":91,"title":["A review of biomedical datasets relating to drug discovery: a knowledge graph perspective"],"prefix":"10.1093","volume":"23","author":[{"given":"Stephen","family":"Bonner","sequence":"first","affiliation":[{"name":"Data Sciences and Quantitative Biology , Discovery Sciences, R&D, , Cambridge, UK"},{"name":"AstraZeneca , Discovery Sciences, R&D, , Cambridge, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ian P","family":"Barrett","sequence":"additional","affiliation":[{"name":"Data Sciences and Quantitative Biology , Discovery Sciences, R&D, , Cambridge, UK"},{"name":"AstraZeneca , Discovery Sciences, R&D, , Cambridge, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Cheng","family":"Ye","sequence":"additional","affiliation":[{"name":"Data Sciences and Quantitative Biology , Discovery Sciences, R&D, , Cambridge, UK"},{"name":"AstraZeneca , Discovery Sciences, R&D, , Cambridge, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Rowan","family":"Swiers","sequence":"additional","affiliation":[{"name":"Data Sciences and Quantitative Biology , Discovery Sciences, R&D, , Cambridge, UK"},{"name":"AstraZeneca , Discovery Sciences, R&D, , Cambridge, 
UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[{"name":"Molecular AI , Discovery Sciences, R&D, , Gothenburg, Sweden"},{"name":"AstraZeneca , Discovery Sciences, R&D, , Gothenburg, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Andreas","family":"Bender","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics , Department of Chemistry, , UK"},{"name":"University of Cambridge , Department of Chemistry, , UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Charles Tapley","family":"Hoyt","sequence":"additional","affiliation":[{"name":"Laboratory of Systems Pharmacology, Harvard Medical School , USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"William L","family":"Hamilton","sequence":"additional","affiliation":[{"name":"School of Computer Science, McGill University , Canada"},{"name":"Mila-Quebec AI Institute , Montreal, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2022,9,23]]},"reference":[{"issue":"3","key":"2022112111112326300_ref1","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1038\/nrd.2017.244","article-title":"Impact of a five-dimensional framework on R&D productivity at AstraZeneca","volume":"17","author":"Morgan","year":"2018","journal-title":"Nat Rev Drug Discov"},{"issue":"1","key":"2022112111112326300_ref2","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/S0165-6147(00)01584-4","article-title":"In silico research in drug discovery","volume":"22","author":"Terstappen","year":"2001","journal-title":"Trends Pharmacol Sci"},{"issue":"6","key":"2022112111112326300_ref3","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1038\/s41573-019-0024-5","article-title":"Applications of machine learning in drug discovery and 
development","volume":"18","author":"Vamathevan","year":"2019","journal-title":"Nat Rev Drug Discov"},{"issue":"7873","key":"2022112111112326300_ref4","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"issue":"2","key":"2022112111112326300_ref5","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1093\/bib\/bbp002","article-title":"Semantic web for integrated network analysis in biomedicine","volume":"10","author":"Chen","year":"2009","journal-title":"Brief Bioinform"},{"issue":"2","key":"2022112111112326300_ref6","doi-asserted-by":"crossref","first-page":"566","DOI":"10.1093\/bib\/bbz017","article-title":"Network-based methods for predicting essential genes or proteins: a survey","volume":"21","author":"Li","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref7","first-page":"06","article-title":"Network approaches for modeling the effect of drugs and diseases","author":"Rintala","year":"2022","journal-title":"Brief Bioinform"},{"issue":"4","key":"2022112111112326300_ref8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3447772","article-title":"Knowledge graphs","volume":"54","author":"Hogan","year":"2021","journal-title":"ACM Computing Surveys (CSUR)"},{"issue":"9","key":"2022112111112326300_ref9","doi-asserted-by":"crossref","first-page":"1338","DOI":"10.1093\/bioinformatics\/btt765","article-title":"The EBI RDF platform: linked open data for the life sciences","volume":"30","author":"Jupp","year":"2014","journal-title":"Bioinformatics"},{"key":"2022112111112326300_ref10","first-page":"6","article-title":"Exploring the Social Drivers of Health During a Pandemic: Leveraging Knowledge Graphs and Population Trends in COVID-19","volume":"275","author":"Bettencourt-Silva","year":"2020","journal-title":"Stud Health Technol 
Inform"},{"issue":"1","key":"2022112111112326300_ref11","doi-asserted-by":"crossref","DOI":"10.1136\/bmjhci-2020-100254","article-title":"Network graph representation of COVID-19 scientific publications to aid knowledge discovery","volume":"28","author":"Cernile","year":"2020","journal-title":"BMJ Health & Care Informatics"},{"key":"2022112111112326300_ref12","first-page":"09","article-title":"COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology","volume":"37","author":"Domingo-Fernandez","year":"2020","journal-title":"Bioinformatics"},{"key":"2022112111112326300_ref13","article-title":"DRKG - Drug Repurposing Knowledge Graph for Covid-19","author":"Ioannidis","year":"2020"},{"key":"2022112111112326300_ref14","article-title":"KG-COVID-19: a framework to produce customized knowledge graphs for COVID-19 response","volume":"2","author":"Reese","year":"2020","journal-title":"Patterns"},{"key":"2022112111112326300_ref15","first-page":"1","volume-title":"Proceedings of Knowledgeable NLP: the First Workshop on Integrating Structured Knowledge and Neural Networks for NLP","author":"Wise","year":"2020"},{"key":"2022112111112326300_ref16","doi-asserted-by":"crossref","first-page":"05","DOI":"10.1093\/bib\/bbab159","article-title":"Utilizing graph machine learning within drug discovery and development","volume":"22","author":"Gaudelet","year":"2021","journal-title":"Brief Bioinform"},{"issue":"D1","key":"2022112111112326300_ref17","doi-asserted-by":"crossref","first-page":"D1","DOI":"10.1093\/nar\/gkz1161","article-title":"The 27th annual Nucleic Acids Research database issue and molecular biology database collection","volume":"48","author":"Rigden","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2022112111112326300_ref18","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.26726","article-title":"Systematic integration of biomedical knowledge prioritizes drugs for 
repurposing","volume":"6","author":"Himmelstein","year":"2017","journal-title":"Elife"},{"issue":"13","key":"2022112111112326300_ref19","doi-asserted-by":"crossref","first-page":"i457","DOI":"10.1093\/bioinformatics\/bty294","article-title":"Modeling polypharmacy side effects with graph convolutional networks","volume":"34","author":"Zitnik","year":"2018","journal-title":"Bioinformatics"},{"key":"2022112111112326300_ref20","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1145\/3292500.3330961","volume-title":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Zhang","year":"2019"},{"issue":"3","key":"2022112111112326300_ref21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-3-031-01588-5","article-title":"Graph representation learning","volume":"14","author":"Hamilton","year":"2020","journal-title":"Synthesis Lectures on Artificial Intelligence and Machine Learning"},{"key":"2022112111112326300_ref22","doi-asserted-by":"crossref","first-page":"1381","DOI":"10.3389\/fgene.2019.01381","article-title":"Heterogeneous Multi-Layered Network Model for Omics Data Integration and Analysis","volume":"10","author":"Lee","year":"2020","journal-title":"Front Genet"},{"key":"2022112111112326300_ref23","article-title":"Exploration of databases and methods supporting drug repurposing: a comprehensive survey","volume":"22","author":"Tanoli","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref24","article-title":"Biomedical data and computational models for drug repositioning: a comprehensive review","volume":"22","author":"Luo","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref25","doi-asserted-by":"crossref","DOI":"10.1177\/1460458220937101","article-title":"Knowledge-driven drug repurposing using a comprehensive drug knowledge graph","volume":"26","author":"Zhu","year":"2020","journal-title":"Health Informatics 
J"},{"issue":"2","key":"2022112111112326300_ref26","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1016\/j.ygeno.2019.06.021","article-title":"Drug databases and their contributions to drug repurposing","volume":"112","author":"Masoudi-Sobhanzadeh","year":"2020","journal-title":"Genomics"},{"key":"2022112111112326300_ref27","article-title":"Machine learning approaches and databases for prediction of drug\u2013target interaction: a survey paper","volume":"22","author":"Bagherian","year":"2020","journal-title":"Brief Bioinform"},{"issue":"9","key":"2022112111112326300_ref28","doi-asserted-by":"crossref","first-page":"2208","DOI":"10.3390\/molecules23092208","article-title":"Machine learning for drug-target interaction prediction","volume":"23","author":"Chen","year":"2018","journal-title":"Molecules"},{"issue":"1","key":"2022112111112326300_ref29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-3284-5","article-title":"Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings","volume":"20","author":"Celebi","year":"2019","journal-title":"BMC bioinformatics"},{"issue":"5","key":"2022112111112326300_ref30","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1016\/j.jbi.2008.03.004","article-title":"Bio2RDF: towards a mashup to build bioinformatics knowledge systems","volume":"41","author":"Belleau","year":"2008","journal-title":"J Biomed Inform"},{"issue":"4","key":"2022112111112326300_ref31","doi-asserted-by":"crossref","first-page":"1308","DOI":"10.1093\/bib\/bbx169","article-title":"Drug knowledge bases and their applications in biomedical informatics research","volume":"20","author":"Zhu","year":"2019","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref32","article-title":"Biological applications of knowledge graph embedding models","volume":"22","author":"Mohamed","year":"2020","journal-title":"Brief 
Bioinform"},{"key":"2022112111112326300_ref33","article-title":"Knowledge-Based Biomedical Data Science. Annual Review of Biomedical Data","volume":"3","author":"Callahan","year":"2020","journal-title":"Science"},{"issue":"5","key":"2022112111112326300_ref34","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1038\/nrd.2018.14","article-title":"Unexplored therapeutic opportunities in the human genome","volume":"17","author":"Oprea","year":"2018","journal-title":"Nat Rev Drug Discov"},{"issue":"4","key":"2022112111112326300_ref35","doi-asserted-by":"crossref","first-page":"1645","DOI":"10.1021\/acs.jcim.8b00663","article-title":"Evaluation of Cross-Validation Strategies in Sequence-Based Binding Prediction Using Deep Learning","volume":"59","author":"Lopez-Del Rio","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2022112111112326300_ref36","article-title":"On the Ambiguity of Rank-Based Evaluation of Entity Alignment or Link Prediction Methods","author":"Berrendorf","year":"2020"},{"issue":"1","key":"2022112111112326300_ref37","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nrg2918","article-title":"Network medicine: a network-based approach to human disease","volume":"12","author":"Barab\u00e1si","year":"2011","journal-title":"Nat Rev Genet"},{"issue":"9","key":"2022112111112326300_ref38","doi-asserted-by":"crossref","first-page":"843","DOI":"10.1038\/s41592-019-0509-5","article-title":"Assessment of network module identification across complex diseases","volume":"16","author":"Choobdar","year":"2019","journal-title":"Nat Methods"},{"key":"2022112111112326300_ref39","volume-title":"An NIH white paper by the QSP workshop group","author":"Sorger","year":"2011"},{"issue":"21","key":"2022112111112326300_ref40","article-title":"Ontologies for molecular biology. 
Computer and Information","volume":"6","author":"en Schulze-Kremer S","year":"2001","journal-title":"Science"},{"issue":"1","key":"2022112111112326300_ref41","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1093\/bib\/bbm059","article-title":"Biomedical ontologies: a functional perspective","volume":"9","author":"Rubin","year":"2008","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref42","article-title":"Mondo: Unifying diseases for the world, by the world","author":"Vasilevsky","year":"2022","journal-title":"medRxiv"},{"issue":"3","key":"2022112111112326300_ref43","first-page":"265","article-title":"Medical subject headings (MeSH)","volume":"88","author":"Lipscomb","year":"2000","journal-title":"Bull Med Libr Assoc"},{"issue":"5","key":"2022112111112326300_ref44","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.ajhg.2008.09.017","article-title":"The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease","volume":"83","author":"Robinson","year":"2008","journal-title":"The American Journal of Human Genetics"},{"issue":"D1","key":"2022112111112326300_ref45","doi-asserted-by":"crossref","first-page":"D955","DOI":"10.1093\/nar\/gky1032","article-title":"Human Disease Ontology 2018 update: classification, content and workflow expansion","volume":"47","author":"Schriml","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref46","doi-asserted-by":"crossref","first-page":"D258","DOI":"10.1093\/nar\/gkh036","article-title":"The Gene Ontology (GO) database and informatics resource","volume":"32","author":"Consortium GO","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"8","key":"2022112111112326300_ref47","doi-asserted-by":"crossref","first-page":"1112","DOI":"10.1093\/bioinformatics\/btq099","article-title":"Modeling sample variables with an Experimental Factor 
Ontology","volume":"26","author":"Malone","year":"2010","journal-title":"Bioinformatics"},{"key":"2022112111112326300_ref48","doi-asserted-by":"crossref","DOI":"10.12688\/f1000research.9656.1","article-title":"Identifying ELIXIR core data resources","volume":"5","author":"Durinx","year":"2016","journal-title":"F1000Research"},{"issue":"D1","key":"2022112111112326300_ref49","doi-asserted-by":"crossref","first-page":"D985","DOI":"10.1093\/nar\/gkw1055","article-title":"Open Targets: a platform for therapeutic target identification and validation","volume":"45","author":"Koscielny","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref50","doi-asserted-by":"crossref","first-page":"D1056","DOI":"10.1093\/nar\/gky1133","article-title":"Open Targets Platform: new developments and updates two years on","volume":"47","author":"Carvalho-Silva","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref51","doi-asserted-by":"crossref","first-page":"D995","DOI":"10.1093\/nar\/gkw1072","article-title":"Pharos: Collating protein information to shed light on the druggable genome","volume":"45","author":"Nguyen","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref52","doi-asserted-by":"crossref","first-page":"D115","DOI":"10.1093\/nar\/gkh131","article-title":"UniProt: the universal protein knowledgebase","volume":"32","author":"Apweiler","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref53","first-page":"D682","article-title":"Ensembl 2020","volume":"48","author":"Yates","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref54","doi-asserted-by":"crossref","first-page":"D1250","DOI":"10.1093\/nar\/gky1206","article-title":"RNAcentral: a hub of information for non-coding RNA sequences","volume":"47","author":"Sweeney","year":"2019","journal-title":"Nucleic Acids 
Res"},{"issue":"suppl_1","key":"2022112111112326300_ref55","first-page":"D54","article-title":"Entrez Gene: gene-centered information at NCBI","volume":"33","author":"Maglott","year":"2005","journal-title":"Nucleic Acids Res"},{"issue":"15","key":"2022112111112326300_ref56","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci"},{"issue":"D1","key":"2022112111112326300_ref57","doi-asserted-by":"crossref","first-page":"D607","DOI":"10.1093\/nar\/gky1131","article-title":"STRING v11: protein\u2013protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets","volume":"47","author":"Szklarczyk","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref58","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref59","doi-asserted-by":"crossref","first-page":"D452","DOI":"10.1093\/nar\/gkh052","article-title":"IntAct: an open source molecular interaction database","volume":"32","author":"Hermjakob","year":"2004","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"2022112111112326300_ref60","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1038\/nmeth.4077","article-title":"OmniPath: guidelines and gateway for literature-curated signaling pathway resources","volume":"13","author":"T\u00fcrei","year":"2016","journal-title":"Nat Methods"},{"key":"2022112111112326300_ref61","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.3389\/fgene.2019.01203","article-title":"The impact of pathway database choice on statistical 
enrichment analysis and predictive modeling","volume":"10","author":"Mubeen","year":"2019","journal-title":"Front Genet"},{"issue":"D1","key":"2022112111112326300_ref62","first-page":"D498","article-title":"The reactome pathway knowledgebase","volume":"48","author":"Jassal","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref63","doi-asserted-by":"crossref","first-page":"D661","DOI":"10.1093\/nar\/gkx1064","article-title":"WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research","volume":"46","author":"Slenter","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref64","doi-asserted-by":"crossref","first-page":"D353","DOI":"10.1093\/nar\/gkw1092","article-title":"KEGG: new perspectives on genomes, pathways, diseases and drugs","volume":"45","author":"Kanehisa","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref65","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkm882","article-title":"KEGG for linking genomes to life and the environment","volume":"36","author":"Kanehisa","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2022112111112326300_ref66","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1016\/j.ymeth.2014.11.020","article-title":"DISEASES: Text mining and data integration of disease\u2013gene associations","volume":"74","author":"Pletscher-Frankild","year":"2015","journal-title":"Methods"},{"key":"2022112111112326300_ref67","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bav028","article-title":"DisGeNET: a discovery platform for the dynamical exploration of human diseases and their 
genes","volume":"2015","author":"Pi\u00f1ero","year":"2015","journal-title":"Database"},{"issue":"1","key":"2022112111112326300_ref68","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/(SICI)1098-1004(200001)15:1<57::AID-HUMU12>3.0.CO;2-G","article-title":"Online Mendelian inheritance in man (OMIM)","volume":"15","author":"Hamosh","year":"2000","journal-title":"Hum Mutat"},{"issue":"D1","key":"2022112111112326300_ref69","doi-asserted-by":"crossref","first-page":"D1005","DOI":"10.1093\/nar\/gky1120","article-title":"The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019","volume":"47","author":"Buniello","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref70","doi-asserted-by":"crossref","first-page":"D930","DOI":"10.1093\/nar\/gky1075","article-title":"ChEMBL: towards direct deposition of bioassay data","volume":"47","author":"Mendez","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref71","doi-asserted-by":"crossref","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","article-title":"PubChem substance and compound databases","volume":"44","author":"Kim","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref72","doi-asserted-by":"crossref","first-page":"D901","DOI":"10.1093\/nar\/gkm958","article-title":"DrugBank: a knowledgebase for drugs, drug actions and drug targets","volume":"36","author":"Wishart","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2022112111112326300_ref73","first-page":"gkw993","article-title":"DrugCentral: online drug compendium","volume":"45","author":"Ursu","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"8","key":"2022112111112326300_ref74","doi-asserted-by":"crossref","first-page":"719","DOI":"10.2174\/1386207013330670","article-title":"BindingDB: a web-accessible molecular recognition 
database","volume":"4","author":"Chen","year":"2001","journal-title":"Comb Chem High Throughput Screen"},{"issue":"1","key":"2022112111112326300_ref75","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2017.29","article-title":"A standard database for drug repositioning","volume":"4","author":"Brown","year":"2017","journal-title":"Scientific data"},{"issue":"19","key":"2022112111112326300_ref76","first-page":"83","article-title":"Convolutional neural network based on SMILES representation of compounds for detecting chemical motif","volume":"19","author":"Hirohara","year":"2018","journal-title":"BMC bioinformatics"},{"issue":"8","key":"2022112111112326300_ref77","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1109\/TPAMI.2013.50","article-title":"Representation learning: A review and new perspectives","volume":"35","author":"Bengio","year":"2013","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2022112111112326300_ref78","first-page":"3111","article-title":"Distributed Representations of Words and Phrases and their Compositionality","volume":"26","author":"Mikolov","year":"2013","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2022112111112326300_ref79","doi-asserted-by":"crossref","first-page":"3173","DOI":"10.1145\/3340531.3412776","volume-title":"Proceedings of the 29th ACM International Conference on Information & Knowledge Management","author":"Walsh","year":"2020"},{"key":"2022112111112326300_ref80","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbaa344","article-title":"PharmKG: a dedicated knowledge graph benchmark for bomedical data mining","volume":"22","author":"Zheng","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022112111112326300_ref81","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btaa274","article-title":"OpenBioLink: A benchmarking framework for large-scale biomedical link 
prediction","volume":"36","author":"Breit","year":"2020","journal-title":"Bioinformatics"},{"key":"2022112111112326300_ref82","first-page":"1","article-title":"A knowledge graph to interpret clinical proteomics data","volume":"45","author":"Santos","year":"2022","journal-title":"Nat Biotechnol"},{"issue":"24","key":"2022112111112326300_ref83","doi-asserted-by":"crossref","first-page":"3107","DOI":"10.1093\/bioinformatics\/btt549","article-title":"Are graph databases ready for bioinformatics?","volume":"29","author":"Have","year":"2013","journal-title":"Bioinformatics"},{"issue":"D1","key":"2022112111112326300_ref84","doi-asserted-by":"crossref","first-page":"D1074","DOI":"10.1093\/nar\/gkx1037","article-title":"DrugBank 5.0: a major update to the DrugBank database for 2018","volume":"46","author":"Wishart","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref85","first-page":"D845","article-title":"The DisGeNET knowledge platform for disease genomics: 2019 update","volume":"48","author":"Pi\u00f1ero","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref86","doi-asserted-by":"crossref","first-page":"D440","DOI":"10.1093\/nar\/gkm883","article-title":"The gene ontology project in 2008","volume":"36","author":"Consortium GO","year":"2008","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2022112111112326300_ref87","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-11069-0","article-title":"Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings","volume":"10","author":"Nelson","year":"2019","journal-title":"Nat Commun"},{"key":"2022112111112326300_ref88","article-title":"Few-shot link prediction via graph neural networks for Covid-19 
drug-repurposing","author":"Ioannidis","year":"2020"},{"key":"2022112111112326300_ref89","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1145\/3437963.3441663","volume-title":"Proceedings of the 14th ACM International Conference on Web Search and Data Mining","author":"Zheng","year":"2021"},{"issue":"15","key":"2022112111112326300_ref90","doi-asserted-by":"crossref","first-page":"2614","DOI":"10.1093\/bioinformatics\/bty114","article-title":"A global network of biomedical relationships derived from text","volume":"34","author":"Percha","year":"2018","journal-title":"Bioinformatics"},{"issue":"4","key":"2022112111112326300_ref91","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/clpt.2012.96","article-title":"Pharmacogenomics knowledge for personalized medicine","volume":"92","author":"Whirl-Carrillo","year":"2012","journal-title":"Clinical Pharmacology & Therapeutics"},{"issue":"1","key":"2022112111112326300_ref92","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1093\/nar\/30.1.412","article-title":"TTD: therapeutic target database","volume":"30","author":"Chen","year":"2002","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref93","doi-asserted-by":"crossref","first-page":"D1075","DOI":"10.1093\/nar\/gkv1075","article-title":"The SIDER database of drugs and side effects","volume":"44","author":"Kuhn","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref94","doi-asserted-by":"crossref","first-page":"D573","DOI":"10.1093\/nar\/gky1126","article-title":"HumanNet v2: human gene networks for disease research","volume":"47","author":"Hwang","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"2022112111112326300_ref95","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text 
mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"issue":"D1","key":"2022112111112326300_ref96","doi-asserted-by":"crossref","first-page":"D948","DOI":"10.1093\/nar\/gky868","article-title":"The comparative toxicogenomics database: update 2019","volume":"47","author":"Davis","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref97","doi-asserted-by":"crossref","first-page":"D1018","DOI":"10.1093\/nar\/gky1105","article-title":"Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources","volume":"47","author":"K\u00f6hler","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2022112111112326300_ref98","doi-asserted-by":"crossref","first-page":"D355","DOI":"10.1093\/nar\/gkp896","article-title":"KEGG for representation and analysis of molecular networks involving diseases and drugs","volume":"38","author":"Kanehisa","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2022112111112326300_ref99","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1109\/ICDE.2019.00061","volume-title":"2019 IEEE 35th International Conference on Data Engineering (ICDE)","author":"Zhang","year":"2019"},{"issue":"D1","key":"2022112111112326300_ref100","doi-asserted-by":"crossref","first-page":"D512","DOI":"10.1093\/nar\/gku1267","article-title":"PhosphoSitePlus, 2014: Mutations, PTMs and recalibrations","volume":"43","author":"Hornbeck","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"2022112111112326300_ref101","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1093\/bioinformatics\/btz600","article-title":"Discovering protein drug targets using knowledge graph embeddings","volume":"36","author":"Mohamed","year":"2020","journal-title":"Bioinformatics"},{"issue":"1","key":"2022112111112326300_ref102","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-020-74922-z","article-title":"Preclinical validation of 
therapeutic targets predicted by tensor factorization on heterogeneous graphs","volume":"10","author":"Paliwal","year":"2020","journal-title":"Sci Rep"},{"issue":"D1","key":"2022112111112326300_ref103","doi-asserted-by":"crossref","first-page":"D529","DOI":"10.1093\/nar\/gky1079","article-title":"The BioGRID interaction database: 2019 update","volume":"47","author":"Oughtred","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2022112111112326300_ref104","doi-asserted-by":"crossref","first-page":"D380","DOI":"10.1093\/nar\/gkv1277","article-title":"STITCH 5: augmenting protein\u2013chemical interaction networks with tissue and affinity data","volume":"44","author":"Szklarczyk","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"125","key":"2022112111112326300_ref105","doi-asserted-by":"crossref","first-page":"125ra31","DOI":"10.1126\/scitranslmed.3003377","article-title":"Data-driven prediction of drug effects and interactions","volume":"4","author":"Tatonetti","year":"2012","journal-title":"Sci Transl Med"},{"key":"2022112111112326300_ref106","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1007\/978-3-319-93417-4_38","volume-title":"European Semantic Web Conference","author":"Schlichtkrull","year":"2018"},{"key":"2022112111112326300_ref107","volume-title":"International Conference on Machine Learning (ICML)","author":"Trouillon","year":"2016"},{"key":"2022112111112326300_ref108","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1109\/CVPR.2009.5206848","volume-title":"2009 IEEE conference on computer vision and pattern recognition","author":"Deng","year":"2009"},{"issue":"1","key":"2022112111112326300_ref109","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Scientific 
data"},{"key":"2022112111112326300_ref110","doi-asserted-by":"crossref","first-page":"57","DOI":"10.18653\/v1\/W15-4007","volume-title":"Proceedings of the 3rd workshop on continuous vector space models and their compositionality","author":"Toutanova","year":"2015"},{"issue":"2","key":"2022112111112326300_ref111","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3434185","article-title":"A troubling analysis of reproducibility and progress in recommender systems research","volume":"39","author":"Dacrema","year":"2021","journal-title":"ACM Transactions on Information Systems (TOIS)"},{"key":"2022112111112326300_ref112","article-title":"A fair comparison of graph neural networks for graph classification","author":"Errica","year":"2019"},{"issue":"1","key":"2022112111112326300_ref113","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1145\/3317287.3328534","article-title":"Troubling Trends in Machine Learning Scholarship: Some ML papers suffer from flaws that could mislead the public and stymie future research","volume":"17","author":"Lipton","year":"2019","journal-title":"Queue"},{"key":"2022112111112326300_ref114","article-title":"Bringing light into the dark: A large-scale evaluation of knowledge graph embedding models under a unified framework","author":"Ali","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/6\/bbac404\/47144248\/bbac404.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/6\/bbac404\/47144248\/bbac404.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,2]],"date-time":"2024-10-02T13:19:35Z","timestamp":1727875175000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac404\/6712301"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,23]]},"references-count":114,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,11,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac404","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,11]]},"published":{"date-parts":[[2022,9,23]]},"article-number":"bbac404"}}