{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T17:33:41Z","timestamp":1761672821719,"version":"build-2065373602"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2025,10,7]],"date-time":"2025-10-07T00:00:00Z","timestamp":1759795200000},"content-version":"vor","delay-in-days":6,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,10,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Knowledge Graph Completion has been increasingly adopted as a useful method for helping address several tasks in biomedical research, such as drug repurposing or drug\u2013target identification. To that end, a variety of datasets and Knowledge Graph Embedding models have been proposed over the years. However, little is known about the properties that render a dataset, and associated modelling choices, useful for a given task. Moreover, even though theoretical properties of Knowledge Graph Embedding models are well understood, their practical utility in this field remains controversial.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this work, we conduct a comprehensive investigation into the topological properties of publicly available biomedical Knowledge Graphs and establish links to the accuracy observed in real-world tasks. By releasing all model predictions and a new suite of analysis tools we invite the community to build upon our work and continue improving the understanding of these crucial applications.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The code used to perform experiments and analyze results in this article as well as all experimental data is available at https:\/\/github.com\/graphcore-research\/kg-topology-toolbox\/tree\/main\/the_role_of_graph_topology_paper and archived on Zenodo, at https:\/\/doi.org\/10.5281\/zenodo.12097376.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf547","type":"journal-article","created":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T11:51:13Z","timestamp":1759578673000},"source":"Crossref","is-referenced-by-count":0,"title":["The role of graph topology in the performance of biomedical Knowledge Graph Completion models"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9743-2173","authenticated-orcid":false,"given":"Alberto","family":"Cattaneo","sequence":"first","affiliation":[{"name":"Graphcore Graphcore Research, , Bristol, BS1 2PH,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6008-358X","authenticated-orcid":false,"given":"Stephen","family":"Bonner","sequence":"additional","affiliation":[{"name":"AstraZeneca Data Sciences and Quantitative Biology, Discovery Sciences, R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Thomas","family":"Martynec","sequence":"additional","affiliation":[{"name":"AstraZeneca Data Sciences and Quantitative Biology, Discovery Sciences, R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Edward","family":"Morrissey","sequence":"additional","affiliation":[{"name":"AstraZeneca Data Sciences and Quantitative Biology, Discovery Sciences, R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Carlo","family":"Luschi","sequence":"additional","affiliation":[{"name":"Graphcore Graphcore Research, , Bristol, BS1 2PH,","place":["United Kingdom"]}]},{"given":"Ian P","family":"Barrett","sequence":"additional","affiliation":[{"name":"AstraZeneca Data Sciences and Quantitative Biology, Discovery Sciences, R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9731-5807","authenticated-orcid":false,"given":"Daniel","family":"Justus","sequence":"additional","affiliation":[{"name":"Graphcore Graphcore Research, , Bristol, BS1 2PH,","place":["United Kingdom"]}]}],"member":"286","published-online":{"date-parts":[[2025,10,7]]},"reference":[{"key":"2025102813170343000_btaf547-B1","doi-asserted-by":"crossref","first-page":"8825","DOI":"10.1109\/TPAMI.2021.3124805","article-title":"Bringing light into the dark: a large-scale evaluation of knowledge graph embedding models under a unified framework","volume":"44","author":"Ali","year":"2022","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2025102813170343000_btaf547-B2","doi-asserted-by":"crossref","first-page":"3570","DOI":"10.1038\/s41467-023-39301-y","article-title":"Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers","volume":"14","author":"Bang","year":"2023","journal-title":"Nat Commun"},{"key":"2025102813170343000_btaf547-B3","doi-asserted-by":"crossref","first-page":"bbac404","DOI":"10.1093\/bib\/bbac404","article-title":"A review of biomedical datasets relating to drug discovery: a knowledge graph perspective","volume":"23","author":"Bonner","year":"2022","journal-title":"Brief Bioinform"},{"key":"2025102813170343000_btaf547-B4","doi-asserted-by":"crossref","first-page":"bbac279","DOI":"10.1093\/bib\/bbac279","article-title":"Implications of topological imbalance for representation learning on biomedical knowledge graphs","volume":"23","author":"Bonner","year":"2022","journal-title":"Brief Bioinform"},{"first-page":"2787","year":"2013","author":"Bordes","key":"2025102813170343000_btaf547-B5"},{"key":"2025102813170343000_btaf547-B6","doi-asserted-by":"crossref","first-page":"4097","DOI":"10.1093\/bioinformatics\/btaa274","article-title":"OpenBioLink: a benchmarking framework for large-scale biomedical link prediction","volume":"36","author":"Breit","year":"2020","journal-title":"Bioinformatics"},{"key":"2025102813170343000_btaf547-B7","doi-asserted-by":"crossref","first-page":"6894","DOI":"10.1609\/aaai.v35i8.16850","article-title":"Dual quaternion embeddings for link prediction","volume":"35","author":"Cao","year":"2021","journal-title":"AAAI"},{"key":"2025102813170343000_btaf547-B8","unstructured":"Cattaneo A, Justus D, Bonner S \u00a0et al \u00a0Link-Prediction on Biomedical Knowledge Graphs [Data set]. \u00a0Zenodo. \u00a010.5281\/zenodo.12097377. 2024."},{"key":"2025102813170343000_btaf547-B9","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1038\/s41597-023-01960-3","article-title":"Building a knowledge graph to enable precision medicine","volume":"10","author":"Chandak","year":"2023","journal-title":"Sci Data"},{"first-page":"4360","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Chao","key":"2025102813170343000_btaf547-B10"},{"first-page":"1811","year":"2018","author":"Dettmers","key":"2025102813170343000_btaf547-B11"},{"first-page":"587","volume-title":"Complex Networks & Their Applications X: Volume 2, Proceedings of the Tenth International Conference on Complex Networks and Their Applications","author":"Douglas","key":"2025102813170343000_btaf547-B12"},{"key":"2025102813170343000_btaf547-B13","doi-asserted-by":"crossref","first-page":"vbae097","DOI":"10.1093\/bioadv\/vbae097","article-title":"Knowledge graph embeddings in the biomedical domain: are they useful? A look at link prediction, rule learning, and downstream polypharmacy tasks","volume":"4","author":"Gema","year":"2024","journal-title":"Bioinform Adv"},{"key":"2025102813170343000_btaf547-B14","doi-asserted-by":"crossref","first-page":"e26726","DOI":"10.7554\/eLife.26726","article-title":"Systematic integration of biomedical knowledge prioritizes drugs for repurposing","volume":"6","author":"Himmelstein","year":"2017","journal-title":"Elife"},{"key":"2025102813170343000_btaf547-B15","first-page":"22118","article-title":"Open graph benchmark: datasets for machine learning on graphs","author":"Hu","year":"2020"},{"first-page":"290","volume-title":"The Semantic Web \u2013 ISWC 2023","author":"Jin","key":"2025102813170343000_btaf547-B16"},{"key":"2025102813170343000_btaf547-B17","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1038\/s41597-022-01510-3","article-title":"The heterogeneous pharmacological medical biochemical network PharMeBINet\u201d","volume":"9","author":"K\u00f6nigs","year":"2022","journal-title":"Sci Data"},{"key":"2025102813170343000_btaf547-B18","doi-asserted-by":"crossref","first-page":"1057","DOI":"10.1080\/17460441.2021.1910673","article-title":"Knowledge graphs and their applications in drug discovery","volume":"16","author":"MacLean","year":"2021","journal-title":"Expert Opin Drug Discov"},{"year":"2020","author":"Mohamed","key":"2025102813170343000_btaf547-B19"},{"key":"2025102813170343000_btaf547-B20","doi-asserted-by":"crossref","first-page":"18250","DOI":"10.1038\/s41598-020-74922-z","article-title":"\u201cPreclinical validation of therapeutic targets predicted by tensor factorization on heterogeneous graphs\u201d","volume":"10","author":"Paliwal","year":"2020","journal-title":"Sci Rep"},{"key":"2025102813170343000_btaf547-B21","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1186\/s12859-022-04608-y","article-title":"Task-driven knowledge graph filtering improves prioritizing drugs for repurposing","volume":"23","author":"Ratajczak","year":"2022","journal-title":"BMC Bioinformatics"},{"key":"2025102813170343000_btaf547-B22","first-page":"1","article-title":"Knowledge graph embedding for link prediction: a comparative analysis","volume":"15","author":"Rossi","year":"2021","journal-title":"ACM Trans Knowl Discov Data"},{"author":"Rossi","key":"2025102813170343000_btaf547-B23"},{"author":"Sun","key":"2025102813170343000_btaf547-B24"},{"year":"2023","author":"Teneva","key":"2025102813170343000_btaf547-B25"},{"first-page":"57","year":"2015","author":"Toutanova","key":"2025102813170343000_btaf547-B26"},{"key":"2025102813170343000_btaf547-B27","doi-asserted-by":"crossref","first-page":"1112","DOI":"10.1609\/aaai.v28i1.8870","article-title":"Knowledge graph embedding by translating on hyperplanes","volume":"28","author":"Wang","year":"2014","journal-title":"AAAI"},{"year":"2015","author":"Yang","key":"2025102813170343000_btaf547-B28"},{"year":"2022","author":"Yu","key":"2025102813170343000_btaf547-B29"},{"key":"2025102813170343000_btaf547-B30","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.sbi.2021.09.003","article-title":"Toward better drug discovery with knowledge graph","volume":"72","author":"Zeng","year":"2022","journal-title":"Curr Opin Struct Biol"},{"key":"2025102813170343000_btaf547-B31","doi-asserted-by":"crossref","first-page":"bbaa344","DOI":"10.1093\/bib\/bbaa344","article-title":"PharmKG: a dedicated knowledge graph benchmark for bomedical data mining","volume":"22","author":"Zheng","year":"2021","journal-title":"Brief Bioinform"},{"key":"2025102813170343000_btaf547-B32","doi-asserted-by":"crossref","first-page":"giae001","DOI":"10.1093\/gigascience\/giae001","article-title":"The probability of edge existence due to node degree: a baseline for network-based predictions","volume":"13","author":"Zietz","year":"2024","journal-title":"Gigascience"},{"key":"2025102813170343000_btaf547-B33","doi-asserted-by":"crossref","first-page":"i457","DOI":"10.1093\/bioinformatics\/bty294","article-title":"Modeling polypharmacy side effects with graph convolutional networks","volume":"34","author":"Zitnik","year":"2018","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf547\/64539456\/btaf547.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf547\/64539456\/btaf547.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/10\/btaf547\/64539456\/btaf547.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T17:17:14Z","timestamp":1761671834000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf547\/8276990"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,10]]},"references-count":33,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf547","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,10]]},"published":{"date-parts":[[2025,10]]},"article-number":"btaf547"}}