{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T06:42:13Z","timestamp":1772001733603,"version":"3.50.1"},"reference-count":57,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T00:00:00Z","timestamp":1729900800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004052","name":"King Abdullah University of Science and Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004052","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Identifying causal relations between diseases allows for the study of shared pathways, biological mechanisms, and inter-disease risks. Such causal relations can facilitate the identification of potential disease precursors and candidates for drug re-purposing. However, computational methods often lack access to these causal relations. Few approaches have been developed to automatically extract causal relationships between diseases from unstructured text, but they are often only focused on a small number of diseases, lack validation of the extracted causal relations, or do not make their data available.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We automatically mined statements asserting a causal relation between diseases from the scientific literature by leveraging lexical patterns. Following automated mining of causal relations, we mapped the diseases to the International Classification of Diseases (ICD) identifiers to allow the direct application to clinical data. We provide quantitative and qualitative measures to evaluate the mined causal relations and compare to UK Biobank diagnosis data as a completely independent data source. The validated causal associations were used to create a directed acyclic graph that can be used by causal inference frameworks. We demonstrate the utility of our causal network by performing causal inference using the do-calculus, using relations within the graph to construct and improve polygenic risk scores, and disentangle the pleiotropic effects of variants.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The data are available through https:\/\/github.com\/bio-ontology-research-group\/causal-relations-between-diseases.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae639","type":"journal-article","created":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T16:43:32Z","timestamp":1729961012000},"source":"Crossref","is-referenced-by-count":2,"title":["Causal relationships between diseases mined from the literature improve the use of polygenic risk scores"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4746-4649","authenticated-orcid":false,"given":"Sumyyah","family":"Toonsi","sequence":"first","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7067-7012","authenticated-orcid":false,"given":"Iris Ivy","family":"Gauran","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7020-8091","authenticated-orcid":false,"given":"Hernando","family":"Ombao","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5111-7263","authenticated-orcid":false,"given":"Paul N","family":"Schofield","sequence":"additional","affiliation":[{"name":"Department of Physiology, Development & Neuroscience, University of Cambridge , Cambridge CB2 3EG,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8149-5890","authenticated-orcid":false,"given":"Robert","family":"Hoehndorf","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]},{"name":"SDAIA\u2013KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]},{"name":"KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology , Thuwal 23955,","place":["Saudi Arabia"]},{"name":"KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, Thuwal 23955 ,","place":["Saudi Arabia"]}]}],"member":"286","published-online":{"date-parts":[[2024,10,26]]},"reference":[{"key":"2024111406105538400_btae639-B1","doi-asserted-by":"crossref","first-page":"D789","DOI":"10.1093\/nar\/gku1205","article-title":"Omim.org: online mendelian inheritance in man (omim\u00ae), an online catalog of human genes and genetic disorders","volume":"43","author":"Amberger","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B2","first-page":"295","author":"Arsenyan"},{"key":"2024111406105538400_btae639-B3","doi-asserted-by":"crossref","first-page":"i437","DOI":"10.1093\/bioinformatics\/btw439","article-title":"Causality modeling for directed disease network","volume":"32","author":"Bang","year":"2016","journal-title":"Bioinformatics"},{"key":"2024111406105538400_btae639-B4","doi-asserted-by":"crossref","first-page":"D1305","DOI":"10.1093\/nar\/gkad1051","article-title":"The do-kb knowledgebase: a 20-year journey developing the disease open science ecosystem","volume":"52","author":"Baron","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B5","first-page":"115","volume-title":"Linked Data\u2014The Story So Far","author":"Bizer","year":"2023"},{"key":"2024111406105538400_btae639-B6","doi-asserted-by":"crossref","first-page":"D267","DOI":"10.1093\/nar\/gkh061","article-title":"The unified medical language system (UMLS): integrating biomedical terminology","volume":"32","author":"Bodenreider","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B7","first-page":"2206","author":"Borgeaud"},{"key":"2024111406105538400_btae639-B8","first-page":"1","article-title":"The second generation of the PLINK software for genotype data","volume":"4","author":"Chang","year":"2015","journal-title":"GigaScience"},{"key":"2024111406105538400_btae639-B9","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1016\/S0140-6736(09)62124-3","article-title":"Diabetic retinopathy","volume":"376","author":"Cheung","year":"2010","journal-title":"Lancet"},{"key":"2024111406105538400_btae639-B10","doi-asserted-by":"crossref","first-page":"giz082","DOI":"10.1093\/gigascience\/giz082","article-title":"PRSice-2: polygenic risk score software for biobank-scale data","volume":"8","author":"Choi","year":"2019","journal-title":"Gigascience"},{"key":"2024111406105538400_btae639-B11","first-page":"e1006498","article-title":"Power and predictive accuracy of polygenic risk scores","volume":"12","author":"Dudbridge","year":"2016","journal-title":"PLOS Genet"},{"key":"2024111406105538400_btae639-B12","first-page":"406","article-title":"Is vesicolithotomy with bladder wash the answer for rectovesical fistula secondary to neglected vesical stone? Complicated presentation but simple management","volume":"35","author":"Elnaim","year":"2014","journal-title":"Saudi Med J"},{"key":"2024111406105538400_btae639-B14","article-title":"Delphi: a deep-learning framework for polygenic risk prediction","author":"Georgantas","year":"2024","journal-title":"medRxiv"},{"key":"2024111406105538400_btae639-B15","doi-asserted-by":"crossref","first-page":"e1007081","DOI":"10.1371\/journal.pgen.1007081","article-title":"Orienting the causal relationship between imprecisely measured traits using gwas summary data","volume":"13","author":"Hemani","year":"2017","journal-title":"PLoS Genet"},{"key":"2024111406105538400_btae639-B16","first-page":"841","article-title":"Does water kill? A call for less casual causal inferences","volume":"28","author":"Hern\u00e1n","year":"2018","journal-title":"Ann Epidemiol"},{"key":"2024111406105538400_btae639-B17","doi-asserted-by":"crossref","first-page":"e1000353","DOI":"10.1371\/journal.pcbi.1000353","article-title":"A dynamic network approach for the study of human phenotypes","volume":"5","author":"Hidalgo","year":"2009","journal-title":"PLoS Comput Biol"},{"key":"2024111406105538400_btae639-B18","first-page":"295","article-title":"The environment and disease: association or causation?","volume":"58","author":"Hill","year":"1965","journal-title":"Proc R Soc Med"},{"key":"2024111406105538400_btae639-B19","doi-asserted-by":"crossref","first-page":"100316","DOI":"10.1016\/j.jhepr.2021.100316","article-title":"Portal hypertension in cirrhosis: pathophysiological mechanisms and therapy","volume":"3","author":"Iwakiri","year":"2021","journal-title":"JHEP Rep"},{"key":"2024111406105538400_btae639-B20","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/MIC.2021.3133551","article-title":"Causalkg: causal knowledge graph explainability using interventional and counterfactual reasoning","volume":"26","author":"Jaimini","year":"2022","journal-title":"IEEE Internet Comput"},{"key":"2024111406105538400_btae639-B21","author":"Jiralerspong"},{"key":"2024111406105538400_btae639-B22","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1007\/s41666-022-00116-z","article-title":"Informative causality extraction from medical literature via dependency-tree\u2013based patterns","volume":"6","author":"Kabir","year":"2022","journal-title":"J Healthc Inform Res"},{"key":"2024111406105538400_btae639-B23","doi-asserted-by":"crossref","first-page":"102229","DOI":"10.1016\/j.ijinfomgt.2020.102229","article-title":"Which similarity measure to use in network analysis: impact of sample size on phi correlation coefficient and ochiai index","volume":"55","author":"Kalgotra","year":"2020","journal-title":"Int J Inf Manage"},{"key":"2024111406105538400_btae639-B24","doi-asserted-by":"crossref","first-page":"1219","DOI":"10.1038\/s41588-018-0183-z","article-title":"Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations","volume":"50","author":"Khera","year":"2018","journal-title":"Nat Genet"},{"key":"2024111406105538400_btae639-B25","doi-asserted-by":"crossref","first-page":"e1004125","DOI":"10.1371\/journal.pcbi.1004125","article-title":"Quantification of diabetes comorbidity risks across life using nation-wide big claims data","volume":"11","author":"Klimek","year":"2015","journal-title":"PLoS Comput Biol"},{"key":"2024111406105538400_btae639-B26","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1016\/j.autrev.2012.04.002","article-title":"Anti-glomerular basement membrane antibody disease: a rare autoimmune disorder affecting the kidney and the lung","volume":"12","author":"Lahmer","year":"2012","journal-title":"Autoimmun Rev"},{"key":"2024111406105538400_btae639-B27","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1038\/s41588-021-00783-5","article-title":"The polygenic score catalog as an open database for reproducibility and systematic evaluation","volume":"53","author":"Lambert","year":"2021","journal-title":"Nat Genet"},{"key":"2024111406105538400_btae639-B28","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1186\/s12911-017-0448-y","article-title":"Disease causality extraction based on lexical semantics and document-clause frequency from biomedical literature","volume":"17","author":"Lee","year":"2017","journal-title":"BMC Med Inform Decis Mak"},{"key":"2024111406105538400_btae639-B29","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/s41572-019-0106-z","article-title":"Atherosclerosis","volume":"5","author":"Libby","year":"2019","journal-title":"Nat Rev Dis Primers"},{"key":"2024111406105538400_btae639-B30","doi-asserted-by":"crossref","first-page":"814502","DOI":"10.3389\/fcvm.2022.814502","article-title":"Dissecting the polygenic basis of primary hypertension: identification of key pathway-specific components","volume":"9","author":"Maj","year":"2022","journal-title":"Front Cardiovasc Med"},{"key":"2024111406105538400_btae639-B31","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1038\/s41591-020-0800-0","article-title":"Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers","volume":"26","author":"Mars","year":"2020","journal-title":"Nat Med"},{"key":"2024111406105538400_btae639-B32","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1007\/s11901-023-00601-y","article-title":"Portal hypertension in alcohol-associated hepatitis","volume":"22","author":"McConnell","year":"2023","journal-title":"Curr Hepatol Rep"},{"key":"2024111406105538400_btae639-B33","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1080\/00401706.1989.10488618","article-title":"Statistical power analysis for the behavioral sciences","volume":"31","author":"Muller","year":"1989","journal-title":"Technometrics"},{"key":"2024111406105538400_btae639-B34","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1159\/000209837","article-title":"Vesicorectal fistula, case report and review of literature","volume":"2","author":"Naguib","year":"2009","journal-title":"Curr Urol"},{"key":"2024111406105538400_btae639-B35","author":"National Institute for Health and Care Excellence","year":"2015"},{"key":"2024111406105538400_btae639-B36","author":"NCBI","year":"1996."},{"key":"2024111406105538400_btae639-B37","author":"OpenAI"},{"key":"2024111406105538400_btae639-B38","first-page":"e93","article-title":"Polygenic risk scores for cardiovascular disease: a scientific statement from the american heart association","volume":"146","author":"O'Sullivan","year":"2022","journal-title":"Circulation"},{"key":"2024111406105538400_btae639-B39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.1399-0004.2006.00708.x","article-title":"The modular nature of genetic diseases","volume":"71","author":"Oti","year":"2006","journal-title":"Clin Genet"},{"key":"2024111406105538400_btae639-B40","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511803161","volume-title":"Causality: Models, Reasoning, and Inference","author":"Pearl","year":"2009","edition":"2nd edn"},{"key":"2024111406105538400_btae639-B41","doi-asserted-by":"crossref","first-page":"2555","DOI":"10.3390\/jcm10122555","article-title":"A comprehensive review of complications and new findings associated with anorexia nervosa","volume":"10","author":"Puckett","year":"2021","journal-title":"J Clin Med"},{"key":"2024111406105538400_btae639-B42","doi-asserted-by":"crossref","first-page":"79","DOI":"10.3233\/AO-150147","article-title":"Causality and the ontology of disease","volume":"10","author":"Rovetto","year":"2015","journal-title":"AO"},{"key":"2024111406105538400_btae639-B43","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1016\/S2213-8587(20)30161-3","article-title":"Gestational diabetes: opportunities for improving maternal and child health","volume":"8","author":"Saravanan","year":"2020","journal-title":"Lancet Diabetes Endocrinol"},{"key":"2024111406105538400_btae639-B44","doi-asserted-by":"crossref","first-page":"D955","DOI":"10.1093\/nar\/gky1032","article-title":"Human disease ontology 2018 update: classification, content and workflow expansion","volume":"47","author":"Schriml","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B45","doi-asserted-by":"crossref","first-page":"D1255","DOI":"10.1093\/nar\/gkab1063","article-title":"The human disease ontology 2022 update","volume":"50","author":"Schriml","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B46","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1038\/s41588-020-00757-z","article-title":"Genetics of 35 blood and urine biomarkers in the UK biobank","volume":"53","author":"Sinnott-Armstrong","year":"2021","journal-title":"Nat Genet"},{"key":"2024111406105538400_btae639-B47","doi-asserted-by":"crossref","first-page":"D977","DOI":"10.1093\/nar\/gkac1010","article-title":"The NHGRI-EBI GWAS catalog: knowledgebase and deposition resource","volume":"51","author":"Sollis","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024111406105538400_btae639-B48","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nrg3461","article-title":"Pleiotropy in complex traits: challenges and strategies","volume":"14","author":"Solovieff","year":"2013","journal-title":"Nat Rev Genet"},{"key":"2024111406105538400_btae639-B49","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1136\/jamia.2009.001230","article-title":"International classification of diseases, clinical modification and procedure coding system: descriptive overview of the next generation hipaa code sets","volume":"17","author":"Steindel","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2024111406105538400_btae639-B50","doi-asserted-by":"crossref","first-page":"e1001779","DOI":"10.1371\/journal.pmed.1001779","article-title":"UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of Middle and old age","volume":"12","author":"Sudlow","year":"2015","journal-title":"PLoS Med"},{"key":"2024111406105538400_btae639-B51","doi-asserted-by":"publisher","author":"Vasilevsky","DOI":"10.1101\/2022.04.13.22273750"},{"key":"2024111406105538400_btae639-B52","doi-asserted-by":"crossref","first-page":"1048","DOI":"10.1161\/01.CIR.46.6.1048","article-title":"Pathology of angina pectoris","volume":"46","author":"Vlodaver","year":"1972","journal-title":"Circulation"},{"key":"2024111406105538400_btae639-B53","doi-asserted-by":"crossref","first-page":"1339","DOI":"10.1038\/s41588-019-0481-0","article-title":"A global overview of pleiotropy and genetic architecture in complex traits","volume":"51","author":"Watanabe","year":"2019","journal-title":"Nat Genet"},{"key":"2024111406105538400_btae639-B54","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/s13073-024-01304-9","article-title":"Recent advances in polygenic scores: translation, equitability, methods and fair tools","volume":"16","author":"Xiang","year":"2024","journal-title":"Genome Med"},{"key":"2024111406105538400_btae639-B13","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1080\/09513590.2021.2005782","article-title":"The role of FGF-4 and FGFR-2 on preimplantation embryo development in experimental maternal diabetes","volume":"38","author":"Yilmaz","year":"2022","journal-title":"Gynecol Endocrinol"},{"key":"2024111406105538400_btae639-B55","first-page":"1","article-title":"Learning disease causality knowledge from the web of health data","author":"Yu","year":"2022"},{"key":"2024111406105538400_btae639-B56","doi-asserted-by":"crossref","first-page":"bbad181","DOI":"10.1093\/bib\/bbad181","article-title":"Integrating multiple traits for improving polygenic risk prediction in disease and pharmacogenomics gwas","volume":"24","author":"Zhai","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024111406105538400_btae639-B57","author":"Zhang"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae639\/60129222\/btae639.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae639\/60658163\/btae639.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae639\/60658163\/btae639.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,14]],"date-time":"2024-11-14T06:11:17Z","timestamp":1731564677000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae639\/7845254"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,10,26]]},"references-count":57,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae639","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,10,26]]},"article-number":"btae639"}}