{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T14:00:55Z","timestamp":1760623255094,"version":"3.37.3"},"reference-count":61,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T00:00:00Z","timestamp":1656288000000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Data Model Convergence Initiative at Pacific Northwest National Laboratory"},{"name":"Laboratory Directed Research and Development Program at PNNL"},{"DOI":"10.13039\/100000015","name":"U.S. Department of Energy","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000015","id-type":"DOI","asserted-by":"publisher"}]},{"name":"DARPA Young Faculty","award":["W911NF2010255","# 574137","NSF-BIO\/DBI 1759736","NSF-BIO\/DBI 1950412","NIH-NLM-R01 1R01LM013115"],"award-info":[{"award-number":["W911NF2010255","# 574137","NSF-BIO\/DBI 1759736","NSF-BIO\/DBI 1950412","NIH-NLM-R01 1R01LM013115"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,6,24]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Estimating causal queries, such as changes in protein abundance in response to a perturbation, is a fundamental task in the analysis of biomolecular pathways. The estimation requires experimental measurements on the pathway components. However, in practice many pathway components are left unobserved (latent) because they are either unknown, or difficult to measure. Latent variable models (LVMs) are well-suited for such estimation. Unfortunately, LVM-based estimation of causal queries can be inaccurate when parameters of the latent variables are not uniquely identified, or when the number of latent variables is misspecified. 
This has limited the use of LVMs for causal inference in biomolecular pathways.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this article, we propose a general and practical approach for LVM-based estimation of causal queries. We prove that, despite the challenges above, LVM-based estimators of causal queries are accurate if the queries are identifiable according to Pearl\u2019s do-calculus and describe an algorithm for its estimation. We illustrate the breadth and the practical utility of this approach for estimating causal queries in four synthetic and two experimental case studies, where structures of biomolecular pathways challenge the existing methods for causal query estimation.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The code and the data documenting all the case studies are available at https:\/\/github.com\/srtaheri\/LVMwithDoCalculus.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac251","type":"journal-article","created":{"date-parts":[[2022,4,14]],"date-time":"2022-04-14T11:10:15Z","timestamp":1649934615000},"page":"i350-i358","source":"Crossref","is-referenced-by-count":6,"title":["Do-calculus enables estimation of causal effects in partially observed biomolecular pathways"],"prefix":"10.1093","volume":"38","author":[{"given":"Sara","family":"Mohammad-Taheri","sequence":"first","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University , Boston, MA 02115, USA"}]},{"given":"Jeremy","family":"Zucker","sequence":"additional","affiliation":[{"name":"Computational Biology, Pacific Northwest National Laboratory , Richland, WA 99354, USA"}]},{"given":"Charles Tapley","family":"Hoyt","sequence":"additional","affiliation":[{"name":"Laboratory of Systems Pharmacology, Harvard Medical 
School , Boston, MA 02115, USA"}]},{"given":"Karen","family":"Sachs","sequence":"additional","affiliation":[{"name":"Next Generation Analytics , Palo Alto, CA 94301, USA"},{"name":"Answer ALS Consortium , LA, CA 70184, USA"}]},{"given":"Vartika","family":"Tewari","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University , Boston, MA 02115, USA"}]},{"given":"Robert","family":"Ness","sequence":"additional","affiliation":[{"name":"Microsoft Research , Redmond, WA 98052, USA"}]},{"given":"Olga","family":"Vitek","sequence":"additional","affiliation":[{"name":"Khoury College of Computer Sciences, Northeastern University , Boston, MA 02115, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,6,27]]},"reference":[{"key":"2023041407560837900_","doi-asserted-by":"crossref","DOI":"10.1201\/9780429283321","volume-title":"An Introduction to Systems Biology: Design Principles of Biological Circuits","author":"Alon","year":"2019"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1162\/neco.1997.9.2.349","article-title":"Statistical inference, Occam\u2019s razor, and statistical mechanics on the space of probability distributions","volume":"9","author":"Balasubramanian","year":"1997","journal-title":"Neural Comput"},{"year":"2020","author":"Bhattacharya","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.biotechadv.2011.05.010","article-title":"Computational model of EGFR and IGF1R pathways in lung cancer: a systems biology approach for translational oncology","volume":"30","author":"Bianconi","year":"2012","journal-title":"Biotechnol. Adv"},{"key":"2023041407560837900_","first-page":"973","article-title":"Pyro: deep universal probabilistic programming","volume":"20","author":"Bingham","year":"2019","journal-title":"J. Mach. Learn. 
Res"},{"volume-title":"Pattern Recognition and Machine Learning","year":"2006","author":"Bishop","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","article-title":"Variational inference: a review for statisticians","volume":"112","author":"Blei","year":"2017","journal-title":"J. Am. Stat. Assoc"},{"first-page":"9867","year":"2021","author":"Cannon","key":"2023041407560837900_"},{"year":"2019","author":"D\u2019Amour","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis","author":"Durbin","year":"1998"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1086\/525638","article-title":"Interventions and causal inference","volume":"74","author":"Eberhardt","year":"2007","journal-title":"Philos. Sci"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1038\/msb4100115","article-title":"Reconstructing dynamic regulatory maps","volume":"3","author":"Ernst","year":"2007","journal-title":"Mol. Syst. Biol"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1111\/sjos.12194","article-title":"Graphs for margins of Bayesian networks","volume":"43","author":"Evans","year":"2016","journal-title":"Scand. J. Statist"},{"year":"2013","author":"Galles","key":"2023041407560837900_"},{"key":"2023041407560837900_","volume-title":"Bayesian Data Analysis","author":"Gelman","year":"2014","edition":"3rd edn"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"2340","DOI":"10.1021\/j100540a008","article-title":"Exact stochastic simulation of coupled chemical reactions","volume":"81","author":"Gillespie","year":"1977","journal-title":"J. Phys. 
Chem"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1111\/j.1467-9868.2010.00765.x","article-title":"Riemann manifold Langevin and Hamiltonian Monte Carlo methods","volume":"73","author":"Girolami","year":"2011","journal-title":"J. R. Stat. Soc. B"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"954","DOI":"10.15252\/msb.20177651","article-title":"From word models to executable models of signaling networks using automated assembly","volume":"13","author":"Gyori","year":"2017","journal-title":"Mol. Syst. Biol"},{"first-page":"1030","year":"2021","author":"Helske","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1016\/j.immuni.2020.04.003","article-title":"COVID-19: a new virus, but a familiar receptor and cytokine release syndrome","volume":"52","author":"Hirano","year":"2020","journal-title":"Immunity"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1093\/bioinformatics\/btx660","article-title":"PyBEL: a computational framework for biological expression language","volume":"34","author":"Hoyt","year":"2018","journal-title":"Bioinformatics"},{"year":"2006","author":"Huang","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"1823","DOI":"10.1097\/00002030-199814000-00014","article-title":"CD4 cell count as a surrogate endpoint in HIV clinical trials: a meta-analysis of studies of the aids clinical trials group","volume":"12","author":"Hughes","year":"1998","journal-title":"Aids"},{"year":"2020","author":"Jung","key":"2023041407560837900_"},{"year":"2021","author":"Jung","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1093\/bib\/bbz104","article-title":"Pathway tools version 23.0 update: software for pathway\/genome informatics and systems 
biology","volume":"22","author":"Karp","year":"2021","journal-title":"Brief. Bioinform"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"2098","DOI":"10.3389\/fmicb.2021.711077","article-title":"The ECOCYC database in 2021","volume":"12","author":"Keseler","year":"2021","journal-title":"Front. Microbiol"},{"volume-title":"Probabilistic Graphical Models: Principles and Techniques","year":"2009","author":"Koller","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1049\/iet-syb.2014.0013","article-title":"Identifying latent dynamic components in biological systems","volume":"9","author":"Kondofersky","year":"2015","journal-title":"IET Syst. Biol"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1093\/biomet\/ast066","article-title":"Measurement bias and effect restoration in causal inference","volume":"101","author":"Kuroki","year":"2014","journal-title":"Biometrika"},{"year":"2019","author":"Lattimore","key":"2023041407560837900_"},{"year":"2019","author":"Lattimore","key":"2023041407560837900_"},{"key":"2023041407560837900_","first-page":"6446","volume-title":"Advances in Neural Information Processing Systems, Long Beach, CA.","author":"Louizos","year":"2017"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"11980","DOI":"10.1073\/pnas.2133841100","article-title":"Structure and function of the feed-forward loop network motif","volume":"100","author":"Mangan","year":"2003","journal-title":"Proc. Natl. Acad. Sci. USA"},{"first-page":"2968","year":"2021","author":"McNaughton","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1214\/ss\/1177010894","article-title":"Bayesian analysis in expert systems: comment: graphical models, causality and intervention","volume":"8","author":"Pearl","year":"1993","journal-title":"Stat. 
Sci"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"669","DOI":"10.1093\/biomet\/82.4.669","article-title":"Causal diagrams for empirical research","volume":"82","author":"Pearl","year":"1995","journal-title":"Biometrika"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511803161","volume-title":"Causality","author":"Pearl","year":"2009"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1145\/3241036","article-title":"The seven tools of causal inference, with reflections on machine learning","volume":"62","author":"Pearl","year":"2019","journal-title":"Commun. ACM"},{"volume-title":"The Book of Why: The New Science of Cause and Effect","year":"2018","author":"Pearl","key":"2023041407560837900_"},{"key":"2023041407560837900_","first-page":"294","volume-title":"Advances in Neural Information Processing Systems, Vancouver, British Columbia, Canada","author":"Rasmussen","year":"2001"},{"year":"2017","author":"Richardson","key":"2023041407560837900_"},{"volume-title":"Monte Carlo Statistical Methods","year":"2013","author":"Robert","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-13483-w","article-title":"The Escherichia coli transcriptome mostly consists of independently regulated modules","volume":"10","author":"Sastry","year":"2019","journal-title":"Nat. Commun"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1089\/cmb.2008.0081","article-title":"Analysis of gene sets based on the underlying regulatory network","volume":"16","author":"Shojaie","year":"2009","journal-title":"J. Comput. 
Biol"},{"first-page":"1219","year":"2006","author":"Shpitser","key":"2023041407560837900_"},{"key":"2023041407560837900_","first-page":"1941","article-title":"Complete identification methods for the causal hierarchy","volume":"9","author":"Shpitser","year":"2008","journal-title":"J. Mach. Learn. Res"},{"year":"2012","author":"Shpitser","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"3","DOI":"10.2333\/bhmk.41.3","article-title":"Introduction to nested Markov models","volume":"41","author":"Shpitser","year":"2014","journal-title":"Behaviormetrika"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/j.drudis.2013.12.011","article-title":"Recent advances in modeling languages for pathway maps and computable biological networks","volume":"19","author":"Slater","year":"2014","journal-title":"Drug Discov. Today"},{"volume-title":"Causation, Prediction, and Search","year":"2000","author":"Spirtes","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"e1007424","DOI":"10.1371\/journal.pcbi.1007424","article-title":"Bayesian inference of metabolic kinetics from genome-scale multiomics data","volume":"15","author":"St John","year":"2019","journal-title":"PLoS Comput. Biol"},{"year":"2018","key":"2023041407560837900_"},{"year":"2020","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1016\/j.medmal.2020.04.002","article-title":"Interleukin-6 as a potential biomarker of COVID-19 progression","volume":"50","author":"Ulhaq","year":"2020","journal-title":"Med. Mal. 
Infect"},{"year":"2013","author":"Van Hoey","key":"2023041407560837900_"},{"year":"2020","author":"Wang","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"1574","DOI":"10.1080\/01621459.2019.1686987","article-title":"The blessings of multiple causes","volume":"114","author":"Wang","year":"2019","journal-title":"J. Am. Stat. Assoc"},{"year":"2018","author":"Wilkinson","key":"2023041407560837900_"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"105954","DOI":"10.1016\/j.ijantimicag.2020.105954","article-title":"Cytokine release syndrome in severe COVID-19: interleukin-6 receptor antagonist tocilizumab may be the key to reduce mortality","volume":"55","author":"Zhang","year":"2020","journal-title":"Int. J. Antimicrob. Agents"},{"key":"2023041407560837900_","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1109\/TBDATA.2021.3050680","article-title":"Leveraging structured biological knowledge for counterfactual inference: a case study of viral pathogenesis","volume":"7","author":"Zucker","year":"2021","journal-title":"IEEE Trans. 
Big Data"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/Supplement_1\/i350\/49886880\/btac251.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/Supplement_1\/i350\/49886880\/btac251.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,22]],"date-time":"2024-09-22T06:55:49Z","timestamp":1726988149000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/Supplement_1\/i350\/6617530"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,24]]},"references-count":61,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2022,6,24]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac251","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2022,7,1]]},"published":{"date-parts":[[2022,6,24]]}}}