{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,5]],"date-time":"2026-04-05T09:13:46Z","timestamp":1775380426708,"version":"3.50.1"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"11","license":[{"start":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T00:00:00Z","timestamp":1698278400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T00:00:00Z","timestamp":1698278400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Nat Mach Intell"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Causal learning is a key challenge in scientific artificial intelligence as it allows researchers to go beyond purely correlative or predictive analyses towards learning underlying cause-and-effect relationships, which are important for scientific understanding as well as for a wide range of downstream tasks. Here, motivated by emerging biomedical questions, we propose a deep neural architecture for learning causal relationships between variables from a combination of high-dimensional data and prior causal knowledge. We combine convolutional and graph neural networks within a causal risk framework to provide an approach that is demonstrably effective under the conditions of high dimensionality, noise and data limitations that are characteristic of many applications, including in large-scale biology. In experiments, we find that the proposed learners can effectively identify novel causal relationships across thousands of variables. 
Results include extensive (linear and nonlinear) simulations (where the ground truth is known and can be directly compared against), as well as real biological examples where the models are applied to high-dimensional molecular data and their outputs compared against entirely unseen validation experiments. These results support the notion that deep learning approaches can be used to learn causal networks at large scale.<\/jats:p>","DOI":"10.1038\/s42256-023-00744-z","type":"journal-article","created":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T16:02:45Z","timestamp":1698336165000},"page":"1306-1316","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":45,"title":["Deep learning of causal structures in high dimensions under data limitations"],"prefix":"10.1038","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8485-7682","authenticated-orcid":false,"given":"Kai","family":"Lagemann","sequence":"first","affiliation":[]},{"given":"Christian","family":"Lagemann","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6574-4789","authenticated-orcid":false,"given":"Bernd","family":"Taschler","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0390-4358","authenticated-orcid":false,"given":"Sach","family":"Mukherjee","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,10,26]]},"reference":[{"key":"744_CR1","volume-title":"Elements of Causal Inference","author":"J Peters","year":"2017","unstructured":"Peters, J., Janzing, D. & Sch\u00f6lkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms (MIT Press, 2017)."},{"key":"744_CR2","unstructured":"Arjovsky, M., Bottou, L., Gulrajani, I. & Lopez-Paz, D. Invariant risk minimization. 
Preprint at https:\/\/arxiv.org\/abs\/1907.02893 (2019)."},{"key":"744_CR3","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1146\/annurev-statistics-031017-100630","volume":"5","author":"C Heinze-Deml","year":"2018","unstructured":"Heinze-Deml, C., Maathuis, M. H. & Meinshausen, N. Causal structure learning. Annu. Rev. Stat. Appl. 5, 371\u2013391 (2018).","journal-title":"Annu. Rev. Stat. Appl."},{"key":"744_CR4","volume-title":"Causation","author":"P Spirtes","year":"2000","unstructured":"Spirtes, P., Glymour, C. & Scheines, R. Causation, Prediction and Search (MIT Press, 2000)."},{"key":"744_CR5","first-page":"2003","volume":"7","author":"S Shimizu","year":"2006","unstructured":"Shimizu, S., Hoyer, P. O., Hyv\u00e4rinen, A. & Kerminen, A. A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 7, 2003\u20132030 (2006).","journal-title":"J. Mach. Learn. Res."},{"key":"744_CR6","doi-asserted-by":"publisher","first-page":"3133","DOI":"10.1214\/09-AOS685","volume":"37","author":"MH Maathuis","year":"2009","unstructured":"Maathuis, M. H., Kalisch, M. & B\u00fchlmann, P. Estimating high-dimensional intervention effects from observational data. Ann. Stat. 37, 3133\u20133164 (2009).","journal-title":"Ann. Stat."},{"key":"744_CR7","first-page":"2409","volume":"13","author":"A Hauser","year":"2012","unstructured":"Hauser, A. & B\u00fchlmann, P. Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. J. Mach. Learn. Res. 13, 2409\u20132464 (2012).","journal-title":"J. Mach. Learn. Res."},{"key":"744_CR8","doi-asserted-by":"publisher","first-page":"294","DOI":"10.1214\/11-AOS940","volume":"40","author":"D Colombo","year":"2012","unstructured":"Colombo, D., Maathuis, M. H., Kalisch, M. & Richardson, T. S. Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Stat. 40, 294\u2013321 (2012).","journal-title":"Ann. 
Stat."},{"key":"744_CR9","doi-asserted-by":"publisher","first-page":"947","DOI":"10.1111\/rssb.12167","volume":"78","author":"J Peters","year":"2016","unstructured":"Peters, J., B\u00fchlmann, P. & Meinshausen, N. Causal inference using invariant prediction: identification and confidence intervals. J. R. Stat. Soc. 78, 947\u20131012 (2016).","journal-title":"J. R. Stat. Soc."},{"key":"744_CR10","first-page":"127","volume":"20","author":"SM Hill","year":"2019","unstructured":"Hill, S. M., Oates, C. J., Blythe, D. A. & Mukherjee, S. Causal learning via manifold regularization. J. Mach. Learn. Res. 20, 127 (2019).","journal-title":"J. Mach. Learn. Res."},{"key":"744_CR11","unstructured":"Zheng, X., Aragam, B., Ravikumar, P. K. & Xing, E. P. DAGs with no tears: continuous optimization for structure learning. In Proc. Advances in Neural Information Processing Systems Vol. 31, 9472\u20139483 (eds Bengio, S. et al.) (Curran Associates, 2018)."},{"key":"744_CR12","unstructured":"Ke, N. R. et al. Learning neural causal models from unknown interventions. Preprint at https:\/\/arxiv.org\/abs\/1910.01075 (2019)."},{"key":"744_CR13","first-page":"21865","volume":"33","author":"P Brouillard","year":"2020","unstructured":"Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S. & Drouin, A. Differentiable causal discovery from interventional data. Adv. Neural Inf. Process. Syst. 33, 21865\u201321877 (2020).","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"744_CR14","first-page":"19290","volume":"35","author":"R Lopez","year":"2022","unstructured":"Lopez, R., H\u00fctter, J.-C., Pritchard, J. & Regev, A. Large-scale differentiable causal discovery of factor graphs. Adv. Neural Inf. Process. Syst. 35, 19290\u201319303 (2022).","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"744_CR15","unstructured":"Lippe, P., Cohen, T. & Gavves, E. Efficient neural causal discovery without acyclicity constraints. 
In International Conference on Learning Representations (2022)."},{"key":"744_CR16","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1038\/msb.2011.99","volume":"8","author":"T Ideker","year":"2012","unstructured":"Ideker, T. & Krogan, N. J. Differential network biology. Mol. Syst. Biol. 8, 565 (2012).","journal-title":"Mol. Syst. Biol."},{"key":"744_CR17","doi-asserted-by":"publisher","first-page":"310","DOI":"10.1038\/nmeth.3773","volume":"13","author":"SM Hill","year":"2016","unstructured":"Hill, S. M. et al. Inferring causal molecular networks: empirical assessment through a community-based effort. Nat. Methods 13, 310\u2013318 (2016).","journal-title":"Nat. Methods"},{"key":"744_CR18","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1016\/j.cels.2016.11.013","volume":"4","author":"SM Hill","year":"2017","unstructured":"Hill, S. M. et al. Context specificity in causal signaling networks revealed by phosphoprotein profiling. Cell Syst. 4, 73\u201383 (2017).","journal-title":"Cell Syst."},{"key":"744_CR19","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1038\/s41568-020-0240-7","volume":"20","author":"BM Kuenzi","year":"2020","unstructured":"Kuenzi, B. M. & Ideker, T. A census of pathway maps in cancer systems biology. Nat. Rev. Cancer 20, 233\u2013246 (2020).","journal-title":"Nat. Rev. Cancer"},{"key":"744_CR20","unstructured":"Lopez-Paz, D., Muandet, K., Sch\u00f6lkopf, B. & Tolstikhin, I. Towards a learning theory of cause-effect inference. In Proc. 32nd International Conference on Machine Learning Vol. 37, 1452\u20131461 (eds Bach, F. et al.) (PMLR, 2015)."},{"key":"744_CR21","first-page":"1","volume":"17","author":"JM Mooij","year":"2016","unstructured":"Mooij, J. M., Peters, J., Janzing, D., Zscheischler, J. & Sch\u00f6lkopf, B. Distinguishing cause from effect using observational data: methods and benchmarks. J. Mach. Learn. Res. 17, 1\u2013102 (2016).","journal-title":"J. Mach. Learn. 
Res."},{"key":"744_CR22","unstructured":"No\u00e8, U., Taschler, B., T\u00e4ger, J., Heutink, P. & Mukherjee, S. Ancestral causal learning in high dimensions with a human genome-wide application. Preprint at https:\/\/arxiv.org\/abs\/1905.11506 (2019)."},{"key":"744_CR23","unstructured":"Eigenmann, M., Mukherjee, S. & Maathuis, M. Evaluation of causal structure learning algorithms via risk estimation. In Proc. 36th Conference of Uncertainty in Artificial Intelligence 2020, UAI 2020 Vol. 124, 151\u2013160 (eds Peters, J. et al.) (PMLR, 2020)."},{"key":"744_CR24","unstructured":"Ke, N. R. et al. Learning to induce causal structure. Preprint at https:\/\/arxiv.org\/abs\/2204.04875 (2022)."},{"key":"744_CR25","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1016\/j.cell.2014.02.054","volume":"157","author":"P Kemmeren","year":"2014","unstructured":"Kemmeren, P. et al. Large-scale genetic perturbations reveal regulatory networks and an abundance of gene-specific repressors. Cell 157, 740\u2013752 (2014).","journal-title":"Cell"},{"key":"744_CR26","doi-asserted-by":"publisher","first-page":"7361","DOI":"10.1073\/pnas.1510493113","volume":"113","author":"N Meinshausen","year":"2016","unstructured":"Meinshausen, N. et al. Methods for causal inference from gene perturbation experiments and validation. Proc. Natl Acad. Sci. USA 113, 7361\u20137368 (2016).","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"744_CR27","first-page":"1437","volume":"9","author":"J Zhang","year":"2008","unstructured":"Zhang, J. Causal reasoning with ancestral graphs. J. Mach. Learn. Res. 9, 1437\u20131474 (2008).","journal-title":"J. Mach. Learn. Res."},{"key":"744_CR28","doi-asserted-by":"crossref","unstructured":"Alon, U. An Introduction to Systems Biology: Design Principles of Biological Circuits (CRC Press, 2019).","DOI":"10.1201\/9780429283321"},{"key":"744_CR29","first-page":"3387","volume":"13","author":"A Hyttinen","year":"2012","unstructured":"Hyttinen, A., Eberhardt, F. 
& Hoyer, P. O. Learning linear cyclic causal models with latent variables. J. Mach. Learn. Res. 13, 3387\u20133439 (2012).","journal-title":"J. Mach. Learn. Res."},{"key":"744_CR30","doi-asserted-by":"publisher","first-page":"981","DOI":"10.1086\/525638","volume":"74","author":"F Eberhardt","year":"2007","unstructured":"Eberhardt, F. & Scheines, R. Interventions and causal inference. Philos. Sci. 74, 981\u2013995 (2007).","journal-title":"Philos. Sci."},{"key":"744_CR31","unstructured":"Kocaoglu, M., Shanmugam, K. & Bareinboim, E. Experimental design for learning causal graphs with latent variables. In Proc. Advances in Neural Information Processing Systems Vol. 30, 7018\u20137028 (eds Guyon, I. et al.) (Curran Associates, 2017)."},{"key":"744_CR32","doi-asserted-by":"publisher","first-page":"2559","DOI":"10.1016\/j.cell.2022.05.013","volume":"185","author":"JM Replogle","year":"2022","unstructured":"Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559\u20132575 (2022).","journal-title":"Cell"},{"key":"744_CR33","unstructured":"Sch\u00f6lkopf, B. et al. On causal and anticausal learning. In Proc. 29th International Conference on Machine Learning, ICML 2012 459\u2013466 (eds Langford, J. et al.) (icml.cc\/Omnipress, 2012)."},{"key":"744_CR34","unstructured":"Silverman, B. W. Density Estimation for Statistics and Data Analysis (Chapman & Hall, 1986)."},{"key":"744_CR35","unstructured":"Turlach, B. Bandwidth selection in kernel density estimation: a review. Technical Report (1999)."},{"key":"744_CR36","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770\u2013778 (IEEE, 2016).","DOI":"10.1109\/CVPR.2016.90"},{"key":"744_CR37","doi-asserted-by":"crossref","unstructured":"Szegedy, C. et al. Going deeper with convolutions. In Proc. 
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 1\u20139 (IEEE, 2015).","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"744_CR38","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proc. 2015 IEEE International Conference on Computer Vision (ICCV) 1026\u20131034 (IEEE, 2015).","DOI":"10.1109\/ICCV.2015.123"},{"key":"744_CR39","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z. & He, K. Aggregated residual transformations for deep neural networks. In Proc. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 5987\u20135995 (IEEE, 2017).","DOI":"10.1109\/CVPR.2017.634"},{"key":"744_CR40","unstructured":"Zhang, M. & Chen, Y. Link prediction based on graph neural networks. In Proc. Advances in Neural Information Processing Systems 2018 Vol. 31, 5165\u20135175 (eds Bengio, S. et al.) (Curran Associates, 2018)."},{"key":"744_CR41","doi-asserted-by":"publisher","unstructured":"Chen, D. et al. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. Computing Research Repository (CoRR) https:\/\/doi.org\/10.1609\/aaai.v34i04.5747 (2019).","DOI":"10.1609\/aaai.v34i04.5747"},{"key":"744_CR42","doi-asserted-by":"crossref","unstructured":"Li, Q., Han, Z. & Wu, X.-M. Deeper insights into graph convolutional networks for semi-supervised learning. In Proc. 32nd AAAI Conference on Artificial Intelligence 3538\u20133545 (eds McIlraith, S. et al.) (AAAI, 2018).","DOI":"10.1609\/aaai.v32i1.11604"},{"key":"744_CR43","doi-asserted-by":"crossref","unstructured":"Zhang, M., Cui, Z., Neumann, M. & Chen, Y. An end-to-end deep learning architecture for graph classification. In Proc. 32nd AAAI Conference on Artificial Intelligence 4438\u20134445 (eds McIlraith, S. et al.) 
(AAAI, 2018).","DOI":"10.1609\/aaai.v32i1.11782"},{"key":"744_CR44","unstructured":"Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. Advances in Neural Information Processing Systems Vol. 32, 8026\u20138037 (eds Wallach, H. et al.) (Curran Associates, 2019)."},{"key":"744_CR45","unstructured":"Wang, M. et al. Deep Graph Library: a graph-centric, highly-performant package for graph neural networks. Preprint at https:\/\/arxiv.org\/abs\/1909.01315 (2019)."},{"key":"744_CR46","unstructured":"Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In 3rd International Conference on Learning Representations (2015)."},{"key":"744_CR47","doi-asserted-by":"crossref","unstructured":"Lagemann, K., Lagemann, C., Taschler, B. & Mukherjee, S. Deep learning of causal structures in high dimensions under data limitations. CodeOcean https:\/\/codeocean.com\/capsule\/4465854\/tree\/v1 (2023).","DOI":"10.1038\/s42256-023-00744-z"}],"container-title":["Nature Machine 
Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00744-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00744-z","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00744-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,19]],"date-time":"2023-11-19T21:36:56Z","timestamp":1700429816000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s42256-023-00744-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,26]]},"references-count":47,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["744"],"URL":"https:\/\/doi.org\/10.1038\/s42256-023-00744-z","relation":{},"ISSN":["2522-5839"],"issn-type":[{"value":"2522-5839","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,26]]},"assertion":[{"value":"13 April 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 September 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 October 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}