{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T14:25:10Z","timestamp":1777386310889,"version":"3.51.4"},"reference-count":53,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2020,3,29]],"date-time":"2020-03-29T00:00:00Z","timestamp":1585440000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","award":["P2BSP2_184359"],"award-info":[{"award-number":["P2BSP2_184359"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012390","name":"SystemsX.ch","doi-asserted-by":"publisher","award":["51MRP0158328"],"award-info":[{"award-number":["51MRP0158328"]}],"id":[{"id":"10.13039\/501100012390","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Estimating the effects of an intervention from high-dimensional observational data is a challenging problem due to the existence of confounding. The task is often further complicated in healthcare applications where a set of observations may be entirely missing for certain patients at test time, thereby prohibiting accurate inference. In this paper, we address this issue using an approach based on the information bottleneck to reason about the effects of interventions. To this end, we first train an information bottleneck to perform a low-dimensional compression of covariates by explicitly considering the relevance of information for treatment effects. As a second step, we subsequently use the compressed covariates to perform a transfer of relevant information to cases where data are missing during testing. In doing so, we can reliably and accurately estimate treatment effects even in the absence of a full set of covariate information at test time. Our results on two causal inference benchmarks and a real application for treating sepsis show that our method achieves state-of-the-art performance, without compromising interpretability.<\/jats:p>","DOI":"10.3390\/e22040389","type":"journal-article","created":{"date-parts":[[2020,3,31]],"date-time":"2020-03-31T13:27:19Z","timestamp":1585661239000},"page":"389","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Information Bottleneck for Estimating Treatment Effects with Systematically Missing Covariates"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8400-3732","authenticated-orcid":false,"given":"Sonali","family":"Parbhoo","sequence":"first","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Basel, Basel CH 4051, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8737-3605","authenticated-orcid":false,"given":"Mario","family":"Wieser","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Basel, Basel CH 4051, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1565-1781","authenticated-orcid":false,"given":"Aleksander","family":"Wieczorek","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Basel, Basel CH 4051, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Volker","family":"Roth","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Basel, Basel CH 4051, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,3,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1228","DOI":"10.1080\/01621459.2017.1319839","article-title":"Estimation and inference of heterogeneous treatment effects using random forests","volume":"113","author":"Wager","year":"2017","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_2","unstructured":"Alaa, A.M., and van der Schaar, M. (2017, January 4\u20139). Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1257\/jel.47.1.5","article-title":"Recent developments in the econometrics of program evaluation","volume":"47","author":"Imbens","year":"2009","journal-title":"J. Econ. Lit."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1257\/jep.31.2.3","article-title":"The state of applied econometrics: Causality and policy evaluation","volume":"31","author":"Athey","year":"2017","journal-title":"J. Econ. Perspect."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1080\/01621459.1999.10473858","article-title":"Causal effects in nonexperimental studies: Reevaluating the evaluation of training programs","volume":"94","author":"Dehejia","year":"1999","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_6","unstructured":"Johansson, F.D., Shalit, U., and Sontag, D. (2016, January 19\u201324). Learning Representations for Counterfactual Inference. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Little, R.J., and Rubin, D.B. (2019). Statistical Analysis with Missing Data, John Wiley & Sons.","DOI":"10.1002\/9781119482260"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1093\/biomet\/63.3.581","article-title":"Inference and missing data","volume":"63","author":"Rubin","year":"1976","journal-title":"Biometrika"},{"key":"ref_9","unstructured":"Greenland, S., and Lash, T. (2008). Bias Analysis. Modern Epidemiology, Lippincott Williams & Wilkins."},{"key":"ref_10","unstructured":"Pearl, J. (2012). On measurement bias in causal inference. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1093\/biomet\/ast066","article-title":"Measurement bias and effect restoration in causal inference","volume":"101","author":"Kuroki","year":"2014","journal-title":"Biometrika"},{"key":"ref_12","unstructured":"Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., and Welling, M. (2017, January 4\u20139). Causal Effect Inference with Deep Latent-Variable Models. Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA."},{"key":"ref_13","unstructured":"Tishby, N., Pereira, F.C., and Bialek, W. (2000). The information bottleneck method. arXiv."},{"key":"ref_14","unstructured":"Alemi, A.A., Fischer, I., Dillon, J.V., and Murphy, K. (2016). Deep Variational Information Bottleneck. arXiv."},{"key":"ref_15","first-page":"1103","article-title":"Distinguishing cause from effect using observational data: methods and benchmarks","volume":"17","author":"Mooij","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_16","first-page":"2009","article-title":"Causal discovery with continuous additive noise models","volume":"15","author":"Peters","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_17","first-page":"1","article-title":"Sur les applications de la th\u00e9orie des probabilit\u00e9s aux experiences agricoles: Essai des principes","volume":"10","year":"1923","journal-title":"Roczniki Nauk Rolniczych"},{"key":"ref_18","first-page":"465","article-title":"On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9","volume":"5","year":"1990","journal-title":"Stat. Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1214\/aos\/1176344064","article-title":"Bayesian inference for causal effects: The role of randomization","volume":"6","author":"Rubin","year":"1978","journal-title":"Ann. Stat."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pearl, J. (2009). Causality, Cambridge University Press.","DOI":"10.1017\/CBO9780511803161"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Morgan, S.L., and Winship, C. (2015). Counterfactuals and Causal Inference, Cambridge University Press.","DOI":"10.1017\/CBO9781107587991"},{"key":"ref_22","unstructured":"Schulam, P., and Saria, S. (2017, January 4\u20139). Reliable Decision Support using Counterfactual Models. Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA."},{"key":"ref_23","unstructured":"Schulam, P., and Saria, S. (2017, January 4\u20139). What-If Reasoning using Counterfactual Gaussian Processes. Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA."},{"key":"ref_24","first-page":"3207","article-title":"Counterfactual reasoning and learning systems: The example of computational advertising","volume":"14","author":"Bottou","year":"2013","journal-title":"J. Mach. Learn. Res."},{"key":"ref_25","unstructured":"Dud\u00edk, M., Langford, J., and Li, L. (2011). Doubly robust policy evaluation and learning. arXiv."},{"key":"ref_26","unstructured":"Thomas, P., and Brunskill, E. (2016, January 19\u201324). Data-efficient off-policy policy evaluation for reinforcement learning. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_27","unstructured":"Jiang, N., and Li, L. (2016). Doubly Robust Off-policy Value Evaluation for Reinforcement Learning. arXiv."},{"key":"ref_28","unstructured":"Dawid, P. (2007). Fundamentals of Statistical Causality, University College London. Technical report; Department of Statistical Science."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1002\/sim.4124","article-title":"Estimating propensity scores with missing covariate data using general location mixture models","volume":"30","author":"Mitra","year":"2011","journal-title":"Stat. Med."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1037\/met0000076","article-title":"Propensity score analysis with missing data","volume":"21","author":"Cham","year":"2016","journal-title":"Psychol. Methods"},{"key":"ref_31","unstructured":"Kallus, N., Mao, X., and Udell, M. (2018, January 3\u20138). Causal inference with noisy and missing covariates via matrix factorization. Proceedings of the Advances in Neural Information Processing Systems 31 (NIPS 2018), Montr\u00e9al, QC, Canada."},{"key":"ref_32","first-page":"165","article-title":"Information bottleneck for Gaussian variables","volume":"6","author":"Chechik","year":"2005","journal-title":"J. Mach. Learn. Res."},{"key":"ref_33","unstructured":"Rey, M., and Roth, V. (2012, January 3\u20136). Meta-Gaussian Information Bottleneck. Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2897","DOI":"10.1109\/TPAMI.2017.2784440","article-title":"Information dropout: Learning optimal representations through noisy computation","volume":"40","author":"Achille","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_35","unstructured":"Wieczorek, A., Wieser, M., Murezzan, D., and Roth, V. (May, January 30). Learning Sparse Latent Representations with the Deep Copula Information Bottleneck. Proceedings of the International Conference on Learning Representations (ICLR 2018), Vancouver, QC, Canada."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wieczorek, A., and Roth, V. (2019). On the Difference Between the Information Bottleneck and the Deep Information Bottleneck. arXiv.","DOI":"10.3390\/e22020131"},{"key":"ref_37","unstructured":"Tran, D., and Blei, D.M. (2017). Implicit causal models for genome-wide association studies. arXiv."},{"key":"ref_38","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-encoding variational bayes. arXiv."},{"key":"ref_39","unstructured":"Rezende, D.J., Mohamed, S., and Wierstra, D. (2014). Stochastic Backpropagation and Approximate Inference in Deep Generative Models. arXiv."},{"key":"ref_40","unstructured":"Kingma, D.P., Mohamed, S., Rezende, D.J., and Welling, M. (2014, January 8\u201313). Semi-supervised Learning with Deep Generative Models. Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_41","unstructured":"Jang, E., Gu, S., and Poole, B. (2017). Categorical Reparameterization with Gumbel-Softmax. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Kaltenpoth, D., and Vreeken, J. (2019, January 2\u20134). We Are Not Your Real Parents: Telling Causal from Confounded using MDL. Proceedings of the 2019 SIAM International Conference on Data Mining, Calgary, AB, Canada.","DOI":"10.1137\/1.9781611975673.23"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Tishby, N., and Zaslavsky, N. (2015). Deep Learning and the Information Bottleneck Principle. CoRR.","DOI":"10.1109\/ITW.2015.7133169"},{"key":"ref_44","unstructured":"Parbhoo, S. (2019). Causal Inference and Interpretable Machine Learning for Personalised Medicine. [Ph.D. Thesis, University of Basel]."},{"key":"ref_45","unstructured":"McCormick, M.C., Brooks-Gunn, J., and Buka, S.L. (2013). Infant Health and Development Program, Phase IV, 2001\u20132004 [United States], Columbia University."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1198\/jcgs.2010.08162","article-title":"Bayesian Nonparametric Modeling for Causal Inference","volume":"20","author":"Hill","year":"2011","journal-title":"J. Comput. Graphical Stat."},{"key":"ref_47","unstructured":"Shalit, U., Johansson, F.D., and Sontag, D. (2017, January 6\u201311). Estimating individual treatment effect: generalization bounds and algorithms. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1214\/09-AOAS285","article-title":"BART: Bayesian additive regression trees","volume":"4","author":"Chipman","year":"2010","journal-title":"Ann. Appl. Stat."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1214\/18-STS667","article-title":"Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition","volume":"34","author":"Dorie","year":"2019","journal-title":"Stat. Sci."},{"key":"ref_50","first-page":"1031","article-title":"The costs of low birth weight","volume":"120","author":"Almond","year":"2005","journal-title":"Q. J. Econ."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Medam, S., Zieleskiewicz, L., Duclos, G., Baumstarck, K., Loundou, A., Alingrin, J., Hammad, E., Vigne, C., Antonini, F., and Leone, M. (2017). Risk factors for death in septic shock. Medicine, 96.","DOI":"10.1097\/MD.0000000000009241"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/j.ajem.2010.09.015","article-title":"The impact of emergency medical services on the ED care of severe sepsis","volume":"30","author":"Studnek","year":"2012","journal-title":"Am. J. Emergency Med."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/4\/389\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:13:01Z","timestamp":1760173981000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/4\/389"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,29]]},"references-count":53,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["e22040389"],"URL":"https:\/\/doi.org\/10.3390\/e22040389","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,29]]}}}