{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T15:28:39Z","timestamp":1771342119319,"version":"3.50.1"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2022,2,18]],"date-time":"2022-02-18T00:00:00Z","timestamp":1645142400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,18]],"date-time":"2022-02-18T00:00:00Z","timestamp":1645142400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Curtin University"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["The VLDB Journal"],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Causal inference is capable of estimating the treatment effect (i.e., the causal effect of <jats:italic>treatment<\/jats:italic> on the <jats:italic>outcome<\/jats:italic>) to benefit decision making in various domains. One fundamental challenge in this research is the treatment assignment bias in observational data. To increase the validity of observational studies on causal inference, representation-based methods, the current state of the art, have demonstrated superior performance in treatment effect estimation. Most representation-based methods assume all observed covariates are pre-treatment (i.e., not affected by the treatment) and learn a balanced representation from these observed covariates for estimating treatment effect. Unfortunately, this assumption is often too strict a requirement in practice, as some covariates are changed by doing an intervention on treatment (i.e., post-treatment). By contrast, the balanced representation learned from unchanged covariates thus biases the treatment effect estimation. 
In light of this, we propose a deep treatment-adaptive architecture (DTANet) that can address the post-treatment covariates and provide an unbiased treatment effect estimation. Generally speaking, the contributions of this work are threefold. First, our theoretical results guarantee DTANet can identify treatment effect from observations. Second, we introduce a novel regularization of orthogonality projection to ensure that the learned confounding representation is invariant and not contaminated by the treatment, while the mediate variable representation is informative and discriminative for predicting the outcome. Finally, we build on the optimal transport and learn a treatment-invariant representation for the unobserved confounders to alleviate the confounding bias.<\/jats:p>","DOI":"10.1007\/s00778-021-00724-y","type":"journal-article","created":{"date-parts":[[2022,2,19]],"date-time":"2022-02-19T05:05:21Z","timestamp":1645247121000},"page":"1127-1142","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Deep treatment-adaptive network for causal inference"],"prefix":"10.1007","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9900-6178","authenticated-orcid":false,"given":"Qian","family":"Li","sequence":"first","affiliation":[]},{"given":"Zhichao","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Shaowu","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Gang","family":"Li","sequence":"additional","affiliation":[]},{"given":"Guandong","family":"Xu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,2,18]]},"reference":[{"key":"724_CR1","unstructured":"Alaa, A.M., van\u00a0der Schaar, M.: Bayesian inference of individualized treatment effects using multi-task gaussian processes. In: Advances in Neural Information Processing Systems, pp. 
3424\u20133432 (2017)"},{"issue":"3","key":"724_CR2","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1080\/00273171.2011.568786","volume":"46","author":"PC Austin","year":"2011","unstructured":"Austin, P.C.: An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar. Behav. Res. 46(3), 399\u2013424 (2011)","journal-title":"Multivar. Behav. Res."},{"issue":"4","key":"724_CR3","doi-asserted-by":"publisher","first-page":"962","DOI":"10.1111\/j.1541-0420.2005.00377.x","volume":"61","author":"H Bang","year":"2005","unstructured":"Bang, H., Robins, J.M.: Doubly robust estimation in missing data and causal inference models. Biometrics 61(4), 962\u2013973 (2005)","journal-title":"Biometrics"},{"issue":"2","key":"724_CR4","doi-asserted-by":"publisher","first-page":"A1111","DOI":"10.1137\/141000439","volume":"37","author":"JD Benamou","year":"2015","unstructured":"Benamou, J.D., Carlier, G., Cuturi, M., Nenna, L., Peyr\u00e9, G.: Iterative bregman projections for regularized transportation problems. SIAM J. Sci. Comput. 37(2), A1111\u2013A1138 (2015)","journal-title":"SIAM J. Sci. Comput."},{"issue":"1","key":"724_CR5","first-page":"3207","volume":"14","author":"L Bottou","year":"2013","unstructured":"Bottou, L., Peters, J., Qui\u00f1onero-Candela, J., Charles, D.X., Chickering, D.M., Portugaly, E., Ray, D., Simard, P., Snelson, E.: Counterfactual reasoning and learning systems: the example of computational advertising. J. Mach. Learn. Res. 14(1), 3207\u20133260 (2013)","journal-title":"J. Mach. Learn. Res."},{"key":"724_CR6","unstructured":"Colnet, B., Mayer, I., Chen, G., Dieng, A., Li, R., Varoquaux, G., Vert, J.P., Josse, J., Yang, S.: Causal inference methods for combining randomized trials and observational studies: a review. 
arXiv preprint arXiv:2011.08047 (2020)"},{"issue":"25","key":"724_CR7","doi-asserted-by":"publisher","first-page":"1887","DOI":"10.1056\/NEJM200006223422507","volume":"342","author":"J Concato","year":"2000","unstructured":"Concato, J., Shah, N., Horwitz, R.I.: Randomized, controlled trials, observational studies, and the hierarchy of research designs. N. Engl. J. Med. 342(25), 1887\u20131892 (2000)","journal-title":"N. Engl. J. Med."},{"key":"724_CR8","doi-asserted-by":"crossref","unstructured":"Courty, N., Flamary, R., Tuia, D.: Domain adaptation with regularized optimal transport. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 274\u2013289. Springer (2014)","DOI":"10.1007\/978-3-662-44848-9_18"},{"key":"724_CR9","unstructured":"Cuturi, M., Doucet, A.: Fast computation of wasserstein barycenters. In International Conference on Machine Learning, pp. 685\u2013693 (2014)"},{"key":"724_CR10","unstructured":"Davies, G.E., Soundy, T.J.: The genetics of smoking and nicotine addiction. South Dakota Med. (2009)"},{"issue":"3","key":"724_CR11","doi-asserted-by":"publisher","first-page":"932","DOI":"10.1162\/REST_a_00318","volume":"95","author":"A Diamond","year":"2013","unstructured":"Diamond, A., Sekhon, J.S.: Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev. Econ. Stat. 95(3), 932\u2013945 (2013)","journal-title":"Rev. Econ. Stat."},{"key":"724_CR12","unstructured":"Dud\u00edk, M., Langford, J., Li, L.: Doubly robust policy evaluation and learning. arXiv preprint arXiv:1103.4601 (2011)"},{"key":"724_CR13","unstructured":"Dung\u00a0Duong, T., Li, Q., Xu, G.: Stochastic intervention for causal inference via reinforcement learning. arXiv e-prints pp. 
arXiv\u20132105 (2021)"},{"issue":"1","key":"724_CR14","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1007\/s10107-017-1172-1","volume":"171","author":"PM Esfahani","year":"2018","unstructured":"Esfahani, P.M., Kuhn, D.: Data-driven distributionally robust optimization using the wasserstein metric: performance guarantees and tractable reformulations. Math. Prog. 171(1), 115\u2013166 (2018)","journal-title":"Math. Prog."},{"key":"724_CR15","unstructured":"Goldberger, A.S. et al.: Econometric theory. (1964)"},{"issue":"1","key":"724_CR16","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1198\/jcgs.2010.08162","volume":"20","author":"JL Hill","year":"2011","unstructured":"Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Graph. Stat. 20(1), 217\u2013240 (2011)","journal-title":"J. Comput. Graph. Stat."},{"issue":"3","key":"724_CR17","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1093\/biomet\/87.3.706","volume":"87","author":"GW Imbens","year":"2000","unstructured":"Imbens, G.W.: The role of the propensity score in estimating dose-response functions. Biometrika 87(3), 706\u2013710 (2000)","journal-title":"Biometrika"},{"key":"724_CR18","unstructured":"Johansson, F., Shalit, U., Sontag, D.: Learning representations for counterfactual inference. In International Conference on Machine Learning, pp. 3020\u20133029 (2016)"},{"key":"724_CR19","unstructured":"Johansson, F.D., Kallus, N., Shalit, U., Sontag, D.: Learning weighted representations for generalization across designs. arXiv preprint arXiv:1802.08598 (2018)"},{"key":"724_CR20","unstructured":"Kallus, N., Mao, X., Udell, M.: Causal inference with noisy and missing covariates via matrix factorization. arXiv preprint arXiv:1806.00811 (2018)"},{"key":"724_CR21","unstructured":"Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014)"},{"issue":"2","key":"724_CR22","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1093\/biomet\/ast066","volume":"101","author":"M Kuroki","year":"2014","unstructured":"Kuroki, M., Pearl, J.: Measurement bias and effect restoration in causal inference. Biometrika 101(2), 423\u2013437 (2014)","journal-title":"Biometrika"},{"key":"724_CR23","doi-asserted-by":"crossref","unstructured":"Lechner, M.: Identification and estimation of causal effects of multiple treatments under the conditional independence assumption. In Econometric Evaluation of Labour Market Policies, pp. 43\u201358. Springer (2001)","DOI":"10.1007\/978-3-642-57615-7_3"},{"key":"724_CR24","doi-asserted-by":"crossref","unstructured":"Li, Q., Duong, T.D., Wang, Z., Liu, S., Wang, D., Xu, G.: Causal-aware generative imputation for automated underwriting. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management, pp. 3916\u20133924 (2021)","DOI":"10.1145\/3459637.3481900"},{"key":"724_CR25","doi-asserted-by":"crossref","unstructured":"Li, Q., Niu, W., Li, G., Cao, Y., Tan, J., Guo, L.: Lingo: linearized grassmannian optimization for nuclear norm minimization. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 801\u2013809 (2015)","DOI":"10.1145\/2806416.2806532"},{"key":"724_CR26","doi-asserted-by":"crossref","unstructured":"Li, Q., Wang, X., Xu, G.: Be causal: De-biasing social network confounding in recommendation. arXiv preprint arXiv:2105.07775 (2021)","DOI":"10.1145\/3533725"},{"key":"724_CR27","doi-asserted-by":"crossref","unstructured":"Li, Q., Wang, Z., Li, G., Pang, J., Xu, G.: Hilbert sinkhorn divergence for optimal transport. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 
3835\u20133844 (2021)","DOI":"10.1109\/CVPR46437.2021.00383"},{"key":"724_CR28","doi-asserted-by":"crossref","unstructured":"Li, Q., Wang, Z., Liu, S., Li, G., Xu, G.: Causal optimal transport for treatment effect estimation. IEEE Trans. Neural Netw. Learn. Syst.(2021)","DOI":"10.1109\/TNNLS.2021.3118542"},{"key":"724_CR29","unstructured":"Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., Welling, M.: Causal effect inference with deep latent-variable models. In Advances in Neural Information Processing Systems, pp. 6446\u20136456 (2017)"},{"issue":"4","key":"724_CR30","first-page":"507","volume":"7","author":"A Nichols","year":"2007","unstructured":"Nichols, A.: Causal inference with observational data. Stand. Genomic Sci. 7(4), 507\u2013541 (2007)","journal-title":"Stand. Genomic Sci."},{"key":"724_CR31","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1214\/09-SS057","volume":"3","author":"J Pearl","year":"2009","unstructured":"Pearl, J.: Causal inference in statistics: an overview. Stat. Surv. 3, 96\u2013146 (2009)","journal-title":"Stat. Surv."},{"key":"724_CR32","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511803161","volume-title":"Causality","author":"J Pearl","year":"2009","unstructured":"Pearl, J.: Causality. Cambridge University Press, Cambridge (2009)"},{"key":"724_CR33","volume-title":"Causal Inference in Statistics: A Primer","author":"J Pearl","year":"2016","unstructured":"Pearl, J., Glymour, M., Jewell, N.P.: Causal Inference in Statistics: A Primer. John Wiley and Sons, New Jersey (2016)"},{"issue":"5\u20136","key":"724_CR34","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1561\/2200000073","volume":"11","author":"G Peyr\u00e9","year":"2019","unstructured":"Peyr\u00e9, G., Cuturi, M., et al.: Computational optimal transport. Found. Trends Mach. Learn. 11(5\u20136), 355\u2013607 (2019)","journal-title":"Found. Trends Mach. 
Learn."},{"key":"724_CR35","doi-asserted-by":"crossref","unstructured":"Rosenbaum, P.R.: Observational study. Encyclopedia of statistics in behavioral science (2005)","DOI":"10.1002\/0470013192.bsa454"},{"issue":"1","key":"724_CR36","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1093\/biomet\/70.1.41","volume":"70","author":"PR Rosenbaum","year":"1983","unstructured":"Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41\u201355 (1983)","journal-title":"Biometrika"},{"key":"724_CR37","doi-asserted-by":"crossref","unstructured":"Rubin, D.B.: Matching to remove bias in observational studies. Biometrics pp. 159\u2013183,(1973)","DOI":"10.2307\/2529684"},{"key":"724_CR38","unstructured":"Schnabel, T., Swaminathan, A., Singh, A., Chandak, N., Joachims, T.: Recommendations as treatments: Debiasing learning and evaluation. arXiv preprint arXiv:1602.05352 (2016)"},{"key":"724_CR39","unstructured":"Schwab, P., Linhardt, L., Karlen, W.: Perfect match: A simple method for learning representations for counterfactual inference with neural networks. arXiv preprint arXiv:1810.00656 (2018)"},{"key":"724_CR40","unstructured":"Shalit, U., Johansson, F.D., Sontag, D.: Estimating individual treatment effect: generalization bounds and algorithms. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 3076\u20133085. JMLR. org (2017)"},{"key":"724_CR41","doi-asserted-by":"crossref","unstructured":"Sun, W., Wang, P., Yin, D., Yang, J., Chang, Y.: Causal inference via sparse additive models with application to online advertising. In Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)","DOI":"10.1609\/aaai.v29i1.9156"},{"key":"724_CR42","volume-title":"Optimal transport: old and new","author":"C Villani","year":"2008","unstructured":"Villani, C.: Optimal transport: old and new, vol. 338. 
Springer Science and Business Media, Berlin (2008)"},{"issue":"5","key":"724_CR43","doi-asserted-by":"publisher","first-page":"867","DOI":"10.1037\/0022-006X.65.5.867","volume":"65","author":"AD Vinokur","year":"1997","unstructured":"Vinokur, A.D., Schul, Y.: Mastery and inoculation against setbacks as active ingredients in the jobs intervention for the unemployed. J. Consult. Clin. Psychol. 65(5), 867 (1997)","journal-title":"J. Consult. Clin. Psychol."},{"issue":"523","key":"724_CR44","doi-asserted-by":"publisher","first-page":"1228","DOI":"10.1080\/01621459.2017.1319839","volume":"113","author":"S Wager","year":"2018","unstructured":"Wager, S., Athey, S.: Estimation and inference of heterogeneous treatment effects using random forests. J. Am. Stat. Assoc. 113(523), 1228\u20131242 (2018)","journal-title":"J. Am. Stat. Assoc."},{"key":"724_CR45","doi-asserted-by":"crossref","unstructured":"Wang, Z., Li, Q., Li, G., Xu, G.: Polynomial representation for persistence diagram. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 6123\u20136132 (2019)","DOI":"10.1109\/CVPR.2019.00628"},{"key":"724_CR46","unstructured":"Xu, G., Duong, T.D., Li, Q., Liu, S., Wang, X.: Causality learning: A new perspective for interpretable machine learning. 
arXiv preprint arXiv:2006.16789 (2020)"}],"container-title":["The VLDB Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00778-021-00724-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00778-021-00724-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00778-021-00724-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,18]],"date-time":"2024-09-18T22:10:43Z","timestamp":1726697443000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00778-021-00724-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,18]]},"references-count":46,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["724"],"URL":"https:\/\/doi.org\/10.1007\/s00778-021-00724-y","relation":{},"ISSN":["1066-8888","0949-877X"],"issn-type":[{"value":"1066-8888","type":"print"},{"value":"0949-877X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,18]]},"assertion":[{"value":"20 February 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 September 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 December 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 February 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}