{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T00:55:52Z","timestamp":1760576152411,"version":"build-2065373602"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2025,6,13]],"date-time":"2025-06-13T00:00:00Z","timestamp":1749772800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,6,13]],"date-time":"2025-06-13T00:00:00Z","timestamp":1749772800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003130","name":"Fonds Wetenschappelijk Onderzoek","doi-asserted-by":"publisher","award":["1S59522N).","G085920N)","G085920N)"],"award-info":[{"award-number":["1S59522N).","G085920N)","G085920N)"]}],"id":[{"id":"10.13039\/501100003130","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Onderzoeksprogramma Artifici\u00eble Intelligentie (AI) Vlaanderen"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int. J. Mach. Learn. & Cyber."],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Treatment effect analysis investigates the effect of a treatment or intervention. The variables that determine the treatment effect are called predictive variables, while prognostic variables determine the outcome regardless of treatment, based on existing conditions or characteristics. The identification of these predictive factors facilitates understanding the treatment effect and even allows for improving its success. However, in many cases, the predictive factors of a treatment or intervention are unknown. 
Furthermore, existing methods to find these predictive factors are limited and focus only on quantifying the predictive performance of a CATE estimator instead of discerning predictive from prognostic variables. Therefore, to find these predictive variables, we present <jats:italic>Causalteshap<\/jats:italic>. <jats:italic>Causalteshap<\/jats:italic> is a Shapley-based method that leverages multiple statistical tests and treatment effect estimators to discern prognostic from predictive features. The method is benchmarked on multiple fully synthetic datasets and four semi-synthetic datasets. In most of these benchmarks, <jats:italic>Causalteshap<\/jats:italic> achieves precision and recall above 0.9. Subsequently, <jats:italic>Causalteshap<\/jats:italic> is applied to a real-world ICU use case using the AmsterdamUMCdb dataset. We analyzed the effect of Noradrenaline on Atrial Fibrillation in the ICU to display the potential of <jats:italic>Causalteshap<\/jats:italic> as a tool for treatment effect analysis. 
Our results demonstrate that <jats:italic>Causalteshap<\/jats:italic> has the potential of combining treatment effect estimators with Shapley values and statistical tests to provide a novel method for discerning predictive from prognostic features in treatment effect analysis and making understanding treatment effects more accessible.<\/jats:p>","DOI":"10.1007\/s13042-025-02666-1","type":"journal-article","created":{"date-parts":[[2025,6,13]],"date-time":"2025-06-13T07:35:55Z","timestamp":1749800155000},"page":"7487-7507","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Causalteshap: discerning predictive from prognostic features for treatment effect analysis"],"prefix":"10.1007","volume":"16","author":[{"given":"Jarne","family":"Verhaeghe","sequence":"first","affiliation":[]},{"given":"Femke","family":"Ongenae","sequence":"additional","affiliation":[]},{"given":"Sofie Van","family":"Hoecke","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,6,13]]},"reference":[{"key":"2666_CR1","volume-title":"The book of why","author":"J Pearl","year":"2018","unstructured":"Pearl J, Mackenzie D (2018) The book of why. Basic Books, New York"},{"key":"2666_CR2","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1515\/jci-2021-0048\/html","volume":"10","author":"A Forney","year":"2022","unstructured":"Forney A, Mueller S (2022) Causal inference in AI education: A primer. J Causal Inference 10:141\u2013173. https:\/\/doi.org\/10.1515\/jci-2021-0048\/html","journal-title":"J Causal Inference"},{"key":"2666_CR3","doi-asserted-by":"publisher","first-page":"958","DOI":"10.1038\/s41591-024-02902-1","volume":"30","author":"S Feuerriegel","year":"2024","unstructured":"Feuerriegel S et al (2024) Causal machine learning for predicting treatment outcomes. Nat Med 30:958\u2013968. 
https:\/\/doi.org\/10.1038\/s41591-024-02902-1","journal-title":"Nat Med"},{"key":"2666_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2022.104256","volume":"137","author":"Y Ling","year":"2023","unstructured":"Ling Y, Upadhyaya P, Chen L, Jiang X, Kim Y (2023) Emulate randomized clinical trials using heterogeneous treatment effect estimation for personalized treatments: methodology review and benchmark. J Biomed Inf 137:104256","journal-title":"J Biomed Inf"},{"key":"2666_CR5","unstructured":"Crabb\u00e9 J, Curth A, Bica I, van\u00a0der Schaar M (2022) Benchmarking heterogeneous treatment effect models through the lens of interpretability. arXiv:2206.08363"},{"key":"2666_CR6","doi-asserted-by":"publisher","first-page":"322","DOI":"10.1198\/016214504000001880","volume":"100","author":"DB Rubin","year":"2005","unstructured":"Rubin DB (2005) Causal inference using potential outcomes. J Am Stat Assoc 100:322\u2013331. https:\/\/doi.org\/10.1198\/016214504000001880","journal-title":"J Am Stat Assoc"},{"key":"2666_CR7","doi-asserted-by":"publisher","first-page":"4156","DOI":"10.1073\/pnas.1804597116","volume":"116","author":"SR K\u00fcnzel","year":"2019","unstructured":"K\u00fcnzel SR, Sekhon JS, Bickel PJ, Yu B (2019) Metalearners for estimating heterogeneous treatment effects using machine learning. Proc Natl Acad Sci 116:4156\u20134165. https:\/\/doi.org\/10.1073\/pnas.1804597116. (Publisher: Proceedings of the National Academy of Sciences)","journal-title":"Proc Natl Acad Sci"},{"key":"2666_CR8","doi-asserted-by":"crossref","unstructured":"Hermansson E, Svensson D, Gervasi O et\u00a0al (2021) (eds) On discovering treatment-effect modifiers using virtual twins and causal forest ml in the presence of prognostic biomarkers. In: Gervasi O et\u00a0al. 
(eds) Computational science and its applications\u2014ICCSA 2021, Springer International Publishing, Cham, pp 624\u2013640","DOI":"10.1007\/978-3-030-86973-1_44"},{"key":"2666_CR9","unstructured":"Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Guyon I et\u00a0al (eds) Advances in Neural Information Processing Systems, vol 30, Curran Associates, Inc., pp 4765\u20134774. http:\/\/papers.nips.cc\/paper\/7062-a-unified-approach-to-interpreting-model-predictions.pdf"},{"key":"2666_CR10","doi-asserted-by":"publisher","first-page":"18","DOI":"10.3390\/e23010018","volume":"23","author":"P Linardatos","year":"2020","unstructured":"Linardatos P, Papastefanopoulos V, Kotsiantis S (2020) Explainable AI: a review of machine learning interpretability methods. Entropy 23:18","journal-title":"Entropy"},{"key":"2666_CR11","doi-asserted-by":"crossref","unstructured":"Zhang Z, Seibold H, Vettore MV, Song W-J, Fran\u00e7ois V (2018) Subgroup identification in clinical trials: an overview of available methods and their implementations with r. Ann Transl Med 6. https:\/\/atm.amegroups.com\/article\/view\/19049","DOI":"10.21037\/atm.2018.03.07"},{"key":"2666_CR12","doi-asserted-by":"publisher","first-page":"1082","DOI":"10.1080\/10543406.2019.1584204","volume":"29","author":"Y Liu","year":"2019","unstructured":"Liu Y et al (2019) Look before you leap: systematic evaluation of tree-based statistical methods in subgroup identification. J Biopharm Stat 29:1082\u20131102","journal-title":"J Biopharm Stat"},{"key":"2666_CR13","doi-asserted-by":"publisher","first-page":"3658","DOI":"10.1177\/0962280217710570","volume":"27","author":"D Alemayehu","year":"2018","unstructured":"Alemayehu D, Chen Y, Markatou M (2018) A comparative study of subgroup identification methods for differential treatment effect: Performance metrics and recommendations. 
Stat Methods Med Res 27:3658\u20133678","journal-title":"Stat Methods Med Res"},{"key":"2666_CR14","doi-asserted-by":"publisher","first-page":"465","DOI":"10.1093\/biostatistics\/kxh002","volume":"5","author":"M Bonetti","year":"2004","unstructured":"Bonetti M, Gelber RD (2004) Patterns of treatment effects in subsets of patients in clinical trials. Biostatistics (Oxford, England) 5:465\u2013481","journal-title":"Biostatistics (Oxford, England)"},{"key":"2666_CR15","doi-asserted-by":"publisher","first-page":"2601","DOI":"10.1002\/sim.4289","volume":"30","author":"I Lipkovich","year":"2011","unstructured":"Lipkovich I, Dmitrienko A, Denne J, Enas G (2011) Subgroup identification based on differential effect search\u2014a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 30:2601\u20132621","journal-title":"Stat Med"},{"key":"2666_CR16","doi-asserted-by":"publisher","unstructured":"Foster JC, Taylor JM, Ruberg SJ (2011) Subgroup identification from randomized clinical trial data. Stat Med 30. https:\/\/doi.org\/10.1002\/sim.4322","DOI":"10.1002\/sim.4322"},{"key":"2666_CR17","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1093\/biostatistics\/kxq060","volume":"12","author":"T Cai","year":"2011","unstructured":"Cai T, Tian L, Wong PH, Wei LJ (2011) Analysis of randomized comparative clinical trial data for personalized treatment selections. Biostatistics 12:270\u2013282","journal-title":"Biostatistics"},{"key":"2666_CR18","doi-asserted-by":"publisher","first-page":"612","DOI":"10.1080\/10503307.2015.1062934","volume":"26","author":"LL Doove","year":"2016","unstructured":"Doove LL, Van Deun K, Dusseldorp E, Van Mechelen I (2016) QUINT: A tool to detect qualitative treatment-subgroup interactions in randomized controlled trials. 
Psychother Res 26:612\u2013622","journal-title":"Psychother Res"},{"key":"2666_CR19","doi-asserted-by":"publisher","first-page":"1488","DOI":"10.1080\/01621459.2022.2157727","volume":"118","author":"X Guo","year":"2023","unstructured":"Guo X et al (2023) Assessing the most vulnerable subgroup to type II diabetes associated with statin usage: evidence from electronic health record data. J Am Stat Assoc 118:1488\u20131499","journal-title":"J Am Stat Assoc"},{"key":"2666_CR20","doi-asserted-by":"publisher","first-page":"1517","DOI":"10.1080\/01621459.2014.951443","volume":"109","author":"L Tian","year":"2014","unstructured":"Tian L, Alizadeh AA, Gentles AJ, Tibshirani R (2014) A simple method for estimating interactions between a treatment and a large number of covariates. J Am Stat Assoc 109:1517\u20131532","journal-title":"J Am Stat Assoc"},{"key":"2666_CR21","doi-asserted-by":"publisher","first-page":"412","DOI":"10.1093\/biostatistics\/kxaa032","volume":"23","author":"H Park","year":"2022","unstructured":"Park H, Petkova E, Tarpey T, Ogden RT (2022) A sparse additive model for treatment effect-modifier selection. Biostatistics 23:412\u2013429","journal-title":"Biostatistics"},{"key":"2666_CR22","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1093\/biomet\/asr041","volume":"98","author":"X De Luna","year":"2011","unstructured":"De Luna X, Waernbaum I, Richardson TS (2011) Covariate selection for the nonparametric estimation of an average treatment effect. Biometrika 98:861\u2013875","journal-title":"Biometrika"},{"key":"2666_CR23","doi-asserted-by":"publisher","first-page":"8823","DOI":"10.1109\/TKDE.2022.3218131","volume":"35","author":"D Cheng","year":"2023","unstructured":"Cheng D et al (2023) Local search for efficient causal effect estimation. 
IEEE Trans Knowl Data Eng 35:8823\u20138837","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2666_CR24","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1146\/annurev.publhealth.23.100901.140546","volume":"23","author":"T Lumley","year":"2002","unstructured":"Lumley T, Diehr P, Emerson S, Chen L (2002) The importance of the normality assumption in large public health data sets. Annu Rev Public Health 23:151\u2013169. https:\/\/doi.org\/10.1146\/annurev.publhealth.23.100901.140546","journal-title":"Annu Rev Public Health"},{"key":"2666_CR25","doi-asserted-by":"publisher","first-page":"210","DOI":"10.1080\/01621459.1976.10481517","volume":"71","author":"MA Fligner","year":"1976","unstructured":"Fligner MA, Killeen TJ (1976) Distribution-free two-sample tests for scale. J Am Stat Assoc 71:210\u2013213. https:\/\/doi.org\/10.1080\/01621459.1976.10481517","journal-title":"J Am Stat Assoc"},{"key":"2666_CR26","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"2666_CR27","doi-asserted-by":"crossref","unstructured":"Verhaeghe J, Van Der\u00a0Donckt J, Ongenae F, Van\u00a0Hoecke Sm Amini M-R et\u00a0al (2023) (eds) Powershap: a power-full shapley feature selection method. In: Amini M-R et\u00a0al (eds) Machine learning and knowledge discovery in databases, Springer International Publishing, pp 71\u201387","DOI":"10.1007\/978-3-031-26387-3_5"},{"key":"2666_CR28","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijmedinf.2023.105086","volume":"175","author":"J Verhaeghe","year":"2023","unstructured":"Verhaeghe J et al (2023) Generalizable calibrated machine learning models for real-time atrial fibrillation risk prediction in ICU patients. 
Int J Med Inform 175:105086","journal-title":"Int J Med Inform"},{"key":"2666_CR29","doi-asserted-by":"publisher","first-page":"1113","DOI":"10.1038\/ng.2764","volume":"45","author":"JN Weinstein","year":"2013","unstructured":"Weinstein JN et al (2013) The cancer genome atlas pan-cancer analysis project. Nat Genet 45:1113\u20131120","journal-title":"Nat Genet"},{"key":"2666_CR30","doi-asserted-by":"crossref","unstructured":"Almond D, Chay KY, Lee DS (2005) The costs of low birth weight. Q J Econ","DOI":"10.3386\/w10552"},{"key":"2666_CR31","unstructured":"Newman D (2008) Bag of Words. https:\/\/archive.ics.uci.edu\/dataset\/164"},{"key":"2666_CR32","doi-asserted-by":"crossref","unstructured":"Dorie V, Hill J, Shalit U, Scott M, Cervone D (2019) Automated versus do-it-yourself methods for causal inference: lessons learned from a data analysis competition. Stat Sci 34. https:\/\/projecteuclid.org\/journals\/statistical-science\/volume-34\/issue-1\/Automated-versus-Do-It-Yourself-Methods-for-Causal-Inference\/10.1214\/18-STS667.full","DOI":"10.1214\/18-STS667"},{"key":"2666_CR33","doi-asserted-by":"publisher","DOI":"10.1097\/CCM.0000000000004916","volume":"49","author":"PJ Thoral","year":"2021","unstructured":"Thoral PJ et al (2021) Sharing ICU patient data responsibly under the society of critical care medicine\/European society of intensive care medicine joint data science collaboration: the Amsterdam University Medical Centers Database (AmsterdamUMCdb) Example*. Crit Care Med 49:e563","journal-title":"Crit Care Med"},{"key":"2666_CR34","first-page":"0000062","volume":"3","author":"T Yoshida","year":"2015","unstructured":"Yoshida T, Fujii T, Uchino S, Takinami M (2015) Epidemiology, prevention, and treatment of new-onset atrial fibrillation in critically ill: a systematic review. 
J Intens Care 3:0000062","journal-title":"J Intens Care"},{"issue":"790\u2013797","key":"2666_CR35","first-page":"0000096","volume":"45","author":"TJ Moss","year":"2017","unstructured":"Moss TJ et al (2017) New-onset atrial fibrillation in the critically ill. Crit Care Med 45(790\u2013797):0000096","journal-title":"Crit Care Med"},{"key":"2666_CR36","first-page":"179","volume":"14","author":"K Wasmer","year":"2017","unstructured":"Wasmer K, Eckardt L, Breithardt G (2017) Predisposing factors for atrial fibrillation in the elderly. J Geriat Cardiol 14:179\u2013184","journal-title":"J Geriat Cardiol"},{"key":"2666_CR37","doi-asserted-by":"publisher","first-page":"4543","DOI":"10.1007\/s10489-021-02550-9","volume":"52","author":"P Dhal","year":"2021","unstructured":"Dhal P, Azad C (2021) A comprehensive survey on feature selection in the various fields of machine learning. Appl Intell 52:4543\u20134581. https:\/\/doi.org\/10.1007\/s10489-021-02550-9","journal-title":"Appl Intell"},{"key":"2666_CR38","unstructured":"Prokhorenkova L, Gusev G et\u00a0al (2019) CatBoost: unbiased boosting with categorical features. arXiv:1706.09516"},{"key":"2666_CR39","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1093\/biomet\/34.1-2.28","volume":"34","author":"BL Welch","year":"1947","unstructured":"Welch BL (1947) The generalization of \u2018student\u2019s\u2019 problem when several different population varlances are involved. Biometrika 34:28\u201335. https:\/\/doi.org\/10.1093\/biomet\/34.1-2.28","journal-title":"Biometrika"},{"key":"2666_CR40","unstructured":"Knuth DE (1997) The art of computer programming, Ch. 3.3.1, 52. Addison-Wesley, Reading, Mass, 3rd edn"},{"key":"2666_CR41","doi-asserted-by":"publisher","first-page":"4156","DOI":"10.1073\/pnas.1804597116","volume":"116","author":"SR K\u00fcnzel","year":"2019","unstructured":"K\u00fcnzel SR, Sekhon JS, Bickel PJ, Yu B (2019) Metalearners for estimating heterogeneous treatment effects using machine learning. 
Proc Natl Acad Sci 116:4156\u20134165","journal-title":"Proc Natl Acad Sci"},{"key":"2666_CR42","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1093\/biomet\/asaa076","volume":"108","author":"X Nie","year":"2020","unstructured":"Nie X, Wager S (2020) Quasi-oracle estimation of heterogeneous treatment effects. Biometrika 108:299\u2013319. https:\/\/doi.org\/10.1093\/biomet\/asaa076","journal-title":"Biometrika"}],"container-title":["International Journal of Machine Learning and Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-025-02666-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13042-025-02666-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-025-02666-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T16:58:45Z","timestamp":1760547525000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13042-025-02666-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,13]]},"references-count":42,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["2666"],"URL":"https:\/\/doi.org\/10.1007\/s13042-025-02666-1","relation":{},"ISSN":["1868-8071","1868-808X"],"issn-type":[{"type":"print","value":"1868-8071"},{"type":"electronic","value":"1868-808X"}],"subject":[],"published":{"date-parts":[[2025,6,13]]},"assertion":[{"value":"25 January 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 
2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 June 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The AF dataset is based upon the AmsterdamUMCdb, which is an open-source dataset and therefore requires no ethical approval.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}}]}}