{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T10:07:02Z","timestamp":1775815622080,"version":"3.50.1"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T00:00:00Z","timestamp":1748390400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T00:00:00Z","timestamp":1748390400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2023YFF1000100"],"award-info":[{"award-number":["2023YFF1000100"]}]},{"name":"National Key Research and Development Program of China","award":["2023YFF1000100"],"award-info":[{"award-number":["2023YFF1000100"]}]},{"name":"National Key Research and Development Program of China","award":["2023YFF1000100"],"award-info":[{"award-number":["2023YFF1000100"]}]},{"name":"Fundamental Research Funds for the Chinese Central Universities","award":["2662023XXPY004"],"award-info":[{"award-number":["2662023XXPY004"]}]},{"name":"Fundamental Research Funds for the Chinese Central Universities","award":["2662023XXPY004"],"award-info":[{"award-number":["2662023XXPY004"]}]},{"name":"Fundamental Research Funds for the Chinese Central Universities","award":["2662023XXPY004"],"award-info":[{"award-number":["2662023XXPY004"]}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP230101122"],"award-info":[{"award-number":["DP230101122"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research 
Council","doi-asserted-by":"crossref","award":["DP230101122"],"award-info":[{"award-number":["DP230101122"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP230101122"],"award-info":[{"award-number":["DP230101122"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001787","name":"University of South Australia","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001787","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Min Knowl Disc"],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Within the realm of causal inference, a pivotal task involves causal effect estimation from observational data when there exist confounding variables. The K-Nearest Neighbour Matching (K-NNM) method is widely applied to handle confounding bias, but its general application sets a uniform K value for all samples, which can lead to suboptimal results in practice. To overcome this limitation, this paper introduces a novel method for causal effect estimation called Dynamic K-Nearest Neighbour Matching (DK-NNM). The DK-NNM method employs a data-driven learning strategy to determine the optimal value of K for each sample. In practice, DK-NNM reconstructs a sparse coefficient matrix for all samples using sparse learning, while simultaneously learning a graph matrix to preserve local information and sample similarity. This approach helps identify the most suitable K-value for each sample. Additionally, DK-NNM utilizes joint propensity and prognostic scores to effectively mitigate confounding bias arising from high-dimensional covariates during the K-NNM process. 
Experiments performed on various synthetic, semi-synthetic, and real-world datasets conclusively demonstrate that DK-NNM surpasses baseline models in estimating causal effects from observational data and provides significant improvements over traditional methods.<\/jats:p>","DOI":"10.1007\/s10618-025-01107-5","type":"journal-article","created":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T13:13:35Z","timestamp":1748438015000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Data-driven learning optimal K values for K-nearest neighbour matching in causal inference"],"prefix":"10.1007","volume":"39","author":[{"given":"Yinghao","family":"Zhang","sequence":"first","affiliation":[]},{"given":"Tingting","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Debo","family":"Cheng","sequence":"additional","affiliation":[]},{"given":"Jiuyong","family":"Li","sequence":"additional","affiliation":[]},{"given":"Lin","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Ziqi","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Zaiwen","family":"Feng","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,5,28]]},"reference":[{"issue":"30","key":"1107_CR1","doi-asserted-by":"publisher","first-page":"4821","DOI":"10.1002\/sim.8754","volume":"39","author":"RC Aikens","year":"2020","unstructured":"Aikens RC, Greaves D, Baiocchi M (2020) A pilot design for observational studies: using abundant data thoughtfully. Stat Med 39(30):4821\u20134840","journal-title":"Stat Med"},{"issue":"3","key":"1107_CR2","first-page":"1031","volume":"120","author":"D Almond","year":"2005","unstructured":"Almond D, Chay KY, Lee DS (2005) The costs of low birth weight. 
Q J Econ 120(3):1031\u20131083","journal-title":"Q J Econ"},{"issue":"2","key":"1107_CR3","doi-asserted-by":"publisher","first-page":"1148","DOI":"10.1214\/18-AOS1709","volume":"47","author":"S Athey","year":"2019","unstructured":"Athey S, Tibshirani J, Wager S (2019) Generalized random forests. Ann Stat 47(2):1148\u20131178","journal-title":"Ann Stat"},{"issue":"3","key":"1107_CR4","doi-asserted-by":"publisher","first-page":"1174","DOI":"10.1007\/s10618-022-00832-5","volume":"36","author":"D Cheng","year":"2022","unstructured":"Cheng D, Li J et al (2022) Sufficient dimension reduction for average causal effect estimation. Data Min Knowl Disc 36(3):1174\u20131196","journal-title":"Data Min Knowl Disc"},{"issue":"5","key":"1107_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3636423","volume":"56","author":"D Cheng","year":"2024","unstructured":"Cheng D, Li J, Liu L, Liu J, Le TD (2024) Data-driven causal effect estimation based on graphical causal modelling: A survey. ACM Comput Surv 56(5):1\u201337","journal-title":"ACM Comput Surv"},{"issue":"11","key":"1107_CR6","doi-asserted-by":"publisher","first-page":"889","DOI":"10.1001\/jama.1996.03540110043030","volume":"276","author":"AF Connors","year":"1996","unstructured":"Connors AF, Speroff T, Dawson NV, Thomas C, Harrell FE, Wagner D, Desbiens N, Goldman L, Wu AW et al (1996) The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA 276(11):889\u2013897","journal-title":"JAMA"},{"key":"1107_CR7","volume-title":"Local polynomial modelling and its applications","author":"KA Copeland","year":"1997","unstructured":"Copeland KA (1997) Local polynomial modelling and its applications. 
Taylor & Francis, New York"},{"key":"1107_CR8","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1016\/j.socscimed.2017.12.005","volume":"210","author":"A Deaton","year":"2018","unstructured":"Deaton A, Cartwright N (2018) Understanding and misunderstanding randomized controlled trials. Soc Sci Med 210:2\u201321","journal-title":"Soc Sci Med"},{"issue":"3","key":"1107_CR9","doi-asserted-by":"publisher","first-page":"932","DOI":"10.1162\/REST_a_00318","volume":"95","author":"A Diamond","year":"2013","unstructured":"Diamond A, Sekhon JS (2013) Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. Rev Econ Stat 95(3):932\u2013945","journal-title":"Rev Econ Stat"},{"issue":"2","key":"1107_CR10","first-page":"821","volume":"31","author":"T Ghosh","year":"2021","unstructured":"Ghosh T, Ma Y, De Luna X (2021) Sufficient dimension reduction for feasible and robust estimation of average causal effect. Stat Sin 31(2):821","journal-title":"Stat Sin"},{"issue":"4","key":"1107_CR11","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1080\/10618600.1993.10474623","volume":"2","author":"XS Gu","year":"1993","unstructured":"Gu XS, Rosenbaum PR (1993) Comparison of multivariate matching methods: structures, distances, and algorithms. J Comput Graph Stat 2(4):405\u2013420","journal-title":"J Comput Graph Stat"},{"issue":"3","key":"1107_CR12","doi-asserted-by":"publisher","first-page":"965","DOI":"10.1214\/19-BA1195","volume":"15","author":"PR Hahn","year":"2020","unstructured":"Hahn PR, Murray JS, Carvalho CM (2020) Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects (with discussion). Bayesian Anal 15(3):965\u20131056","journal-title":"Bayesian Anal"},{"key":"1107_CR13","unstructured":"He X, Niyogi P (2003) Locality preserving projections. 
Adv Neural Inf Process Syst 16"},{"issue":"1","key":"1107_CR14","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1198\/jcgs.2010.08162","volume":"20","author":"JL Hill","year":"2011","unstructured":"Hill JL (2011) Bayesian nonparametric modeling for causal inference. J Comput Graph Stat 20(1):217\u2013240","journal-title":"J Comput Graph Stat"},{"issue":"2","key":"1107_CR15","first-page":"72","volume":"1985","author":"PW Holland","year":"1985","unstructured":"Holland PW, Glymour C, Granger C (1985) Statistics and causal inference*. ETS Res Rep Ser 1985(2):72","journal-title":"ETS Res Rep Ser"},{"key":"1107_CR16","doi-asserted-by":"crossref","unstructured":"Imai K, Ratkovic MT (2014) Covariate balancing propensity score. J R Stat Soc: Ser B (Stat Methodol) 76","DOI":"10.1111\/rssb.12027"},{"issue":"1","key":"1107_CR17","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1162\/003465304323023651","volume":"86","author":"GW Imbens","year":"2004","unstructured":"Imbens GW (2004) Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat 86(1):4\u201329","journal-title":"Rev Econ Stat"},{"key":"1107_CR18","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139025751","volume-title":"Causal inference for statistics, social, and biomedical sciences: an introduction","author":"GW Imbens","year":"2015","unstructured":"Imbens GW, Rubin DB (2015) Causal inference for statistics, social, and biomedical sciences: an introduction. Cambridge University Press, Cambridge"},{"issue":"3","key":"1107_CR19","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1093\/pan\/mpv007","volume":"23","author":"L Keele","year":"2015","unstructured":"Keele L (2015) The statistics of causal inference: a view from political methodology. Polit Anal 23(3):313\u2013335. 
https:\/\/doi.org\/10.1093\/pan\/mpv007","journal-title":"Polit Anal"},{"key":"1107_CR20","unstructured":"LaLonde RJ (1986) Evaluating the econometric evaluations of training programs with experimental data. Am Econ Rev pp 604\u2013620"},{"issue":"20","key":"1107_CR21","doi-asserted-by":"publisher","first-page":"3488","DOI":"10.1002\/sim.6030","volume":"33","author":"FP Leacy","year":"2014","unstructured":"Leacy FP, Stuart EA (2014) On the joint use of propensity and prognostic scores in estimation of the average treatment effect on the treated: a simulation study. Stat Med 33(20):3488\u20133508","journal-title":"Stat Med"},{"key":"1107_CR22","doi-asserted-by":"publisher","DOI":"10.1007\/b98858","volume-title":"Local regression and likelihood","author":"C Loader","year":"1999","unstructured":"Loader C (1999) Local regression and likelihood. Springer, New York"},{"key":"1107_CR23","unstructured":"Luna X, Johansson P, Sj\u00f6stedt-de\u00a0Luna S (2010) Bootstrap inference for k-nearest neighbour matching estimators"},{"key":"1107_CR24","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-8853-9","volume-title":"Introductory lectures on convex optimization: a basic course","author":"Y Nesterov","year":"2004","unstructured":"Nesterov Y (2004) Introductory lectures on convex optimization: a basic course, vol 87. Springer, New York"},{"issue":"2","key":"1107_CR25","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1093\/biomet\/asaa076","volume":"108","author":"X Nie","year":"2021","unstructured":"Nie X, Wager S (2021) Quasi-oracle estimation of heterogeneous treatment effects. Biometrika 108(2):299\u2013319","journal-title":"Biometrika"},{"key":"1107_CR26","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1093\/biomet\/82.4.669","volume":"82","author":"J Pearl","year":"1995","unstructured":"Pearl J (1995) Causal diagrams for empirical research. 
Biometrika 82:669\u2013688","journal-title":"Biometrika"},{"issue":"1","key":"1107_CR27","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1080\/10618600.2016.1152971","volume":"26","author":"PR Rosenbaum","year":"2017","unstructured":"Rosenbaum PR (2017) Imposing minimax and quantile constraints on optimal matching in observational studies. J Comput Graph Stat 26(1):66\u201378","journal-title":"J Comput Graph Stat"},{"issue":"1","key":"1107_CR28","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1093\/biomet\/70.1.41","volume":"70","author":"PR Rosenbaum","year":"1983","unstructured":"Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41\u201355","journal-title":"Biometrika"},{"key":"1107_CR29","doi-asserted-by":"crossref","unstructured":"Rubin DB (1973) Matching to remove bias in observational studies. Biometrics, 159\u2013183","DOI":"10.2307\/2529684"},{"key":"1107_CR30","doi-asserted-by":"publisher","first-page":"688","DOI":"10.1037\/h0037350","volume":"66","author":"DB Rubin","year":"1974","unstructured":"Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688\u2013701","journal-title":"J Educ Psychol"},{"issue":"450","key":"1107_CR31","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1080\/01621459.2000.10474233","volume":"95","author":"DB Rubin","year":"2000","unstructured":"Rubin DB, Thomas N (2000) Combining propensity score matching with additional adjustments for prognostic covariates. J Am Stat Assoc 95(450):573\u2013585","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"1107_CR32","first-page":"1","volume":"25","author":"EA Stuart","year":"2010","unstructured":"Stuart EA (2010) Matching methods for causal inference: a review and a look forward. 
Stat Sci A Rev J Inst Math Stat 25(1):1\u201321","journal-title":"Stat Sci A Rev J Inst Math Stat"},{"issue":"1","key":"1107_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1214\/09-STS313","volume":"25","author":"EA Stuart","year":"2010","unstructured":"Stuart EA (2010) Matching methods for causal inference: A review and a look forward. Statistical science: a review journal of the Institute of Mathematical Statistics 25(1):1\u201321","journal-title":"Statistical science: a review journal of the Institute of Mathematical Statistics"},{"issue":"523","key":"1107_CR34","doi-asserted-by":"publisher","first-page":"1228","DOI":"10.1080\/01621459.2017.1319839","volume":"113","author":"S Wager","year":"2018","unstructured":"Wager S, Athey S (2018) Estimation and inference of heterogeneous treatment effects using random forests. J Am Stat Assoc 113(523):1228\u20131242","journal-title":"J Am Stat Assoc"},{"issue":"6","key":"1107_CR35","doi-asserted-by":"publisher","first-page":"1031","DOI":"10.1109\/JPROC.2010.2044470","volume":"98","author":"J Wright","year":"2010","unstructured":"Wright J, Ma Y, Mairal J, Sapiro G, Huang TS, Yan S (2010) Sparse representation for computer vision and pattern recognition. Proc IEEE 98(6):1031\u20131044","journal-title":"Proc IEEE"},{"key":"1107_CR36","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1007\/s11280-018-0539-4","volume":"22","author":"W Wu","year":"2019","unstructured":"Wu W, Parampalli U et al (2019) Privacy preserving k-nearest neighbor classification over encrypted database in outsourced cloud environments. World Wide Web 22:101\u2013123","journal-title":"World Wide Web"},{"key":"1107_CR37","doi-asserted-by":"crossref","unstructured":"Xu Z, Liu J, Cheng D, Li J, Liu L, Wang K (2023) Disentangled representation with causal constraints for counterfactual fairness. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp. 471\u2013482. 
Springer","DOI":"10.1007\/978-3-031-33374-3_37"},{"key":"1107_CR38","doi-asserted-by":"crossref","unstructured":"Xu T, Zhang Y, Li J, Liu L, Xu Z, Cheng D, Feng Z (2023) A data-driven approach to finding k for k nearest neighbor matching in average causal effect estimation. In: International Conference on Web Information Systems Engineering (WISE), pp 723\u2013732. Springer","DOI":"10.1007\/978-981-99-7254-8_56"},{"key":"1107_CR39","unstructured":"Ye SS, Chen Y, Padilla OHM (2021) 2d score based estimation of heterogeneous treatment effects. arXiv preprint arXiv:2110.02401"},{"key":"1107_CR40","doi-asserted-by":"publisher","first-page":"1545","DOI":"10.1007\/s11280-017-0502-9","volume":"21","author":"S Zhang","year":"2018","unstructured":"Zhang S, Cheng D et al (2018) Supervised feature selection algorithm via discriminative ridge regression. World Wide Web 21:1545\u20131562","journal-title":"World Wide Web"},{"issue":"3","key":"1107_CR41","first-page":"1","volume":"8","author":"S Zhang","year":"2017","unstructured":"Zhang S, Li X et al (2017) Learning k for knn classification. ACM Transactions on Intelligent Systems and Technology (TIST) 8(3):1\u201319","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"1107_CR42","unstructured":"Zhou D, Bousquet O, Lal T, Weston J, Sch\u00f6lkopf B (2003) Learning with local and global consistency. Adv Neural Inf Process Syst 16"},{"issue":"6","key":"1107_CR43","doi-asserted-by":"publisher","first-page":"1263","DOI":"10.1109\/TNNLS.2016.2521602","volume":"28","author":"X Zhu","year":"2016","unstructured":"Zhu X, Li X et al (2016) Robust joint graph sparse coding for unsupervised spectral feature selection. 
IEEE Trans Neural Netw Learn Syst 28(6):1263\u20131275","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"1107_CR44","doi-asserted-by":"crossref","unstructured":"Zhu X, Suk H-I, Shen D (2014) Matrix-similarity based loss function and feature selection for Alzheimer\u2019s disease diagnosis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3089\u20133096","DOI":"10.1109\/CVPR.2014.395"}],"container-title":["Data Mining and Knowledge Discovery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-025-01107-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10618-025-01107-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-025-01107-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T16:08:24Z","timestamp":1757174904000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10618-025-01107-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,28]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["1107"],"URL":"https:\/\/doi.org\/10.1007\/s10618-025-01107-5","relation":{},"ISSN":["1384-5810","1573-756X"],"issn-type":[{"value":"1384-5810","type":"print"},{"value":"1573-756X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,28]]},"assertion":[{"value":"29 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"28 May 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}}],"article-number":"35"}}