{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:35:17Z","timestamp":1760236517070,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T00:00:00Z","timestamp":1638921600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>An essential criterion for the proper implementation of case-control studies is selecting appropriate case and control groups. In this article, a new simulated annealing-based control group selection method is proposed, which solves the problem of selecting individuals in the control group as a distance optimization task. The proposed algorithm pairs the individuals in the n-dimensional feature space by minimizing the weighted distances between them. The weights of the dimensions are based on the odds ratios calculated from the logistic regression model fitted on the variables describing the probability of membership of the treated group. For finding the optimal pairing of the individuals, simulated annealing is utilized. The effectiveness of the newly proposed Weighted Nearest Neighbours Control Group Selection with Simulated Annealing (WNNSA) algorithm is presented by two Monte Carlo studies. 
Results show that the WNNSA method can outperform the widely applied greedy propensity score matching method in feature spaces where only a few covariates characterize individuals and the covariates can only take a few values.<\/jats:p>","DOI":"10.3390\/a14120356","type":"journal-article","created":{"date-parts":[[2021,12,9]],"date-time":"2021-12-09T21:52:32Z","timestamp":1639086752000},"page":"356","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Optimized Weighted Nearest Neighbours Matching Algorithm for Control Group Selection"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3698-9303","authenticated-orcid":false,"given":"Szabolcs","family":"Szek\u00e9r","sequence":"first","affiliation":[{"name":"Department of Computer Science and Systems Technology, Faculty of Information Technology, University of Pannonia, 8200 Veszpr\u00e9m, Hungary"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5524-1675","authenticated-orcid":false,"given":"\u00c1gnes","family":"Vathy-Fogarassy","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Systems Technology, Faculty of Information Technology, University of Pannonia, 8200 Veszpr\u00e9m, Hungary"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,8]]},"reference":[{"key":"ref_1","unstructured":"Babar, Z.U.D. (2019). Case-Control Studies. Encyclopedia of Pharmacy Practice and Clinical Pharmacy, Elsevier."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"105691","DOI":"10.1016\/j.aap.2020.105691","article-title":"Incorporating Bayesian methods into the propensity score matching framework: A no-treatment effect safety analysis","volume":"145","author":"Li","year":"2020","journal-title":"Accid. Anal. 
Prev."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"106123","DOI":"10.1016\/j.cct.2020.106123","article-title":"Practical considerations of utilizing propensity score methods in clinical development using real-world and historical data","volume":"97","author":"Li","year":"2020","journal-title":"Contemp. Clin. Trials"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"106091","DOI":"10.1016\/j.cct.2020.106091","article-title":"Key considerations in the design of real-world studies","volume":"96","author":"Fang","year":"2020","journal-title":"Contemp. Clin. Trials"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.hroo.2020.12.020","article-title":"Comparison of 2-year Outcomes between Primary and Secondary Prophylactic Use of Defibrillators in Patients with Coronary Artery Disease: A Prospective Propensity Score-Matched Analysis from the Nippon Storm Study","volume":"2","author":"Kondo","year":"2021","journal-title":"Heart Rhythm O2"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1093\/biomet\/70.1.41","article-title":"The central role of the propensity score in observational studies for causal effects","volume":"70","author":"Rosenbaum","year":"1983","journal-title":"Biometrika"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Szek\u00e9r, S., and Vathy-Fogarassy, \u00c1. (2020). Weighted nearest neighbours-based control group selection method for observational studies. PLoS ONE, 15.","DOI":"10.1371\/journal.pone.0236531"},{"key":"ref_8","unstructured":"Wright, R.E. (1995). Logistic Regression, American Psychological Association."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1007\/s10618-018-0576-8","article-title":"Linear regression for uplift modeling","volume":"32","author":"Jaroszewicz","year":"2018","journal-title":"Data Min. Knowl. 
Discov."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1111\/j.1524-4733.2006.00130.x","article-title":"Too much ado about propensity score models? Comparing methods of propensity score matching","volume":"9","author":"Baser","year":"2006","journal-title":"Value Health"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1111\/j.1467-6419.2007.00527.x","article-title":"Some Practical Guidance for the Implementation of Propensity Score Matching","volume":"22","author":"Caliendo","year":"2008","journal-title":"J. Econ. Surv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1080\/00273171.2011.568786","article-title":"An introduction to propensity score methods for reducing the effects of confounding in observational studies","volume":"46","author":"Austin","year":"2011","journal-title":"Multivar. Behav. Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"107188","DOI":"10.1016\/j.csda.2021.107188","article-title":"Subgroup causal effect identification and estimation via matching tree","volume":"159","author":"Zhang","year":"2021","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"107251","DOI":"10.1016\/j.csda.2021.107251","article-title":"Communication-efficient distributed M-estimation with missing data","volume":"16","author":"Shi","year":"2021","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_15","first-page":"251","article-title":"Comparison of Nearest Neighbor and Caliper Algorithms in Outcome Propensity Score Matching to Study the Relationship between Type 2 Diabetes and Coronary Artery Disease","volume":"7","author":"Tousi","year":"2021","journal-title":"J. Biostat. 
Epidemiol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2037","DOI":"10.1002\/sim.3150","article-title":"A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003","volume":"27","author":"Austin","year":"2008","journal-title":"Stat. Med."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1324","DOI":"10.1016\/j.neuroimage.2008.02.050","article-title":"Selection of the control group for VBM analysis: Influence of covariates, matching and sample size","volume":"41","author":"Pell","year":"2008","journal-title":"Neuroimage"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1016\/j.cct.2011.05.006","article-title":"Are propensity scores really superior to standard multivariable analysis?","volume":"32","author":"Romagnoli","year":"2011","journal-title":"Contemp. Clin. Trials"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1007\/s10654-017-0325-0","article-title":"Case\u2013control matching: Effects, misconceptions, and recommendations","volume":"33","author":"Mansournia","year":"2018","journal-title":"Eur. J. Epidemiol."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1017\/pan.2019.11","article-title":"Why propensity scores should not be used for matching","volume":"27","author":"King","year":"2019","journal-title":"Political Anal."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1007\/164_2019_280","article-title":"Out of Control? Managing Baseline Variability in Experimental Studies with Control Groups","volume":"257","author":"Moser","year":"2019","journal-title":"Handb. Exp. Pharmacol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1002\/sim.7976","article-title":"Matched or unmatched analyses with propensity-score-matched data?","volume":"38","author":"Wan","year":"2019","journal-title":"Stat. 
Med."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"107167","DOI":"10.1016\/j.csda.2021.107167","article-title":"Optimal treatment regimes for competing risk data using doubly robust outcome weighted learning with bi-level variable selection","volume":"158","author":"He","year":"2021","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_24","first-page":"61","article-title":"On stratification, grouping and matching","volume":"7","author":"Anderson","year":"1980","journal-title":"Scand. J. Stat."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"3083","DOI":"10.1002\/sim.3697","article-title":"Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples","volume":"28","author":"Austin","year":"2009","journal-title":"Stat. Med."},{"key":"ref_26","unstructured":"Gosset, W.S. (1908). The probable error of a mean. Biometrika, 1\u201325."},{"key":"ref_27","first-page":"83","article-title":"Sulla determinazione empirica di una legge di distribuzione","volume":"4","author":"Kolmogorov","year":"1933","journal-title":"Inst. Ital. Attuari Giorn."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1214\/aoms\/1177730256","article-title":"Table for estimating the goodness of fit of empirical distributions","volume":"19","author":"Smirnov","year":"1948","journal-title":"Ann. Math. Stat."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1080\/14786440009463897","article-title":"On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling","volume":"50","author":"Pearson","year":"1900","journal-title":"Lond. Edinb. Dublin Philos. Mag. J. Sci."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"MacFarland, T.W., and Yates, J.M. (2016). Mann\u2013Whitney U test. 
Introduction to Nonparametric Statistics for the Biological Sciences Using R, Springer.","DOI":"10.1007\/978-3-319-30634-6"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Szek\u00e9r, S., and Vathy-Fogarassy, \u00c1. (2019, January 3\u20135). How Can the Similarity of the Case and Control Groups be Measured in Case-Control Studies? Proceedings of the 2019 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), Budapest, Hungary.","DOI":"10.1109\/IWOBI47054.2019.9114390"},{"key":"ref_32","unstructured":"Bowers, J., Fredrickson, M., and Hansen, B. (2010). RItools: Randomization inference tools. R Package Version 0.1-11."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Van Laarhoven, P.J., and Aarts, E.H. (1987). Simulated Annealing: Theory and Applications, Springer.","DOI":"10.1007\/978-94-015-7744-1"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1292","DOI":"10.1002\/sim.4200","article-title":"Comparing paired vs non-paired statistical methods of analyses when making inferences about absolute risk reductions in propensity-score matched samples","volume":"30","author":"Austin","year":"2011","journal-title":"Stat. 
Med."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"159","DOI":"10.2307\/2529684","article-title":"Matching to remove bias in observational studies","volume":"29","author":"Rubin","year":"1973","journal-title":"Biometrics"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"293","DOI":"10.2307\/2529981","article-title":"Bias reduction using Mahalanobis-metric matching","volume":"36","author":"Rubin","year":"1980","journal-title":"Biometrics"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/12\/356\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:43:07Z","timestamp":1760168587000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/12\/356"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,8]]},"references-count":36,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["a14120356"],"URL":"https:\/\/doi.org\/10.3390\/a14120356","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2021,12,8]]}}}