{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T18:53:07Z","timestamp":1762195987198,"version":"build-2065373602"},"reference-count":53,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T00:00:00Z","timestamp":1762128000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Science Committee of the Ministry of Science and Higher Education of the Republic of Kazakhstan","award":["AP23488586"],"award-info":[{"award-number":["AP23488586"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Background: Coronary artery disease (CAD) remains a leading cause of morbidity and mortality. Early diagnosis reduces adverse outcomes and alleviates the burden on healthcare, yet conventional approaches are often invasive, costly, and not always available. In this context, machine learning offers promising solutions. Objective: The objective of this study is to evaluate the feasibility of reliably predicting both the presence and the severity of CAD. The analysis is based on a harmonized, multi-center UCI dataset that includes cohorts from Cleveland, Hungary, Switzerland, and Long Beach. The work aims to assess the accuracy and practical utility of models built exclusively on routine tabular clinical and demographic data, without relying on imaging. These models are designed to improve risk stratification and guide patient routing. Methods and Results: The study is based on a uniform and standardized data processing pipeline. This pipeline includes handling missing values, feature encoding, scaling, an 80\/20 train\u2013test split and applying the SMOTE method exclusively to the training set to prevent information leakage. 
Within this pipeline, a standardized comparison of a wide range of models (including gradient boosting, tree-based ensembles, support vector methods, etc.) was conducted with hyperparameter tuning via GridSearchCV. The best results were demonstrated by the CatBoost model: accuracy\u20140.8278, recall\u20140.8407, and F1-score\u20140.8436. Conclusions: A key distinction of this work is the comprehensive evaluation of the models\u2019 practical suitability. Beyond standard metrics, the analysis of calibration curves confirmed the reliability of the probabilistic predictions. Patient-level interpretability using SHAP showed that the model relies on clinically significant predictors, including ST-segment depression. Calibrated and explainable models based on readily available data are positioned as a practical tool for scalable risk stratification and decision support, especially in resource-constrained settings.<\/jats:p>","DOI":"10.3390\/a18110693","type":"journal-article","created":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T18:21:46Z","timestamp":1762194106000},"page":"693","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Explainable AI for Coronary Artery Disease Stratification Using Routine Clinical Data"],"prefix":"10.3390","volume":"18","author":[{"given":"Nurdaulet","family":"Tasmurzayev","sequence":"first","affiliation":[{"name":"Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan"}]},{"given":"Baglan","family":"Imanbek","sequence":"additional","affiliation":[{"name":"Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan"}]},{"given":"Assiya","family":"Boltaboyeva","sequence":"additional","affiliation":[{"name":"Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, 
Kazakhstan"},{"name":"LLP \u201cKazakhstan R&D Solutions\u201d, Almaty 050056, Kazakhstan"}]},{"given":"Gulmira","family":"Dikhanbayeva","sequence":"additional","affiliation":[{"name":"Faculty of Postgraduate Higher Medical Education, Akhmet Yasawi University, Shymkent 161200, Kazakhstan"}]},{"given":"Sarsenbek","family":"Zhussupbekov","sequence":"additional","affiliation":[{"name":"Department of Automation and Control, Energo University, Almaty 050013, Kazakhstan"}]},{"given":"Qarlygash","family":"Saparbayeva","sequence":"additional","affiliation":[{"name":"Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan"},{"name":"Faculty of Philology, South Kazakhstan University Named After O.Zhanibekov, Shymkent 160012, Kazakhstan"}]},{"given":"Gulshat","family":"Amirkhanova","sequence":"additional","affiliation":[{"name":"Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1161\/CIRCRESAHA.117.308903","article-title":"Reducing the Global Burden of Cardiovascular Disease, Part 1: The Epidemiology and Risk Factors","volume":"121","author":"Joseph","year":"2017","journal-title":"Circ. Res."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Iside, C., Affinito, O., Punzo, B., Salvatore, M., Mirabelli, P., Cavaliere, C., and Franzese, M. (2023). Stratification of Patients with Coronary Artery Disease by Circulating Cytokines Profile: A Pilot Study. J. Clin. Med., 12.","DOI":"10.3390\/jcm12206649"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pietrantonio, F., Carrieri, C., Rosiello, F., Spandonaro, F., Vinci, A., and D\u2019Angela, D. (2025). 
Clinical, Economical, and Organizational Impact of Chronic Ischemic Cardiovascular Disease in Italy: Evaluation of 2019 Nationwide Hospital Admissions Data. Int. J. Environ. Res. Public Health, 22.","DOI":"10.3390\/ijerph22040530"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Hosseini, K., Mortazavi, S.H., Sadeghian, S., Abdi, S., Lotfi, A., Rahimi, M., Nouri, M., and Emami, H. (2021). Prevalence and Trends of Coronary Artery Disease Risk Factors and Their Effect on Age of Diagnosis in Patients with Established Coronary Artery Disease: Tehran Heart Center (2005\u20132015). BMC Cardiovasc. Disord., 21.","DOI":"10.1186\/s12872-021-02293-y"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/s41669-021-00297-0","article-title":"Economic Analysis of the CADScor System for Ruling Out Coronary Artery Disease in England","volume":"6","author":"Javanbakht","year":"2022","journal-title":"PharmacoEconomics Open"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"C\u00e1rdenas-Anguiano, J.J., Quiroz-Gomez, S., Guzm\u00e1n-Priego, C.G., Celorio-M\u00e9ndez, K.S., Ba\u00f1os-Gonz\u00e1lez, M.A., Jim\u00e9nez-Sastr\u00e9, A., Baeza-Flores, G.C., and Albarran-Melzer, J.A. (2025). Estimation of the Burden of Ischemic Heart Disease in the Tabasco Population, Mexico, 2013\u20132021. Int. J. Environ. Res. Public Health, 22.","DOI":"10.20944\/preprints202502.0674.v1"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1177\/2047487315587402","article-title":"National Prevalence of Coronary Heart Disease and Its Relationship with Human Development Index: A Systematic Review","volume":"23","author":"Zhu","year":"2016","journal-title":"Eur. J. Prev. Cardiol."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Krishnan, M.N., Zachariah, G., Venugopal, K., Mohanan, P.P., Harikrishnan, S., Sanjay, G., Tharakan, J.M., Zachariah, A.M., Eapen, K., and Suresh, G. (2016). 
Prevalence of Coronary Artery Disease and Its Risk Factors in Kerala, South India: A Community-Based Cross-Sectional Study. BMC Cardiovasc. Disord., 16.","DOI":"10.1186\/s12872-016-0189-3"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1055\/s-0042-1751234","article-title":"Epidemiology, Pathophysiology, and Management of Coronary Artery Disease in the Elderly","volume":"31","author":"Fadah","year":"2022","journal-title":"Int. J. Angiol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"e017712","DOI":"10.1161\/JAHA.120.017712","article-title":"Risk Factor Burden and Long-Term Prognosis of Patients with Premature Coronary Artery Disease","volume":"9","author":"Zeitouni","year":"2020","journal-title":"J. Am. Heart Assoc."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Monizzi, G., Di Lenarda, F., Gallinoro, E., and Bartorelli, A.L. (2024). Myocardial Ischemia: Differentiating between Epicardial Coronary Artery Atherosclerosis, Microvascular Dysfunction and Vasospasm in the Catheterization Laboratory. J. Clin. Med., 13.","DOI":"10.3390\/jcm13144172"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4330\/wjc.v8.i1.1","article-title":"Genetics of Coronary Artery Disease and Myocardial Infarction","volume":"8","author":"Dai","year":"2016","journal-title":"World J. Cardiol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.cmpb.2018.05.009","article-title":"Non-Invasive Detection of Coronary Artery Disease in High-Risk Patients Based on the Stenosis Prediction of Separate Coronary Arteries","volume":"162","author":"Alizadehsani","year":"2018","journal-title":"Comput. 
Methods Programs Biomed."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"FSO698","DOI":"10.2144\/fsoa-2020-0206","article-title":"Machine Learning Algorithms for Predicting Coronary Artery Disease: Efforts toward an Open Source Solution","volume":"7","author":"Akella","year":"2021","journal-title":"Future Sci. OA"},{"key":"ref_15","first-page":"1","article-title":"A Hybrid Machine Learning Approach to Identify Coronary Diseases Using Feature Selection Mechanism on Heart Disease Dataset","volume":"41","author":"Doppala","year":"2023","journal-title":"Distrib. Parallel Databases"},{"key":"ref_16","first-page":"271","article-title":"Classification of Coronary Artery Disease Data Sets by Using a Deep Neural Network","volume":"1","author":"Caliskan","year":"2017","journal-title":"Eur. Biotech. J."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Teja, M.D., and Rayalu, G.M. (2025). Optimizing Heart Disease Diagnosis with Advanced Machine Learning Models: A Comparison of Predictive Performance. BMC Cardiovasc. Disord., 25.","DOI":"10.1186\/s12872-025-04627-6"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ahmad, A.A., and Polat, H. (2023). Prediction of Heart Disease Based on Machine Learning Using Jellyfish Optimization Algorithm. Diagnostics, 13.","DOI":"10.3390\/diagnostics13142392"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Papandrianos, N., and Papageorgiou, E. (2021). Automatic Diagnosis of Coronary Artery Disease in SPECT Myocardial Perfusion Imaging Employing Deep Learning. Appl. Sci., 11.","DOI":"10.3390\/app11146362"},{"key":"ref_20","first-page":"87","article-title":"Review on Cleveland Heart Disease Dataset Using Machine Learning","volume":"21","author":"Yaqoob","year":"2023","journal-title":"Quaid-e-Awam Univ. Res. J. Eng. Sci. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kaba, \u015e., Haci, H., Isin, A., Ilhan, A., and Conkbayir, C. (2023). 
The Application of Deep Learning for the Segmentation and Classification of Coronary Arteries. Diagnostics, 13.","DOI":"10.3390\/diagnostics13132274"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Hassannataj Joloudari, J., Azizi, F., Nematollahi, M.A., Alizadehsani, R., Hassannatajjeloudari, E., Nodehi, I., and Mosavi, A. (2021). GSVMA: A Genetic Support Vector Machine ANOVA Method for CAD Diagnosis. Front. Cardiovasc. Med., 8.","DOI":"10.3389\/fcvm.2021.760178"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sait, A.R.W., and Awad, A.M.A.B. (2024). Ensemble Learning-Based Coronary Artery Disease Detection Using Computer Tomography Images. Appl. Sci., 14.","DOI":"10.3390\/app14031238"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Apostolopoulos, I.D., Papandrianos, N.I., Apostolopoulos, D.J., and Papageorgiou, E. (2024). Between Two Worlds: Investigating the Intersection of Human Expertise and Machine Learning in the Case of Coronary Artery Disease Diagnosis. Bioengineering, 11.","DOI":"10.3390\/bioengineering11100957"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Burton, T., Fathieh, F., Nemati, N., Gillins, H.R., Shadforth, I.P., Ramchandani, S., and Bridges, C.R. (2024). Development of a Non-Invasive Machine-Learned Point-of-Care Rule-Out Test for Coronary Artery Disease. Diagnostics, 14.","DOI":"10.3390\/diagnostics14070719"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1016\/0002-9149(89)90524-9","article-title":"International Application of a New Probability Algorithm for the Diagnosis of Coronary Artery Disease","volume":"64","author":"Detrano","year":"1989","journal-title":"Am. J. Cardiol."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/s40537-020-00305-w","article-title":"Survey on Categorical Data for Neural Networks","volume":"7","author":"Hancock","year":"2020","journal-title":"J. 
Big Data"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"109924","DOI":"10.1016\/j.asoc.2022.109924","article-title":"The Choice of Scaling Technique Matters for Classification Performance","volume":"133","author":"Amorim","year":"2023","journal-title":"Appl. Soft Comput."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1436","DOI":"10.1007\/s10489-021-02412-4","article-title":"Conditional Mutual Information-Based Feature Selection Algorithm for Maximal Relevance Minimal Redundancy","volume":"52","author":"Gu","year":"2022","journal-title":"Appl. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"e074819","DOI":"10.1136\/bmj-2023-074819","article-title":"Evaluation of Clinical Prediction Models (Part 1): From Development to External Validation","volume":"384","author":"Collins","year":"2024","journal-title":"BMJ"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Demircio\u011flu, A. (2024). Applying Oversampling before Cross-Validation Will Lead to High Bias in Radiomics. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-62585-z"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zhang, S., Yuan, Y., Yao, Z., Wang, X., and Lei, Z. (2022). Improvement of the Performance of Models for Predicting Coronary Artery Disease Based on XGBoost Algorithm and Feature Processing Technology. Electronics, 11.","DOI":"10.3390\/electronics11030315"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Lin, L., Ding, L., Fu, Z., and Zhang, L. (2024). Machine Learning-Based Models for Prediction of the Risk of Stroke in Coronary Artery Disease Patients Receiving Coronary Revascularization. PLoS ONE, 19.","DOI":"10.1371\/journal.pone.0296402"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, J., Xue, Q., Zhang, C.W.J., Wong, K.K.L., and Liu, Z. (2024). Explainable Coronary Artery Disease Prediction Model Based on AutoGluon from AutoML Framework. Front. Cardiovasc. 
Med., 11.","DOI":"10.3389\/fcvm.2024.1360548"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhou, L., Song, J., Li, Z., Wang, Y., Chen, X., Fang, Q., and Zhang, J. (2024). THGB: Predicting Ligand-Receptor Interactions by Combining Tree Boosting and Histogram-Based Gradient Boosting. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-78954-7"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, J., Xu, Y., Liu, L., Li, F., Zhao, H., and Zhang, Y. (2023). Comparison of LASSO and Random Forest Models for Predicting the Risk of Premature Coronary Artery Disease. BMC Med. Inform. Decis. Mak., 23.","DOI":"10.1186\/s12911-023-02407-w"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Lyu, Y., Wu, H.M., Yan, H.X., Li, S.Q., Zhao, X., and Zhang, T. (2024). Classification of Coronary Artery Disease Using Radial Artery Pulse Wave Analysis via Machine Learning. BMC Med. Inform. Decis. Mak., 24.","DOI":"10.1186\/s12911-024-02666-1"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yang, L., Wu, H., Jin, X., Liu, Y., Chen, M., and Zhou, D. (2020). Study of Cardiovascular Disease Prediction Model Based on Random Forest in Eastern China. Sci. Rep., 10.","DOI":"10.1038\/s41598-020-62133-5"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Khan, M.A., Mazhar, T., Mateen Yaqoob, M., Rehman, A., Sharif, M., and Kadry, S. (2024). Optimal Feature Selection for Heart Disease Prediction Using Modified Artificial Bee Colony (M-ABC) and K-Nearest Neighbors (KNN). Sci. Rep., 14.","DOI":"10.1038\/s41598-024-78021-1"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Meda, J.R., Kusima, H.L., and Magitta, N.F. (2024). Angiographic Characteristics of Coronary Artery Disease in Patients Undergoing Diagnostic Coronary Angiography at a Tertiary Hospital in Tanzania. BMC Cardiovasc. 
Disord., 24.","DOI":"10.1186\/s12872-024-03773-7"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"e1484","DOI":"10.1002\/widm.1484","article-title":"Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges","volume":"13","author":"Bischl","year":"2023","journal-title":"WIREs Data Min. Knowl. Discov."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Husain, G., Nasef, D., Jose, R., Mayer, J., Bekbolatova, M., Devine, T., and Toma, M. (2025). SMOTE vs. SMOTEENN: A Study on the Performance of Resampling Algorithms for Addressing Class Imbalance in Regression Models. Algorithms, 18.","DOI":"10.3390\/a18010037"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"e12039","DOI":"10.1002\/jeo2.12039","article-title":"A Practical Guide to the Implementation of AI in Orthopaedic Research, Part 6: How to Evaluate the Performance of AI Research?","volume":"11","author":"Oettl","year":"2024","journal-title":"J. Exp. Orthop."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1093\/jamia\/ocz228","article-title":"A Tutorial on Calibration Measurements and Calibration Models for Clinical Prediction Models","volume":"27","author":"Huang","year":"2020","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/s42256-019-0138-9","article-title":"From Local Explanations to Global Understanding with Explainable AI for Trees","volume":"2","author":"Lundberg","year":"2020","journal-title":"Nat. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hicks, S.A., Str\u00fcmke, I., Thambawita, V., Halvorsen, P., Riegler, M., and Johansen, D. (2022). On Evaluation Metrics for Medical Applications of Artificial Intelligence. Sci. Rep., 12.","DOI":"10.1038\/s41598-022-09954-8"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Salah, H., and Srinivas, S. (2022). 
Explainable Machine Learning Framework for Predicting Long-Term Cardiovascular Disease Risk among Adolescents. Sci. Rep., 12.","DOI":"10.1038\/s41598-022-25933-5"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1177\/0272989X06295361","article-title":"Decision Curve Analysis: A Novel Method for Evaluating Prediction Models","volume":"26","author":"Vickers","year":"2006","journal-title":"Med. Decis. Mak."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Zou, M., Jiang, W.G., Qin, Q.H., Liu, Y.C., and Li, M.L. (2022). Optimized XGBoost Model with Small Dataset for Predicting Relative Density of Ti-6Al-4V Parts Manufactured by Selective Laser Melting. Materials, 15.","DOI":"10.3390\/ma15155298"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"122778","DOI":"10.1016\/j.eswa.2023.122778","article-title":"A Review of Ensemble Learning and Data Augmentation Models for Class Imbalanced Problems: Combination, Implementation and Evaluation","volume":"244","author":"Khan","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1038\/s41586-024-08328-6","article-title":"Accurate Predictions on Small Data with a Tabular Foundation Model","volume":"637","author":"Hollmann","year":"2025","journal-title":"Nature"},{"key":"ref_52","first-page":"2813","article-title":"Class Imbalance, Recalibration, and the Clinical Usefulness of Predictive Models","volume":"12","author":"Wallace","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_53","first-page":"5438","article-title":"Explainable AI for Deep Learning-Based Prediction of In-Hospital Mortality in Cardiac Patients","volume":"26","author":"Veta","year":"2022","journal-title":"IEEE J. Biomed. 
Health Inform."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/693\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T18:32:52Z","timestamp":1762194772000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/693"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,3]]},"references-count":53,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["a18110693"],"URL":"https:\/\/doi.org\/10.3390\/a18110693","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,3]]}}}