{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T04:56:16Z","timestamp":1780980976648,"version":"3.54.1"},"reference-count":43,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T00:00:00Z","timestamp":1714089600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Despite medical advancements in recent years, cardiovascular diseases (CVDs) remain a major factor in rising mortality rates, challenging predictions despite extensive expertise. The healthcare sector is poised to benefit significantly from harnessing massive data and the insights we can derive from it, underscoring the importance of integrating machine learning (ML) to improve CVD prevention strategies. In this study, we addressed the major issue of class imbalance in the Behavioral Risk Factor Surveillance System (BRFSS) 2021 heart disease dataset, including personal lifestyle factors, by exploring several resampling techniques, such as the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), SMOTE-Tomek, and SMOTE-Edited Nearest Neighbor (SMOTE-ENN). Subsequently, we trained, tested, and evaluated multiple classifiers, including logistic regression (LR), decision trees (DTs), random forest (RF), gradient boosting (GB), XGBoost (XGB), CatBoost, and artificial neural networks (ANNs), comparing their performance with a primary focus on maximizing sensitivity for CVD risk prediction. Based on our findings, the hybrid resampling techniques outperformed the alternative sampling techniques, and our proposed implementation includes SMOTE-ENN coupled with CatBoost optimized through Optuna, achieving a remarkable 88% rate for recall and 82% for the area under the receiver operating characteristic (ROC) curve (AUC) metric.<\/jats:p>","DOI":"10.3390\/a17050178","type":"journal-article","created":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T03:23:47Z","timestamp":1714101827000},"page":"178","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Strategic Machine Learning Optimization for Cardiovascular Disease Prediction and High-Risk Patient Identification"],"prefix":"10.3390","volume":"17","author":[{"given":"Konstantina-Vasiliki","family":"Tompra","sequence":"first","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9361-8621","authenticated-orcid":false,"given":"George","family":"Papageorgiou","sequence":"additional","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8263-9024","authenticated-orcid":false,"given":"Christos","family":"Tjortjis","sequence":"additional","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,4,26]]},"reference":[{"key":"ref_1","unstructured":"World Health Organization (2023, June 26). Cardiovascular Diseases (CVDs), Available online: https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/cardiovascular-diseases-(cvds)."},{"key":"ref_2","first-page":"44","article-title":"Integrated Machine Learning Model for Comprehensive Heart Disease Risk Assessment Based on Multi-Dimensional Health Factors","volume":"11","author":"Lupague","year":"2023","journal-title":"Eur. J. Comput. Sci. Inf. Technol."},{"key":"ref_3","unstructured":"(2023, August 01). Cleveland Clinic Cardiovascular Disease. Available online: https:\/\/my.clevelandclinic.org\/health\/diseases\/21493-cardiovascular-disease."},{"key":"ref_4","unstructured":"National Center for Chronic Disease Prevention and Health Promotion (2023, August 01). The Nation\u2019s Risk Factors and CDC\u2019s Response, Available online: https:\/\/www.cdc.gov\/chronicdisease\/resources\/publications\/factsheets\/heart-disease-stroke.htm."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"e015975","DOI":"10.1161\/JAHA.119.015975","article-title":"Priorities for Patient-Centered Research in Valvular Heart Disease: A Report from the National Heart, Lung, and Blood Institute Working Group","volume":"9","author":"Lindman","year":"2020","journal-title":"J. Am. Heart Assoc."},{"key":"ref_6","unstructured":"NHS (2023, August 01). Heart Failure. Available online: https:\/\/www.nhs.uk\/conditions\/heart-failure\/."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"e22949","DOI":"10.1002\/ima.22949","article-title":"Performance Analysis of state-of-the-art CNN Architectures for Brain Tumour Detection","volume":"34","author":"Khushi","year":"2024","journal-title":"Int. J. Imaging Syst. Technol."},{"key":"ref_8","unstructured":"Wisner, W. (2024, March 16). What Is Preventive Health and Why Is It Important?. Available online: https:\/\/www.healthline.com\/health\/what-is-preventive-health-and-why-is-it-important."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/s40537-021-00553-4","article-title":"The Use of Big Data Analytics in Healthcare","volume":"9","author":"Batko","year":"2022","journal-title":"J. Big Data"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Feng, C., Ding, Z., Lao, Q., Zhen, T., Ruan, M., Han, J., He, L., and Shen, Q. (2023). Prediction of early hematoma expansion of spontaneous intracerebral hemorrhage based on deep learning radiomics features of noncontrast computed tomography. Eur. Radiol.","DOI":"10.1007\/s00330-023-10410-y"},{"key":"ref_11","unstructured":"EIT Health (2023, August 01). Early Diagnostics: Shaping Healthcare and Society through New Technologies. Available online: https:\/\/eithealth.eu\/wp-content\/uploads\/2020\/09\/EIT-Health-paper_Early-Diagnostics_Shaping-Healthcare-Society.pdf."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1007\/s12553-020-00446-1","article-title":"Machine Learning Prediction of Susceptibility to Visceral Fat Associated Diseases","volume":"10","author":"Aldraimli","year":"2020","journal-title":"Health Technol."},{"key":"ref_13","unstructured":"Mary, K. (2023, August 01). Pratt Predictive Analytics in Healthcare: 12 Valuable Use Cases. Available online: https:\/\/www.techtarget.com\/searchbusinessanalytics\/tip\/Predictive-analytics-in-healthcare-12-valuable-use-cases."},{"key":"ref_14","unstructured":"Alkhaldi, N. (2023, August 01). Predictive Analytics in Healthcare: 7 Ways to Save Time and Money. Available online: https:\/\/itrexgroup.com\/blog\/predictive-analytics-in-healthcare-top-use-cases\/."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Weng, S.F., Reps, J., Kai, J., Garibaldi, J.M., and Qureshi, N. (2017). Can Machine-Learning Improve Cardiovascular Risk Prediction Using Routine Clinical Data?. PLoS ONE, 12.","DOI":"10.1371\/journal.pone.0174944"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"267498","DOI":"10.1155\/2022\/5267498","article-title":"Cardiovascular Disease Detection Using Ensemble Learning","volume":"2022","author":"Alqahtani","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"14659","DOI":"10.1109\/ACCESS.2019.2962755","article-title":"MIFH: A Machine Intelligence Framework for Heart Disease Diagnosis","volume":"8","author":"Gupta","year":"2020","journal-title":"IEEE Access"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"6663455","DOI":"10.1155\/2021\/6663455","article-title":"Improving the Accuracy for Analyzing Heart Diseases Prediction Based on the Ensemble Method","volume":"2021","author":"Gao","year":"2021","journal-title":"Complexity"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Paragliola, G., and Coronato, A. (2021). An Hybrid ECG-Based Deep Network for the Early Identification of High-Risk to Major Cardiovascular Events for Hypertension Patients. J. Biomed. Inform., 113.","DOI":"10.1016\/j.jbi.2020.103648"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"100584","DOI":"10.1016\/j.imu.2021.100584","article-title":"An Ensemble Method Based Multilayer Dynamic System to Predict Cardiovascular Disease Using Machine Learning Approach","volume":"24","author":"Uddin","year":"2021","journal-title":"Inform. Med. Unlocked"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"9738123","DOI":"10.1155\/2023\/9738123","article-title":"Monitoring Cardiovascular Problems in Heart Patients Using Machine Learning","volume":"2023","author":"Rakhra","year":"2023","journal-title":"J. Healthc. Eng."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"615","DOI":"10.30598\/barekengvol16iss2pp615-624","article-title":"Predicting Diabetes Mellitus Using Catboost Classifier and Shapley Additive Explanation (Shap) Approach","volume":"16","author":"Permatasari","year":"2022","journal-title":"BAREKENG J. Ilmu Mat. Dan. Terap."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"101064","DOI":"10.1016\/j.imu.2022.101064","article-title":"Advanced Hybrid Ensemble Gain Ratio Feature Selection Model Using Machine Learning for Enhanced Disease Risk Prediction","volume":"32","author":"Pasha","year":"2022","journal-title":"Inform. Med. Unlocked"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"3730303","DOI":"10.1155\/2022\/3730303","article-title":"Prediction of Cardiovascular Disease on Self-Augmented Datasets of Heart Patients Using Multiple Machine Learning Models","volume":"2022","author":"Ahmed","year":"2022","journal-title":"J. Sens."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Asif, D., Bibi, M., Arif, M.S., and Mukheimer, A. (2023). Enhancing Heart Disease Prediction through Ensemble Learning Techniques with Hyperparameter Optimization. Algorithms, 16.","DOI":"10.3390\/a16060308"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2244","DOI":"10.35940\/ijitee.C9009.019320","article-title":"Heart Diseases Prediction Using Deep Learning Neural Network Model","volume":"9","author":"Sharma","year":"2020","journal-title":"Int. J. Innov. Technol. Explor. Eng."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"012022","DOI":"10.1088\/1742-6596\/1997\/1\/012022","article-title":"Classification of Heart Disease Using Artificial Neural Network","volume":"1997","author":"Tick","year":"2021","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"8387680","DOI":"10.1155\/2021\/8387680","article-title":"Prediction of Heart Disease Using a Combination of Machine Learning and Deep Learning","volume":"2021","author":"Bharti","year":"2021","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1150933","DOI":"10.3389\/fmed.2023.1150933","article-title":"Cardiovascular Diseases Prediction by Machine Learning Incorporation with Deep Learning","volume":"10","author":"Subramani","year":"2023","journal-title":"Front. Med."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Trigka, M., and Dritsas, E. (2023). Long-Term Coronary Artery Disease Risk Prediction with Machine Learning Models. Sensors, 23.","DOI":"10.3390\/s23031193"},{"key":"ref_31","first-page":"3649406","article-title":"A Comprehensive Investigation of the Performances of Different Machine Learning Classifiers with SMOTE-ENN Oversampling Technique and Hyperparameter Optimization for Imbalanced Heart Failure Dataset","volume":"2022","author":"Faisal","year":"2022","journal-title":"Sci. Program."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"9005278","DOI":"10.1155\/2022\/9005278","article-title":"AdaBoost Ensemble Methods Using K-Fold Cross Validation for Survivability with the Early Detection of Heart Disease","volume":"2022","author":"Mahesh","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"113408","DOI":"10.1016\/j.eswa.2020.113408","article-title":"An Efficient Convolutional Neural Network for Coronary Heart Disease Prediction","volume":"159","author":"Dutta","year":"2020","journal-title":"Expert. Syst. Appl."},{"key":"ref_34","unstructured":"Hsieh, H.-Y., Su, C.-F., and Chiu, S.-I. (EasyChair, 2022). Constructing Multiple Layers of Machine Learning for the Early Detection of Cardiovascular Diseases , EasyChair, preprint."},{"key":"ref_35","first-page":"258","article-title":"Improvement Performance of the Random Forest Method on Unbalanced Diabetes Data Classification Using Smote-Tomek Link","volume":"7","author":"Hairani","year":"2023","journal-title":"JOIV Int. J. Inform. Vis."},{"key":"ref_36","unstructured":"Center for Disease Control (2023, August 01). 2021 BRFSS Survey Data and Documentation, Available online: https:\/\/www.cdc.gov\/brfss\/annual_data\/annual_2021.html."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2218","DOI":"10.1093\/eurjpc\/zwac187","article-title":"Obesity and cardiovascular disease: Mechanistic insights and management strategies. A joint position paper by the World Heart Federation and World Obesity Federation","volume":"29","author":"Almahmeed","year":"2022","journal-title":"Eur. J. Prev. Cardiol."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Blagus, R., and Lusa, L. (2013). SMOTE for High-Dimensional Class-Imbalanced Data. BMC Bioinform., 14.","DOI":"10.1186\/1471-2105-14-106"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"012051","DOI":"10.1088\/1742-6596\/2325\/1\/012051","article-title":"Early Heart Disease Prediction Using Ensemble Learning Techniques","volume":"2325","author":"Bhargav","year":"2022","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1906466","DOI":"10.1155\/2022\/1906466","article-title":"An Efficient Machine Learning Model Based on Improved Features Selections for Early and Accurate Heart Disease Predication","volume":"2022","author":"Ullah","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_41","first-page":"130","article-title":"Decision Tree Methods: Applications for Classification and Prediction","volume":"27","author":"Song","year":"2015","journal-title":"Shanghai Arch. Psychiatry"},{"key":"ref_42","first-page":"75","article-title":"Heart Diseases Diagnosis Using Neural Networks Arbitration","volume":"7","author":"Olaniyi","year":"2015","journal-title":"Int. J. Intell. Syst. Appl."},{"key":"ref_43","first-page":"375","article-title":"Enhanced Accuracy for Heart Disease Prediction Using Artificial Neural Network","volume":"29","year":"2022","journal-title":"Indones. J. Electr. Eng. Comput. Sci."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/5\/178\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:33:34Z","timestamp":1760106814000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/5\/178"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,26]]},"references-count":43,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["a17050178"],"URL":"https:\/\/doi.org\/10.3390\/a17050178","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,26]]}}}