{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T03:30:53Z","timestamp":1777087853648,"version":"3.51.4"},"reference-count":42,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2022,9,25]],"date-time":"2022-09-25T00:00:00Z","timestamp":1664064000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Diabetes is a chronic disease that continues to be a primary and worldwide health concern since the health of the entire population has been affected by it. Over the years, many academics have attempted to develop a reliable diabetes prediction model using machine learning (ML) algorithms. However, these research investigations have had a minimal impact on clinical practice as the current studies focus mainly on improving the performance of complicated ML models while ignoring their explainability to clinical situations. Therefore, the physicians find it difficult to understand these models and rarely trust them for clinical use. In this study, a carefully constructed, efficient, and interpretable diabetes detection method using an explainable AI has been proposed. The Pima Indian diabetes dataset was used, containing a total of 768 instances where 268 are diabetic, and 500 cases are non-diabetic with several diabetic attributes. Here, six machine learning algorithms (artificial neural network (ANN), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost, XGBoost) have been used along with an ensemble classifier to diagnose the diabetes disease. 
For each machine learning model, global and local explanations have been produced using the Shapley additive explanations (SHAP), which are represented in different types of graphs to help physicians in understanding the model predictions. The balanced accuracy of the developed weighted ensemble model was 90% with a F1 score of 89% using a five-fold cross-validation (CV). The median values were used for the imputation of the missing values and the synthetic minority oversampling technique (SMOTETomek) was used to balance the classes of the dataset. The proposed approach can improve the clinical understanding of a diabetes diagnosis and help in taking necessary action at the very early stages of the disease.<\/jats:p>","DOI":"10.3390\/s22197268","type":"journal-article","created":{"date-parts":[[2022,9,26]],"date-time":"2022-09-26T03:34:17Z","timestamp":1664163257000},"page":"7268","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":117,"title":["An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4775-8639","authenticated-orcid":false,"given":"Hafsa Binte","family":"Kibria","sequence":"first","affiliation":[{"name":"Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4126-0389","authenticated-orcid":false,"given":"Md","family":"Nahiduzzaman","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4491-4942","authenticated-orcid":false,"given":"Md. 
Omaer Faruq","family":"Goni","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7300-506X","authenticated-orcid":false,"given":"Mominul","family":"Ahsan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of York, Deramore Lane, Heslington, York YO10 5GH, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7010-8285","authenticated-orcid":false,"given":"Julfikar","family":"Haider","sequence":"additional","affiliation":[{"name":"Department of Engineering, Manchester Metropolitan University, Manchester M1 5GD, UK"}]}],"member":"1968","published-online":{"date-parts":[[2022,9,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2239","DOI":"10.1016\/S0140-6736(17)30058-2","article-title":"Type 2 diabetes","volume":"389","author":"Chatterjee","year":"2017","journal-title":"Lancet"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"100204","DOI":"10.1016\/j.imu.2019.100204","article-title":"A model for early prediction of diabetes","volume":"16","author":"Alam","year":"2019","journal-title":"Inform. Med. Unlocked"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Islam, M.M.F., Ferdousi, R., Rahman, S., and Bushra, H.Y. (2019). Likelihood Prediction of Diabetes at Early Stage Using Data Mining Techniques. 
Computer Vision and Machine Intelligence in Medical Image Analysis, Springer.","DOI":"10.1007\/978-981-13-8798-2_12"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.2337\/diacare.27.5.1047","article-title":"Global Prevalence of Diabetes","volume":"27","author":"Wild","year":"2004","journal-title":"Diabetes Care"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"S290","DOI":"10.2337\/dc08-s271","article-title":"Is Type 2 Diabetes an Operable Intestinal Disease?","volume":"31","author":"Rubino","year":"2008","journal-title":"Diabetes Care"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Kibria, H.B., Matin, A., Jahan, N., and Islam, S. (2021, January 10\u201312). A Comparative Study with Different Machine Learning Algorithms for Diabetes Disease Prediction. Proceedings of the 2021 18th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico.","DOI":"10.1109\/CCE53527.2021.9633043"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"107672","DOI":"10.1016\/j.compbiolchem.2022.107672","article-title":"The severity prediction of the binary and multi-class cardiovascular disease \u2212 A machine learning-based fusion approach","volume":"98","author":"Kibria","year":"2022","journal-title":"Comput. Biol. Chem."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2022\/1684017","article-title":"A Novel Diabetes Healthcare Disease Prediction Framework Using Machine Learning Techniques","volume":"2022","author":"Krishnamoorthi","year":"2022","journal-title":"J. Health Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"100815","DOI":"10.1016\/j.imu.2021.100815","article-title":"Forecasting the spread of the third wave of COVID-19 pandemic using time series analysis in Bangladesh","volume":"28","author":"Kibria","year":"2021","journal-title":"Inform. Med. 
Unlocked"},{"key":"ref_10","first-page":"40","article-title":"An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier","volume":"2","author":"Kumari","year":"2021","journal-title":"Int. J. Cogn. Comput. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1578","DOI":"10.1016\/j.procs.2018.05.122","article-title":"Prediction of Diabetes using Classification Algorithms","volume":"132","author":"Sisodia","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"012013","DOI":"10.1088\/1742-6596\/1714\/1\/012013","article-title":"Diabetes disease prediction using significant attribute selection and classification approach","volume":"1714","author":"Tiwari","year":"2021","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chang, V., Bailey, J., Xu, Q.A., and Sun, Z. (2022). Pima Indians diabetes mellitus classification based on machine learning (ML) algorithms. Neural Comput. Appl., 1\u201317.","DOI":"10.1007\/s00521-022-07049-z"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chen, W., Chen, S., Zhang, H., and Wu, T. (2017, January 24\u201326). A hybrid prediction model for type 2 diabetes using K-means and decision tree. Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China.","DOI":"10.1109\/ICSESS.2017.8342938"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Mir, A., and Dhage, S.N. (2018, January 16\u201318). Diabetes Disease Prediction Using Machine Learning on Big Data of Healthcare. Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India.","DOI":"10.1109\/ICCUBEA.2018.8697439"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sangien, T., Bhat, T., and Khan, M.S. (2022). 
Diabetes Disease Prediction Using Classification Algorithms. Internet of Things and Its Applications, Springer.","DOI":"10.1007\/978-981-16-7637-6_17"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1049\/htl2.12010","article-title":"A remote healthcare monitoring framework for diabetes prediction using machine learning","volume":"8","author":"Ramesh","year":"2021","journal-title":"Health Technol. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"8529","DOI":"10.1109\/ACCESS.2022.3142097","article-title":"Prediction of Diabetes Empowered With Fused Machine Learning","volume":"10","author":"Ahmed","year":"2022","journal-title":"IEEE Access"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Abdollahi, J., and Nouri-Moghaddam, B. (2022). Hybrid stacked ensemble combined with genetic algorithms for diabetes prediction. Iran J. Comput. Sci., 1\u201316.","DOI":"10.1007\/s42044-022-00100-1"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"144777","DOI":"10.1109\/ACCESS.2019.2945129","article-title":"Development of Disease Prediction Model Based on Ensemble Learning Approach for Diabetes and Hypertension","volume":"7","author":"Fitriyani","year":"2019","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kibria, H.B., and Matin, A. (2021). An Efficient Machine Learning-Based Decision-Level Fusion Model to Predict Cardiovascular Disease. International Conference on Intelligent Computing & Optimization, Springer.","DOI":"10.1007\/978-3-030-68154-8_92"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1109\/MCI.2018.2866730","article-title":"Cross-Validation for Imbalanced Datasets: Avoiding Overoptimistic and Overfitting Approaches [Research Frontier]","volume":"13","author":"Santos","year":"2018","journal-title":"IEEE Comput. Intell. 
Mag."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.eswa.2019.04.022","article-title":"A practical computerized decision support system for predicting the severity of Alzheimer\u2019s disease of an individual","volume":"130","author":"Bucholc","year":"2019","journal-title":"Expert Syst. Appl."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"e6543","DOI":"10.7717\/peerj.6543","article-title":"An interpretable machine learning model for diagnosis of Alzheimer\u2019s disease","volume":"7","author":"Das","year":"2019","journal-title":"PeerJ"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Burrell, J. (2016). How the machine \u2018thinks\u2019: Understanding opacity in machine learning algorithms. Big Data Soc., 3.","DOI":"10.1177\/2053951715622512"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"52138","DOI":"10.1109\/ACCESS.2018.2870052","article-title":"Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)","volume":"6","author":"Adadi","year":"2018","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1038\/s41551-018-0304-0","article-title":"Explainable machine-learning predictions for the prevention of hypoxaemia during surgery","volume":"2","author":"Lundberg","year":"2018","journal-title":"Nat. Biomed. Eng."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3236009","article-title":"A Survey of Methods for Explaining Black Box Models","volume":"51","author":"Guidotti","year":"2019","journal-title":"ACM Comput. Surv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1056\/NEJM199509283331307","article-title":"Polycystic Ovary Syndrome","volume":"333","author":"Tephen","year":"1995","journal-title":"N. Engl. J. 
Med."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"20","DOI":"10.4103\/0974-1208.82355","article-title":"Efficacy of 2-hour post glucose insulin levels in predicting insulin resistance in polycystic ovarian syndrome with infertility","volume":"4","author":"Saxena","year":"2011","journal-title":"J. Hum. Reprod. Sci."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1186\/s41044-016-0014-0","article-title":"Big data preprocessing: Methods and prospects","volume":"1","author":"Luengo","year":"2016","journal-title":"Big Data Anal."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1145\/1007730.1007735","article-title":"A study of the behavior of several methods for balancing machine learning training data","volume":"6","author":"Batista","year":"2004","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"3120","DOI":"10.1016\/j.patcog.2014.03.021","article-title":"Exploration of classification confidence in ensemble learning","volume":"47","author":"Li","year":"2014","journal-title":"Pattern Recognit."},{"key":"ref_34","unstructured":"Kibria, H.B., Matin, A., and Islam, S. (2022, July 01). Comparative Analysis of Two Artificial Intelligence Based Decision Level Fusion Models for Heart Disease Prediction. Available online: http:\/\/ceur-ws.org."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1016\/S0731-7085(99)00272-1","article-title":"Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research","volume":"22","author":"Beresford","year":"2000","journal-title":"J. Pharm. Biomed. Anal."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hart, S. (1989). Shapley Value. 
Game Theory, Palgrave Macmillan.","DOI":"10.1007\/978-1-349-20181-5_25"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1007\/BF01753239","article-title":"A new index of power for simple n-person games","volume":"7","author":"Deegan","year":"1978","journal-title":"Int. J. Game Theory"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2660","DOI":"10.1038\/s41598-021-82098-3","article-title":"A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer\u2019s disease","volume":"11","author":"Alonso","year":"2021","journal-title":"Sci. Rep."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"144352","DOI":"10.1109\/ACCESS.2021.3119110","article-title":"Shapley values for feature selection: The good, the bad, and the axioms","volume":"9","author":"Fryer","year":"2021","journal-title":"IEEE Access"},{"key":"ref_40","unstructured":"Sundararajan, M., and Najmi, A. (2022, August 28). The Many Shapley Values for Model Explanation. Available online: https:\/\/proceedings.mlr.press\/v119\/sundararajan20b.html."},{"key":"ref_41","unstructured":"(2022, September 20). An Introduction to Explainable AI with Shapley Values\u2014SHAP Latest Documentation. Available online: https:\/\/shap.readthedocs.io\/en\/latest\/example_notebooks\/overviews\/An%20introduction%20to%20explainable%20AI%20with%20Shapley%20values.html."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1177\/1932296814552673","article-title":"Comparison of salivary and serum glucose levels in diabetic patients","volume":"9","author":"Gupta","year":"2015","journal-title":"J. Diabetes Sci. 
Technol."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/19\/7268\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:39:12Z","timestamp":1760143152000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/19\/7268"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,25]]},"references-count":42,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2022,10]]}},"alternative-id":["s22197268"],"URL":"https:\/\/doi.org\/10.3390\/s22197268","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,25]]}}}