{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T15:57:16Z","timestamp":1780588636169,"version":"3.54.1"},"reference-count":41,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T00:00:00Z","timestamp":1741132800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Statistical and machine learning modelling techniques have been effectively used in the healthcare domain and the prediction of epidemiological chronic diseases such as diabetes, which is classified as an epidemic due to its high rates of global prevalence. These techniques are useful for the processes of description, prediction, and evaluation of various diseases, including diabetes. This paper models diabetes disease in Saudi Arabia using the most relevant risk factors, namely smoking, obesity, and physical inactivity for adults aged \u226525 years. The aim of this study is based on developing statistical and machine learning models for the purpose of studying the trends in incidence rates of diabetes over 15 years (1999\u20132013) and to obtain predictions for future levels of the disease up to 2025, to support health policy planning and resource allocation for controlling diabetes. Different models were developed, namely Multiple Linear Regression (MLR), Support Vector Regression (SVR), Bayesian Linear Regression (BLM), Adaptive Neuro-Fuzzy Inference model (ANFIS), and Artificial Neural Network (ANN). The performance of the developed models is evaluated using four statistical metrics: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and coefficient of determination R-squared. 
Based on the results, it can be observed that the overall performance for all proposed models was reasonably good; however, the best results were achieved by the ANFIS model with RMSE = 0.04 and R2 = 0.99 for men\u2019s training data, and RMSE = 0.02 and R2 = 0.99 for women\u2019s training data.<\/jats:p>","DOI":"10.3390\/a18030145","type":"journal-article","created":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T03:18:22Z","timestamp":1741144702000},"page":"145","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Prediction of Diabetes Using Statistical and Machine Learning Modelling Techniques"],"prefix":"10.3390","volume":"18","author":[{"given":"Entissar","family":"Almutairi","sequence":"first","affiliation":[{"name":"Department of Electronic and Electrical Engineering, Brunel University of London, Uxbridge UB8 3PH, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8515-7933","authenticated-orcid":false,"given":"Maysam","family":"Abbod","sequence":"additional","affiliation":[{"name":"Department of Electronic and Electrical Engineering, Brunel University of London, Uxbridge UB8 3PH, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7048-2469","authenticated-orcid":false,"given":"Ziad","family":"Hunaiti","sequence":"additional","affiliation":[{"name":"Department of Electronic and Electrical Engineering, Brunel University of London, Uxbridge UB8 3PH, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,3,5]]},"reference":[{"key":"ref_1","unstructured":"World Health Organization (2024, December 29). Global Report on Diabetes. Available online: https:\/\/www.who.int\/publications\/i\/item\/9789241565257."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Moini, J. (2019). 
Epidemiology of Diabetes, Elsevier Science.","DOI":"10.1016\/B978-0-12-816864-6.00007-9"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"S43","DOI":"10.2337\/diacare.29.s1.06.s43","article-title":"Diagnosis and classification of diabetes mellitus","volume":"29","author":"Mellitus","year":"2006","journal-title":"Diabetes Care"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1016\/S0140-6736(07)61696-1","article-title":"The burden and costs of chronic diseases in low-income and middle-income countries","volume":"370","author":"Abegunde","year":"2007","journal-title":"Lancet"},{"key":"ref_5","unstructured":"International Diabetes Federation (2019). IDF Diabetes Atlas, International Diabetes Federation. [9th ed.]."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1046\/j.1524-4733.2001.45061.x","article-title":"Modeling for health care and other policy decisions: Uses, roles, and validity","volume":"4","author":"Weinstein","year":"2001","journal-title":"Value Health"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1046\/j.1524-4733.2003.00234.x","article-title":"Principles of good practice for decision analytic modeling in health-care evaluation: Report of the ISPOR Task Force on Good Research Practices--Modeling Studies","volume":"6","author":"Weinstein","year":"2003","journal-title":"Value Health"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zou, Q., Qu, K., Luo, Y., Yin, D., Ju, Y., and Tang, H. (2018). Predicting Diabetes Mellitus with Machine Learning Techniques. Front. Genet., 9.","DOI":"10.3389\/fgene.2018.00515"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lai, H., Huang, H., Keshavjee, K., Guergachi, A., and Gao, X. (2019). Predictive models for diabetes mellitus using machine learning techniques. BMC Endocr. 
Disord., 19.","DOI":"10.1186\/s12902-019-0436-6"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1414","DOI":"10.2337\/diacare.21.9.1414","article-title":"Global Burden of Diabetes, 1995\u20132025: Prevalence, numerical estimates, and projections","volume":"21","author":"King","year":"1998","journal-title":"Diabetes Care"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.2337\/diacare.27.5.1047","article-title":"Global Prevalence of Diabetes","volume":"27","author":"Wild","year":"2004","journal-title":"Diabetes Care"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.diabres.2009.10.007","article-title":"Global estimates of the prevalence of diabetes for 2010 and 2030","volume":"87","author":"Shaw","year":"2010","journal-title":"Diabetes Res. Clin. Pract."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2079","DOI":"10.1007\/s13300-019-00684-1","article-title":"Forecasting the prevalence of diabetes mellitus using econometric models","volume":"10","author":"Mukasheva","year":"2019","journal-title":"Diabetes Ther."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Islam, M.S., Qaraqe, M.K., and Belhaouari, S.B. (2020, January 1\u20134). Early Prediction of Hemoglobin Alc: A novel Framework for better Diabetes Management. Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia.","DOI":"10.1109\/SSCI47803.2020.9308539"},{"key":"ref_15","first-page":"3966","article-title":"A comparative analysis on the evaluation of classification algorithms in the prediction of diabetes","volume":"8","author":"Patil","year":"2018","journal-title":"Int. J. Electr. Comput. Eng."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Faruque, M.F., and Sarker, I.H. (2019, January 7\u20139). Performance analysis of machine learning techniques to predict diabetes mellitus. 
Proceedings of the 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE), Cox\u2019s Bazar, Bangladesh.","DOI":"10.1109\/ECACE.2019.8679365"},{"key":"ref_17","first-page":"4151","article-title":"A comparative analysis and risk prediction of diabetes at early stage using machine learning approach","volume":"13","author":"Oleiwi","year":"2020","journal-title":"Int. J. Futur. Gener. Commun. Netw."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Abdulhadi, N., and Al-Mousa, A. (2021, January 14\u201315). Diabetes Detection Using Machine Learning Classification Methods. Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman, Jordan.","DOI":"10.1109\/ICIT52682.2021.9491788"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1177\/1932296817706375","article-title":"Machine learning methods to predict diabetes complications","volume":"12","author":"Dagliati","year":"2018","journal-title":"J. Diabetes Sci. Technol."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Kantawong, K., Tongphet, S., Bhrommalee, P., Rachata, N., and Pravesjit, S. (2020, January 11\u201314). The Methodology for Diabetes Complications Prediction Model. Proceedings of the 2020 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), Pattaya, Thailand.","DOI":"10.1109\/ECTIDAMTNCON48261.2020.9090700"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ryan, T.P. (2008). Modern Regression Methods, Wiley.","DOI":"10.1002\/9780470382806"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Awad, M., and Khanna, R. (2015). 
Support Vector Regression BT\u2014Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers, Apress.","DOI":"10.1007\/978-1-4302-5990-9"},{"key":"ref_23","unstructured":"Koehrsen, W. (2024, December 29). Bayesian Linear Regression in Python: Using Machine Learning to Predict Student Grades Part 2. Towards Data Science. Available online: https:\/\/towardsdatascience.com\/bayesian-linear-regression-in-python-using-machine-learning-to-predict-student-grades-part-2-b72059a8ac7e%0D."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1016\/j.cmpb.2013.04.018","article-title":"The application of support vector regression for prediction of the antiallodynic effect of drug combinations in the mouse model of streptozocin-induced diabetic neuropathy","volume":"111","year":"2013","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_25","unstructured":"Rutkowski, L. (2006). Flexible Neuro-Fuzzy Systems: Structures, Learning and Performance Evaluation, Springer."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Salleh, M.N.M., Talpur, N., and Hussain, K. (August, January 27). Adaptive Neuro-Fuzzy Inference System: Overview, Strengths, Limitations, and Solutions. Proceedings of the Data Mining and Big Data: Second International Conference, DMBD 2017, Fukuoka, Japan.","DOI":"10.1007\/978-3-319-61845-6_52"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Suparta, W., and Alhasa, K.M. (2015). Modeling of Tropospheric Delays Using ANFIS, Springer International Publishing.","DOI":"10.1007\/978-3-319-28437-8"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Lavrakas, P.J. (2008). Encyclopedia of Survey Research Methods, SAGE Publications.","DOI":"10.4135\/9781412963947"},{"key":"ref_29","unstructured":"Bruce, P., and Bruce, A. (2017). 
Practical Statistics for Data Scientists: 50 Essential Concepts, O\u2019Reilly Media."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Swamidass, P.M. (2000). Mean Absolute Percentage Error (MAPE). Encyclopedia of Production and Manufacturing Management, Springer.","DOI":"10.1007\/1-4020-0612-8_580"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1007\/s12546-011-9054-5","article-title":"MAPE-R: A rescaled measure of accuracy for cross-sectional subnational population forecasts","volume":"28","author":"Swanson","year":"2011","journal-title":"J. Popul. Res."},{"key":"ref_32","unstructured":"Dodge, Y. (2008). The Concise Encyclopedia of Statistics, Springer."},{"key":"ref_33","unstructured":"(2024, December 29). Saudi Health Interview Survey Results. 2013. [Online]. Available online: https:\/\/www.healthdata.org\/sites\/default\/files\/files\/Projects\/KSA\/Saudi-Health-Interview-Survey-Results.pdf."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.26719\/1999.5.6.1236","article-title":"Diabetes Mellitus, Hypertension and Obesity\u2014Common Multifactorial Disorders in Saudis","volume":"5","author":"Warsy","year":"1999","journal-title":"East. Mediterr. Health J."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1136\/tc.8.1.53","article-title":"Prevalence and determinants of smoking in three regions of Saudi Arabia","volume":"8","author":"Jarallah","year":"1999","journal-title":"Tob. Control"},{"key":"ref_36","first-page":"824","article-title":"Obesity in Saudi Arabia","volume":"26","author":"Arafah","year":"2005","journal-title":"Saudi Med. J."},{"key":"ref_37","unstructured":"WHO (2005). 
STEPwise Approach to NCD Surveillance, Country-Specific Standard Report, Saudi Arabia, World Health Organization."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1248","DOI":"10.1016\/j.dsp.2009.10.021","article-title":"An intelligent diagnosis system for diabetes on Linear Discriminant Analysis and Adaptive Network Based Fuzzy Inference System: LDA-ANFIS","volume":"20","author":"Dogantekin","year":"2010","journal-title":"Digit. Signal Process."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"8610","DOI":"10.1016\/j.eswa.2008.10.032","article-title":"A comparative study on diabetes disease diagnosis using neural networks","volume":"36","author":"Temurtas","year":"2009","journal-title":"Expert Syst. Appl."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"9316","DOI":"10.1109\/JIOT.2019.2926321","article-title":"Dynamic Adaptive Network-Based Fuzzy Inference System (D-ANFIS) for the Imputation of Missing Data for Internet of Medical Things Applications","volume":"6","author":"Turabieh","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"8981","DOI":"10.1007\/s00521-020-05661-5","article-title":"Co-active neuro-fuzzy inference system model as single imputation approach for non-monotone pattern of missing data","volume":"33","year":"2021","journal-title":"Neural Comput. 
Appl."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/3\/145\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:47:19Z","timestamp":1760028439000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/3\/145"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,5]]},"references-count":41,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["a18030145"],"URL":"https:\/\/doi.org\/10.3390\/a18030145","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,5]]}}}