{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T20:16:56Z","timestamp":1762460216246,"version":"build-2065373602"},"reference-count":59,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2025,5,30]],"date-time":"2025-05-30T00:00:00Z","timestamp":1748563200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Program Management Unit for Human Resources & Institutional Development, Research and Innovation (PMU-B)","award":["B04G640071","BG-RRP-2.013-0001"],"award-info":[{"award-number":["B04G640071","BG-RRP-2.013-0001"]}]},{"name":"European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria","award":["B04G640071","BG-RRP-2.013-0001"],"award-info":[{"award-number":["B04G640071","BG-RRP-2.013-0001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>The early detection of dementia, a condition affecting both individuals and society, is essential for its effective management. However, reliance on advanced laboratory tests and specialized expertise limits accessibility, hindering timely diagnosis. To address this challenge, this study proposes a novel approach in which readily available biochemical and physiological features from electronic health records are employed to develop a machine learning-based binary classification model, improving accessibility and early detection. A dataset of 14,763 records from Phachanukroh Hospital, Chiang Rai, Thailand, was used for model construction. The use of a hybrid data enrichment framework involving feature augmentation and data balancing was proposed in order to increase the dimensionality of the data. 
Medical domain knowledge was used to generate inter-relation-based features (IRFs), which improve data diversity and promote explainability by making the features more informative. For data balancing, the K-Means Synthetic Minority Oversampling Technique (K-Means SMOTE) was applied to generate synthetic samples in under-represented regions of the feature space, addressing class imbalance. Extra Trees (ET) was used for model construction due to its noise resilience and ability to manage multicollinearity. The performance of the proposed method was compared with that of Support Vector Machine, K-Nearest Neighbors, Artificial Neural Networks, Random Forest, and Gradient Boosting. The results reveal that the ET model significantly outperformed other models on the combined dataset with four IRFs and K-Means SMOTE across key metrics, including accuracy (96.47%), precision (94.79%), recall (97.86%), F1 score (96.30%), and area under the receiver operating characteristic curve (99.51%).<\/jats:p>","DOI":"10.3390\/bdcc9060148","type":"journal-article","created":{"date-parts":[[2025,5,30]],"date-time":"2025-05-30T06:14:27Z","timestamp":1748585667000},"page":"148","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Improving Early Detection of Dementia: Extra Trees-Based Classification Model Using Inter-Relation-Based Features and K-Means Synthetic Minority Oversampling Technique"],"prefix":"10.3390","volume":"9","author":[{"given":"Yanawut","family":"Chaiyo","sequence":"first","affiliation":[{"name":"Computer and Communication Engineering for Capacity Building Research Center, School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai 57100, Thailand"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8749-1396","authenticated-orcid":false,"given":"Worasak","family":"Rueangsirarak","sequence":"additional","affiliation":[{"name":"Computer and Communication Engineering for Capacity Building 
Research Center, School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai 57100, Thailand"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3216-3870","authenticated-orcid":false,"given":"Georgi","family":"Hristov","sequence":"additional","affiliation":[{"name":"Telecommunications Department, University of Ruse, 7017 Ruse, Bulgaria"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9847-157X","authenticated-orcid":false,"given":"Punnarumol","family":"Temdee","sequence":"additional","affiliation":[{"name":"Computer and Communication Engineering for Capacity Building Research Center, School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai 57100, Thailand"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Castellazzi, G., Cuzzoni, M.G., Cotta Ramusino, M., Martinelli, D., Denaro, F., Ricciardi, A., Vitali, P., Anzalone, N., Bernini, S., and Palesi, F. (2020). A machine learning approach for the differential diagnosis of Alzheimer and vascular dementia fed by MRI selected features. Front. Neuroinform., 14.","DOI":"10.3389\/fninf.2020.00025"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1002\/alz.12694","article-title":"Global estimates on the number of persons across the Alzheimer\u2019s disease continuum","volume":"19","author":"Gustavsson","year":"2023","journal-title":"Alzheimer\u2019s Dement."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"e105","DOI":"10.1016\/S2468-2667(21)00249-8","article-title":"Estimation of the global prevalence of dementia in 2019 and forecasted prevalence in 2050: An analysis for the Global Burden of Disease Study 2019","volume":"7","author":"Nichols","year":"2022","journal-title":"Lancet Public Health"},{"key":"ref_4","unstructured":"World Health Organization (2019). 
Risk Reduction of Cognitive Decline and Dementia: WHO Guidelines, World Health Organization."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"277","DOI":"10.3233\/JAD-240163","article-title":"Societal costs of dementia: 204 countries, 2000\u20132019","volume":"101","author":"Lastuka","year":"2024","journal-title":"J. Alzheimer\u2019s Dis."},{"key":"ref_6","unstructured":"Muangpaisan, W. (2013). Dementia: Prevention, Assessment and Care, Parbpim."},{"key":"ref_7","first-page":"96","article-title":"A model of dementia prevention in older adults at Taling Chan District Bangkok Metropolis","volume":"19","author":"Thongwachira","year":"2019","journal-title":"KKU Res. J."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"G\u00f3mez, C., Vaquerizo-Villar, F., Poza, J., Ruiz, S.J., Tola-Arribas, M.A., Cano, M., and Hornero, R. (2017, January 11\u201315). Bispectral analysis of spontaneous EEG activity from patients with moderate dementia due to Alzheimer\u2019s disease. Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Republic of Korea.","DOI":"10.1109\/EMBC.2017.8036852"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1097\/00004691-200101000-00010","article-title":"Nonlinear dynamic analysis of the EEG in patients with Alzheimer\u2019s disease and vascular dementia","volume":"18","author":"Jeong","year":"2001","journal-title":"J. Clin. Neurophysiol."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Nancy, A., Balamurugan, M., and Vijaykumar, S. (2017, January 6\u20137). A brain EEG classification system for the mild cognitive impairment analysis. 
Proceedings of the 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India.","DOI":"10.1109\/ICACCS.2017.8014655"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"2058","DOI":"10.1016\/j.clinph.2017.06.251","article-title":"Feature selection before EEG classification supports the diagnosis of Alzheimer\u2019s disease","volume":"128","author":"Trambaiolli","year":"2017","journal-title":"Clin. Neurophysiol."},{"key":"ref_12","unstructured":"Rodrigues, P.M., Bispo, B.C., Freitas, D.R., Teixeira, J.P., and Carreres, A. (2013, January 9\u201313). Evaluation of EEG spectral features in Alzheimer disease discrimination. Proceedings of the 21st European Signal Processing Conference (EUSIPCO 2013), Marrakech, Morocco."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/0013-4694(94)90033-7","article-title":"EEG-based, neural-net predictive classification of Alzheimer\u2019s disease versus control subjects is augmented by non-linear EEG measures","volume":"91","author":"Pritchard","year":"1994","journal-title":"Electroencephalogr. Clin. Neurophysiol."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on image data augmentation for deep learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ullah, H.T., Onik, Z., Islam, R., and Nandi, D. (2018, January 6\u20138). Alzheimer\u2019s disease and dementia detection from 3D brain MRI data using deep convolutional neural networks. 
Proceedings of the 3rd International Conference for Convergence in Technology (I2CT 2018), Pune, India.","DOI":"10.1109\/I2CT.2018.8529808"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/IJSESD.313966","article-title":"Comparative analysis of artificial neural networks and deep neural networks for detection of dementia","volume":"13","author":"Bansal","year":"2022","journal-title":"Int. J. Soc. Ecol. Sustain. Dev."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"7","DOI":"10.53409\/mnaa\/jcsit\/2102","article-title":"An analysis of deep learning techniques in neuroimaging","volume":"2","author":"Narmatha","year":"2021","journal-title":"J. Comput. Sci. Intell. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Vardhini, K.V., Vishnumolakala, L.D., Palanki, S.U.A., Yarramsetty, M., and Raja, G. (2024, January 7\u20139). Alzheimer\u2019s Research and Early Diagnosis Through Improved Deep Learning Models. Proceedings of the 2024 5th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India.","DOI":"10.1109\/ICESC60852.2024.10689869"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Almubark, I., Alsegehy, S., Jiang, X., and Chang, L.C. (2020, January 17\u201319). Early detection of mild cognitive impairment using neuropsychological data and machine learning techniques. Proceedings of the 2020 IEEE Conference on Big Data and Analytics (ICBDA), Kota Kinabalu, Malaysia.","DOI":"10.1109\/ICBDA50157.2020.9289741"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Javeed, A., Dallora, A.L., Berglund, J.S., Idrisoglu, A., Ali, L., Rauf, H.T., and Anderberg, P. (2023). Early prediction of dementia using feature extraction battery (FEB) and optimized support vector machine (SVM) for classification. 
Biomedicines, 11.","DOI":"10.3390\/biomedicines11020439"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"48677","DOI":"10.1109\/ACCESS.2023.3276468","article-title":"Gradient boosting-based model for elderly heart failure, aortic stenosis, and dementia classification","volume":"11","author":"Yongcharoenchaiyasit","year":"2023","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Mirzaei, G., and Adeli, H. (2022). Machine learning techniques for diagnosis of Alzheimer\u2019s disease, mild cognitive disorder, and other types of dementia. Biomed. Signal Process. Control., 72.","DOI":"10.1016\/j.bspc.2021.103293"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Mohammed, B.A., Senan, E.M., Rassem, T.H., Makbol, N.M., Alanazi, A.A., Al-Mekhlafi, Z.G., Almurayziq, T.S., and Ghaleb, F.A. (2021). Multi-method analysis of medical records and MRI images for early diagnosis of dementia and Alzheimer\u2019s disease based on deep learning and hybrid methods. Electronics, 10.","DOI":"10.3390\/electronics10222860"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Cura, O.K., Yilmaz, G.C., Ture, H.S., and Akan, A. (November, January 31). Deep time-frequency feature extraction for Alzheimer\u2019s dementia EEG classification. Proceedings of the 2022 Medical Technologies Congress (TIPTEKNO), Antalya, Turkey.","DOI":"10.1109\/TIPTEKNO56568.2022.9960155"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hanai, S., Kato, S., Sakuma, T., Ohdake, R., Masuda, M., and Watanabe, H. (2022, January 7\u20139). A dementia classification based on speech analysis of casual talk during a clinical interview. 
Proceedings of the 2022 IEEE 4th Global Conference on Life Sciences and Technologies (LifeTech), Osaka, Japan.","DOI":"10.1109\/LifeTech53646.2022.9754933"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"100258","DOI":"10.1016\/j.array.2022.100258","article-title":"Data augmentation: A comprehensive survey of modern approaches","volume":"16","author":"Mumuni","year":"2022","journal-title":"Array"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Jha, A., John, E., and Banerjee, T. (2022, January 26\u201329). Multi-class classification of dementia from MRI images using transfer learning. Proceedings of the 2022 IEEE 13th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA.","DOI":"10.1109\/UEMCON54665.2022.9965672"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Reddy, T.S., Saikiran, V., Samhitha, S., Moin, S., Kumar, T.P., and Charan, V.S. (2023, January 6\u20138). Early detection of Alzheimer\u2019s disease using data augmentation and CNN. Proceedings of the 2023 4th IEEE Global Conference for Advancement in Technology (GCAT), Bangalore, India.","DOI":"10.1109\/GCAT59970.2023.10353397"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Samanta, S., Mazumder, I., and Roy, C. (2023, January 5\u20136). Deep learning-based early detection of Alzheimer\u2019s disease using image enhancement filters. Proceedings of the 2023 Third International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), Bhilai, India.","DOI":"10.1109\/ICAECT57570.2023.10117880"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"105180","DOI":"10.1016\/j.tust.2023.105180","article-title":"Application of KM-SMOTE for rockburst intelligent prediction","volume":"138","author":"Liu","year":"2023","journal-title":"Tunn. Undergr. 
Space Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"89","DOI":"10.14710\/jtsiskom.8.2.2020.89-93","article-title":"K-means-SMOTE untuk menangani ketidakseimbangan kelas dalam klasifikasi penyakit diabetes dengan C4.5, SVM, dan naive Bayes","volume":"8","author":"Hairani","year":"2020","journal-title":"J. Teknol. Dan Sist. Komput."},{"key":"ref_32","unstructured":"He, H., Bai, Y., Garcia, E.A., and Li, S. (2008, January 1\u20138). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong."},{"key":"ref_33","first-page":"13","article-title":"Diagnosis of Parkinson disease using handwriting analysis","volume":"184","author":"Ranjan","year":"2022","journal-title":"Int. J. Comput. Appl."},{"key":"ref_34","first-page":"214","article-title":"A novel approach to detection of Alzheimer\u2019s disease from handwriting: Triple ensemble learning model","volume":"12","year":"2024","journal-title":"Gazi Univ. J. Sci. Part C Des. Technol."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1109\/TCBB.2022.3201295","article-title":"EnsDeepDP: An ensemble deep learning approach for disease prediction through metagenomics","volume":"20","author":"Shen","year":"2022","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Goel, A., Lal, M., and Javadekar, A.N. (2023, January 20\u201321). Comparative analysis of the machine and deep learning classifier for dementia prediction. 
Proceedings of the 2023 Advanced Computing and Communication Technologies for High Performance Applications (ACCTHPA), Ernakulam, India.","DOI":"10.1109\/ACCTHPA57160.2023.10083361"},{"key":"ref_37","first-page":"51","article-title":"Cardiovascular Disease Prediction System Using Extra Trees Classifier","volume":"11","author":"Rahman","year":"2019","journal-title":"Res. Sq."},{"key":"ref_38","unstructured":"Bhargav, S., Kaushik, S., and Dutt, V. (2021). A Combination of Decision Trees with Machine Learning Ensembles for Blood Glucose Level Predictions, Springer."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Hancz\u00e1r, G., Stippinger, M., Han\u00e1k, D., Kurbucz, M.T., T\u00f6rteli, O.M., Chripk\u00f3, \u00c1., and Somogyv\u00e1ri, Z. (2023). Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS). arXiv.","DOI":"10.1088\/2632-2153\/ad020e"},{"key":"ref_40","unstructured":"Zhou, Z.-H., and Feng, J. (2017, January 19\u201325). Deep forest: Towards an alternative to deep neural networks. Proceedings of the IJCAI\u201917: 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, A., Garc\u00eda, S., Galar, M., Prati, R.C., Krawczyk, B., and Herrera, F. (2018). 
Learning from Imbalanced Data Streams, Springer.","DOI":"10.1007\/978-3-319-98074-4"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"143250","DOI":"10.1109\/ACCESS.2021.3120738","article-title":"Noise avoidance SMOTE in ensemble learning for imbalanced data","volume":"9","author":"Kim","year":"2021","journal-title":"IEEE Access"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"101694","DOI":"10.1016\/j.media.2020.101694","article-title":"Convolutional neural networks for classification of Alzheimer\u2019s disease: Overview and reproducible evaluation","volume":"63","author":"Wen","year":"2020","journal-title":"Med. Image Anal."},{"key":"ref_44","first-page":"3133","article-title":"Do we need hundreds of classifiers to solve real world classification problems?","volume":"15","author":"Cernadas","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.csbj.2014.11.005","article-title":"Machine learning applications in cancer prognosis and prediction","volume":"13","author":"Kourou","year":"2015","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"ref_46","unstructured":"Wang, X., Yu, H., Zhang, Y., and Yu, Y. (2021). An ensemble learning framework for early detection of Alzheimer\u2019s disease using multiple biomarkers. Comput. Biol. Med., 133."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"2409","DOI":"10.1109\/JBHI.2021.3059563","article-title":"Efficient and explainable risk assessments for imminent dementia in an aging cohort study","volume":"25","author":"Okeson","year":"2021","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_48","unstructured":"Pujianto, U., Wibawa, A.P., and Akbar, M.I. (2019, January 10\u201311). K-nearest neighbor (K-NN) based missing data imputation. 
Proceedings of the 2019 5th International Conference on Science in Information Technology (ICSITech), Yogyakarta, Indonesia."},{"key":"ref_49","unstructured":"Jameson, J.L., Fauci, A.S., Kasper, D.L., Hauser, S.L., Longo, D.L., and Loscalzo, J. (2022). Harrison\u2019s Principles of Internal Medicine, McGraw Hill. [21st ed.]."},{"key":"ref_50","unstructured":"Skerrett, P.J. (2014). Lipid Disorders: Diagnosis and Treatment, Harvard Health Publications."},{"key":"ref_51","unstructured":"Bishop, M.L. (2023). Clinical Chemistry: Principles, Techniques, and Correlations, Enhanced Edition, Jones & Bartlett Learning."},{"key":"ref_52","unstructured":"Guyton, A.C., and Hall, J.E. (2021). Guyton and Hall Textbook of Medical Physiology, Elsevier. [13th ed.]."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1565","DOI":"10.1038\/nbt1206-1565","article-title":"What is a support vector machine?","volume":"24","author":"Noble","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"ref_54","first-page":"14","article-title":"Detection of chronic kidney disease using machine learning algorithms with least number of predictors","volume":"10","author":"Almasoud","year":"2019","journal-title":"Int. J. Soft Comput. Its Appl."},{"key":"ref_55","first-page":"135","article-title":"Heart disease as a risk factor for dementia","volume":"5","author":"Justin","year":"2013","journal-title":"Clin. Epidemiol."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Kramer, O. (2013). Dimensionality Reduction with Unsupervised Nearest Neighbors, Springer.","DOI":"10.1007\/978-3-642-38652-7"},{"key":"ref_57","unstructured":"Zhang, Z. (2016). Generalized Linear Models: Modern Methods and Applications, Springer."},{"key":"ref_58","unstructured":"Kunapuli, G. (2023). 
Ensemble Methods for Machine Learning, Simon and Schuster."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1500000066","article-title":"Explainable recommendation: A survey and new perspectives","volume":"14","author":"Zhang","year":"2020","journal-title":"Found. Trends\u00ae Inf. Retr."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/6\/148\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:43:50Z","timestamp":1760031830000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/6\/148"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,30]]},"references-count":59,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["bdcc9060148"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9060148","relation":{},"ISSN":["2504-2289"],"issn-type":[{"type":"electronic","value":"2504-2289"}],"subject":[],"published":{"date-parts":[[2025,5,30]]}}}