{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:27:18Z","timestamp":1753885638734,"version":"3.41.2"},"reference-count":17,"publisher":"World Scientific Pub Co Pte Ltd","issue":"03","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Model. Simul. Sci. Comput."],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:p> Diabetes is a chronic disease characterized by a high level of glucose in the body. According to the World Health Organization (WHO), 422 million people were diabetic as of 2014. This paper develops an accurate machine learning classification model and an efficient data preprocessing pipeline to improve overall accuracy. For this purpose, six algorithms are used for classification and their accuracy is compared: Support Vector Machine with Linear kernel (Linear-SVM), Support Vector Machine with RBF kernel (RBF-SVM), K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), Decision Tree and Random Forest. The data preprocessing pipeline consists of data imputation, oversampling and feature scaling. Experiments are performed on the PIMA diabetes dataset, a well-known dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The data preprocessing techniques, data imputation and Synthetic Minority Oversampling Technique (SMOTE) analysis, improved classification accuracy from 77% on raw data to 88.12% (with the Random Forest classifier) and 91% (with the ANN classifier), respectively. Furthermore, a new feature generation approach is applied and its performance is analyzed using the SVM model. The original data attributes BMI and Insulin are replaced with the new features BMI_NORMAL and INSULIN_NORMAL, respectively. The significant improvement achieved by the proposed technique is confirmed by statistical testing followed by post-hoc analysis. 
<\/jats:p>","DOI":"10.1142\/s1793962323500101","type":"journal-article","created":{"date-parts":[[2022,5,8]],"date-time":"2022-05-08T05:46:21Z","timestamp":1651988781000},"source":"Crossref","is-referenced-by-count":1,"title":["Diabetes mellitus prediction: An efficient pipeline of data imputation and oversampling"],"prefix":"10.1142","volume":"14","author":[{"given":"Neha","family":"Rajawat","sequence":"first","affiliation":[{"name":"Department of Mathematics, Career Point University, Kota, India"}]},{"given":"Bharat Singh","family":"Hada","sequence":"additional","affiliation":[{"name":"Samsung R & D Institute, Noida, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5757-9744","authenticated-orcid":false,"given":"Soniya","family":"Lalwani","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Bal Krishna Institute of Technology, Kota, India"}]},{"given":"Rajesh","family":"Kumar","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, Malaviya National Institute of Technology, Jaipur, India"}]}],"member":"219","published-online":{"date-parts":[[2022,6,10]]},"reference":[{"key":"S1793962323500101BIB002","first-page":"1","volume":"124","author":"Jerjawi E.","year":"2018","journal-title":"Int. J. Adv. Sci. Technol."},{"issue":"4","key":"S1793962323500101BIB003","doi-asserted-by":"crossref","first-page":"8610","DOI":"10.1016\/j.eswa.2008.10.032","volume":"36","author":"Hasan T.","year":"2009","journal-title":"Expert Syst. Appl."},{"key":"S1793962323500101BIB004","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1007\/978-981-13-1642-5_59","volume-title":"Engineering Vibration, Communication and Information Processing","author":"Srivastava S.","year":"2019"},{"issue":"3","key":"S1793962323500101BIB005","first-page":"396","volume":"2","author":"Angeline C. Y.","year":"2013","journal-title":"Int. J. Eng. Adv. 
Technol."},{"issue":"6","key":"S1793962323500101BIB006","first-page":"4038","volume":"13","author":"Kadhm M. S.","year":"2018","journal-title":"Int. J. Appl. Eng. Res."},{"issue":"1","key":"S1793962323500101BIB007","first-page":"1805","volume":"9","author":"Kishore N. G.","year":"2020","journal-title":"Int. J. Sci. Technol. Res."},{"issue":"5","key":"S1793962323500101BIB008","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.2337\/diabetes.53.5.1181","volume":"53","author":"Leslie B. J.","year":"2004","journal-title":"Diabetes"},{"key":"S1793962323500101BIB009","first-page":"45","volume-title":"Machine Learning for Evolution Strategies","author":"Oliver K.","year":"2016"},{"key":"S1793962323500101BIB010","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1186\/1471-2105-14-106","volume":"14","author":"Blagus R.","year":"2013","journal-title":"BMC Bioinf."},{"key":"S1793962323500101BIB011","doi-asserted-by":"crossref","first-page":"1578","DOI":"10.1016\/j.procs.2018.05.122","volume":"132","author":"Sisodia D.","year":"2018","journal-title":"Procedia Comput. Sci."},{"issue":"7","key":"S1793962323500101BIB012","doi-asserted-by":"crossref","first-page":"e0201052","DOI":"10.1371\/journal.pone.0201052","volume":"13","author":"Okura T.","year":"2018","journal-title":"PLOS ONE"},{"issue":"4","key":"S1793962323500101BIB013","first-page":"36","volume":"2","author":"Saxena K.","year":"2014","journal-title":"Int. J. Comput. Sci. Trends Technol."},{"key":"S1793962323500101BIB014","first-page":"8887","volume":"975","author":"Aravind C.","year":"2013","journal-title":"Int. J. Comput. Appl."},{"issue":"2","key":"S1793962323500101BIB015","first-page":"1797","volume":"3","author":"Kumari V. A.","year":"2013","journal-title":"Int. J. Eng. Res. Appl."},{"issue":"11","key":"S1793962323500101BIB016","first-page":"680","volume":"3","author":"Yuvarani S.","year":"2016","journal-title":"Int. Res. J. Eng. 
Technol."},{"issue":"4","key":"S1793962323500101BIB017","first-page":"193","volume":"4","author":"Evangeline A. B.","year":"2015","journal-title":"Int. J. Sci. Technol. Manage."},{"issue":"1","key":"S1793962323500101BIB018","first-page":"66","volume":"7","author":"Sareh M.","year":"2019","journal-title":"Diabetes J. Res. Med. Dent. Sci."}],"container-title":["International Journal of Modeling, Simulation, and Scientific Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793962323500101","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,7]],"date-time":"2023-08-07T06:50:04Z","timestamp":1691391004000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793962323500101"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,10]]},"references-count":17,"journal-issue":{"issue":"03","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["10.1142\/S1793962323500101"],"URL":"https:\/\/doi.org\/10.1142\/s1793962323500101","relation":{},"ISSN":["1793-9623","1793-9615"],"issn-type":[{"type":"print","value":"1793-9623"},{"type":"electronic","value":"1793-9615"}],"subject":[],"published":{"date-parts":[[2022,6,10]]},"article-number":"2350010"}}