{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T00:46:07Z","timestamp":1770425167636,"version":"3.49.0"},"reference-count":30,"publisher":"Walter de Gruyter GmbH","issue":"1","license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,10,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Handwritten digit recognition (HDR) remains challenging due to variations in writing styles. To address this challenge, this study comprehensively compares machine learning (ML) and deep learning (DL) models. We explored a variety of approaches. We evaluated these models on the Modified National Institute of Standards and Technology (MNIST) and Extended Modified National Institute of Standards and Technology (EMNIST) datasets to assess their generalization capabilities. Initially, we investigated standalone ML and DL models trained from scratch to learn features directly. Logistic regression (LR) achieved an accuracy of 92.5% on MNIST and 86.63% on EMNIST. A multi-layer perceptron demonstrated improved performance with 98.10% accuracy on MNIST. Convolutional neural networks exhibited superior performance, reaching 99.90% accuracy on MNIST and 99.57% on EMNIST. To further enhance performance, we explored ensemble learning techniques, combining CNNs with RF (98.20 and 99.86% accuracy on MNIST and EMNIST, respectively), LR (88.67 and 99.79% accuracy on MNIST and EMNIST, respectively), and VC (99.27 and 99.83% accuracy on MNIST and EMNIST, respectively). We then introduced a ViT model, leveraging self-attention for long-range dependency modeling, achieving an accuracy of 98.70% on MNIST and 99.58% on EMNIST. 
Finally, we proposed a hybrid model combining CNN and ViT that yielded the highest accuracy of 99.97% on MNIST and 98.26% on EMNIST. Throughout our experimentation, we employed various techniques such as regularization, weight initialization, and optimization strategies to improve model performance. The impact of each technique is analyzed and discussed. Overall, this study provides a comprehensive comparison of different HDR models, highlighting each approach\u2019s strengths and weaknesses. The results demonstrate the effectiveness of DL models, particularly CNNs and hybrid architectures, in achieving high accuracy in HDR.<\/jats:p>","DOI":"10.1515\/jisys-2024-0411","type":"journal-article","created":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T08:56:18Z","timestamp":1759568178000},"source":"Crossref","is-referenced-by-count":2,"title":["Handwritten digit recognition: Comparative analysis of ML, CNN, vision transformer, and hybrid models on the MNIST dataset"],"prefix":"10.1515","volume":"34","author":[{"given":"Dhouha","family":"Ben Noureddine","sequence":"first","affiliation":[{"name":"College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU) , Riyadh , 13318 , Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"374","published-online":{"date-parts":[[2025,10,4]]},"reference":[{"key":"2025122009032292893_j_jisys-2024-0411_ref_001","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16\u00d716 words: transformers for image recognition at scale, in: International Conference on Learning Representations. 2021. p. 293\u201397. 10.48550\/arXiv.2010.11929."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_002","unstructured":"Reddy BP, Reddy RS, Vasem PS, Venkatesh P, Rajashekaran S. Handwritten digit recognition using SVM algorithm in machine learning. Int J Creative Res Thoughts. 2022;10(6). Accessed 30 October 2024. [Online]. 
Available: https:\/\/ijcrt.org\/papers\/IJCRT22A6520.pdf."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_003","doi-asserted-by":"crossref","unstructured":"Assegie T, Nair P. Handwritten digits recognition with decision tree classification: A machine learning approach. Int J Electr Comput Eng. 2019;9(5):4446\u201351. 10.11591\/ijece.v9i5.pp4446-4451.","DOI":"10.11591\/ijece.v9i5.pp4446-4451"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_004","doi-asserted-by":"crossref","unstructured":"Wang Y, Wang R, Li D, Adu-Gyamfi D. Improved handwritten digit recognition using quantum k-nearest neighbor algorithm. Int J Theoret Phys. 2019;58(7):2331\u201340. 10.1007\/s10773-019-04124-5.","DOI":"10.1007\/s10773-019-04124-5"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_005","doi-asserted-by":"crossref","unstructured":"Sheikh R, Patel M. Handwritten digit recognition using different dimensionality reduction techniques. Int J Recent Tech Eng. 2019;8(2):999\u20131002. 10.35940\/ijrte.B1798.078219.","DOI":"10.35940\/ijrte.B1798.078219"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_006","unstructured":"Monica RF, Lavanya K.  Handwritten digit recognition of mnist data using consensus clustering. Int J Recent Tech Eng. 2019;7(6):1969\u201373. Accessed 21 June 2024. [Online]. Available: https:\/\/www.ijrte.org\/wp-content\/uploads\/papers\/v7i6\/F2408037619.pdf."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_007","unstructured":"Assiri S. A simple CNN model for MNIST handwritten digits classification. Int J Adv Comput Sci Inform Tech. 2020;11(5):1517\u201324."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_008","unstructured":"Hirata Y, Fujiyoshi H. EnsNet: An ensemble of convolutional neural networks for robust digit recognition. 2021. Accessed 10 July 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/2008.10400."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_009","unstructured":"An H, Sun C, Wang X. 
Deep ensemble convolutional neural network for digit recognition, in 15th International Conference on Computer Science and Information Technology (ICCSIT). 2018. Vol. 10. p. 309\u201313."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_010","unstructured":"Byerly A, Doll\u00e1r P, Goodman N, Srinivasan P, Zitnick CL, LaTeX: Capsule networks with homogeneous vector capsules, in Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2020. p. 6012\u201321."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_011","doi-asserted-by":"crossref","unstructured":"Ahmed SS, Mehmood Z, Awan IA, Yousaf RM. A novel technique for handwritten digit recognition using deep learning. J Sensors. 2023;2023:2753941. 10.1155\/2023\/2753941.","DOI":"10.1155\/2023\/2753941"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_012","doi-asserted-by":"crossref","unstructured":"Bhojanapalli S, Chakrabarti A, Glasner D, et al. Understanding robustness of transformers for image classification, in IEEE\/CVF International Conference on Computer Vision (ICCV). 2021. p. 211\u201321. 10.1109\/ICCV48922.2021.01007.","DOI":"10.1109\/ICCV48922.2021.01007"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_013","unstructured":"Naseer M, Ranasinghe K, Khan SH, Hayat M, Shahbaz Khan F, Yang MH. Intriguing properties of vision transformers. Adv Neural Inform Proces Syst. 2021;34:23296\u2013308. 10.1109\/ICCV48922.2021.01007."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_014","unstructured":"Paul S, Chen P. Vision transformers are robust learners. 2021. Accessed 18 July 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/2008.10400."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_015","doi-asserted-by":"crossref","unstructured":"LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278\u2013324. 
10.1109\/5.726791.","DOI":"10.1109\/5.726791"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_016","unstructured":"Chen C, Fan Q, Panda R. Crossvit: cross-attention multi-scale vision transformer for image classification. 2021. Accessed 04 August 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/2103.14899."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_017","unstructured":"Wang J, Gao Y, Shi J, Liu Z. Optical remote sensing scene classification based on vision transformer and graph convolutional network. Acta Photon. Sin. 2021;50(11)."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_018","unstructured":"Gheflati B, Hassan R. Vision transformer for classification of breast ultrasound images. 2021. Accessed 11 August 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/2110.14731."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_019","unstructured":"Wu H, Xiao B, Noel C, Liu M, Dai X, Yuan C, et al. Cvt: introducing convolutions to vision transformers. 2021. Accessed 24 October 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/2103.15808."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_020","doi-asserted-by":"crossref","unstructured":"Agrawal V, Jayant J. Convolutional vision transformer for handwritten digit recognition. Durham, NC, USA: Research square Company; 2022. 10.21203\/rs.3.rs-1560520\/v1.","DOI":"10.21203\/rs.3.rs-1984839\/v1"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_021","doi-asserted-by":"crossref","unstructured":"Agrawal V, Jagtap J, Patil S, Kotecha K. Performance analysis of hybrid deep learning framework using a vision transformer and convolutional neural network for handwritten digit recognition. MethodsX. 2024;12(102554). 10.1016\/j.mex.2024.102554.","DOI":"10.1016\/j.mex.2024.102554"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_022","unstructured":"Hong D, Gao L, Yao J, Zhang B, Plaza A, Chanussot J. Graph convolutional networks for hyperspectral image classification. 2020. Accessed 13 August 2024. 
[Online]. Available: https:\/\/arxiv.org\/abs\/2008.02457."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_023","doi-asserted-by":"crossref","unstructured":"Dixit R, Kushwah R, Pashine S. Handwritten digit recognition using machine and deep learning algorithms. Int J Comput Appl. 2020;176(42):27\u201333. 10.5120\/ijca2020920550.","DOI":"10.5120\/ijca2020920550"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_024","unstructured":"Chigozie N, Winifred I, Anthony G, Stephen M. Activation functions: Comparison of trends in practice and research for deep learning. 2018. [Online]. Available: https:\/\/arxiv.org\/abs\/1811.03378."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_025","unstructured":"Assiri Y. Stochastic optimization of plain convolutional neural networks with simple methods. 2020. Accessed 27 August 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/1511.08458."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_026","unstructured":"O\u2019Shea K, Nash R. An introduction to convolutional neural networks. 2015. Accessed 31 August 2024. [Online]. Available: arXiv:1511.08458."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_027","doi-asserted-by":"crossref","unstructured":"Riaz N, Arbab H, Maqsood A, Nasir K, Ul-Hasan A, Shafait F, et al. Conv-transformer architecture for unconstrained off-line urdu handwriting recognition. in Research Square, 2022. [Online]. Available: https:\/\/doi.org\/10.21203\/rs.3.rs-1514700\/v1.","DOI":"10.21203\/rs.3.rs-1514700\/v1"},{"key":"2025122009032292893_j_jisys-2024-0411_ref_028","unstructured":"Hendrycks D, Gimpel K. Gaussian error linear units (gelus). 2016. Accessed 07 September 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/1606.08415."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_029","unstructured":"Lecun Y, Cortes C, Burges CJ. MNIST handwritten digit database. 2010. Accessed 15 September 2024. [Online]. 
Available: http:\/\/yann.lecun.com\/exdb\/mnist."},{"key":"2025122009032292893_j_jisys-2024-0411_ref_030","unstructured":"Greg C, Sadegh A, Jonathan T, Andre VS. EMNIST: Extending MNIST to handwritten letters. 2017. Accessed 25 September 2024. [Online]. Available: https:\/\/arxiv.org\/abs\/1702.05373."}],"container-title":["Journal of Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2024-0411\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2024-0411\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T09:16:54Z","timestamp":1766222214000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyterbrill.com\/document\/doi\/10.1515\/jisys-2024-0411\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,1]]},"references-count":30,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,3,7]]},"published-print":{"date-parts":[[2025,3,7]]}},"alternative-id":["10.1515\/jisys-2024-0411"],"URL":"https:\/\/doi.org\/10.1515\/jisys-2024-0411","relation":{},"ISSN":["2191-026X"],"issn-type":[{"value":"2191-026X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,1]]},"article-number":"20240411"}}