{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T05:14:29Z","timestamp":1763097269773,"version":"3.45.0"},"reference-count":21,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T00:00:00Z","timestamp":1763078400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100012024","name":"Multimedia University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100012024","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Big Data"],"abstract":"<jats:p>Bangla Handwritten Character Recognition (BHCR) remains challenging due to complex alphabets, and handwriting variations. In this study, we present a comparative evaluation of three deep learning architectures\u2014Vision Transformer (ViT), VGG-16, and ResNet-50\u2014on the CMATERdb 3.1.2 dataset comprising 24,000 images of 50 basic Bangla characters. Our work highlights the effectiveness of ViT in capturing global context and long-range dependencies, leading to improved generalization. Experimental results show that ViT achieves a state-of-the-art accuracy of 98.26%, outperforming VGG-16 (94.54%) and ResNet-50 (93.12%). We also analyze model behavior, discuss overfitting in CNNs, and provide insights into character-level misclassifications. This study demonstrates the potential of transformer-based architectures for robust BHCR and offers a benchmark for future research.<\/jats:p>","DOI":"10.3389\/fdata.2025.1682984","type":"journal-article","created":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T05:10:48Z","timestamp":1763097048000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Enhancing Bangla handwritten character recognition using Vision Transformers, VGG-16, and ResNet-50: a performance analysis"],"prefix":"10.3389","volume":"8","author":[{"given":"A. H. M.","family":"Shahariar Parvez","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Md.","family":"Samiul Islam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fahmid","family":"Al Farid","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tashida","family":"Yeasmin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Md. Monirul","family":"Islam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Md. Shafiul","family":"Azam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jia","family":"Uddin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hezerul","family":"Abdul Karim","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2025,11,14]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"105234","DOI":"10.1016\/j.imavis.2024.105234","article-title":"Enhanced human motion detection with hybrid rda-woa-based RNN and multiple hypothesis tracking for occlusion handling","volume":"150","author":"Cheltha","year":"2024","journal-title":"Image Vis. Comput"},{"key":"B2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/ICCIT54785.2021.9689864","article-title":"\u201cBangla optical character recognition (OCR) using deep learning based image classification algorithms,\u201d","volume-title":"2021 24th International Conference on Computer and Information Technology (ICCIT)","author":"Dipu","year":"2021"},{"key":"B3","article-title":"An image is worth 16x16 words: transformers for image recognition at scale","author":"Dosovitskiy","year":"2020","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"B4","doi-asserted-by":"publisher","first-page":"1693","DOI":"10.3390\/electronics12071693","article-title":"Lw-vit: the lightweight vision transformer model applied in offline handwritten Chinese character recognition","volume":"12","author":"Geng","year":"2023","journal-title":"Electronics"},{"key":"B5","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1134\/S1054661821010089","article-title":"Performance analysis of state of the art convolutional neural network architectures in bangla handwritten character recognition","volume":"31","author":"Ghosh","year":"2021","journal-title":"Patt. Recogn. Image Anal"},{"key":"B6","doi-asserted-by":"publisher","first-page":"2547","DOI":"10.11591\/eei.v9i6.2234","article-title":"Bangla handwritten character recognition using mobilenet v1 architecture","volume":"9","author":"Ghosh","year":"2020","journal-title":"Bull. Electr. Eng. Inform"},{"key":"B7","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1007\/978-981-19-6634-7_43","article-title":"\u201cAn improved method to recognize Bengali handwritten characters using CNN,\u201d","volume-title":"Proceedings of International Conference on Data Science and Applications: ICDSA 2022","author":"Halder","year":"2023"},{"key":"B8","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1109\/ISCAIE51753.2021.9431799","article-title":"\u201cAn automated system for recognizing isolated handwritten bangla characters using deep convolutional neural network,\u201d","volume-title":"2021 IEEE 11th IEEE Symposium on Computer Applications &Industrial Electronics (ISCAIE)","author":"Hasan","year":"2021"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90","article-title":"\u201cDeep residual learning for image recognition,\u201d","author":"He","year":"2016","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"B10","doi-asserted-by":"publisher","first-page":"4794","DOI":"10.1049\/gtd2.12997","article-title":"Using machine learning ensemble method for detection of energy theft in smart meters","volume":"17","author":"Kawoosa","year":"2023","journal-title":"IET Gener. Transm. Distr"},{"key":"B11","doi-asserted-by":"publisher","first-page":"4617","DOI":"10.1109\/TCE.2023.3323406","article-title":"Privacy preserved and decentralized smartphone recommendation system","volume":"70","author":"Khan","year":"2023","journal-title":"IEEE Trans. Consum. Electr"},{"key":"B12","doi-asserted-by":"publisher","first-page":"129505","DOI":"10.1109\/ACCESS.2022.3228160","article-title":"Internet of things (IoT) assisted context aware fertilizer recommendation","volume":"10","author":"Khan","year":"","journal-title":"IEEE Access"},{"key":"B13","doi-asserted-by":"publisher","first-page":"3105","DOI":"10.3390\/electronics11193105","article-title":"Crowd anomaly detection in video frames using fine-tuned alexnet model","volume":"11","author":"Khan","year":"","journal-title":"Electronics"},{"key":"B14","doi-asserted-by":"publisher","first-page":"112117","DOI":"10.1109\/ACCESS.2022.3216393","article-title":"Data complexity based evaluation of the model dependence of brain MRI images for classification of brain tumor and Alzheimer's disease","volume":"10","author":"Kujur","year":"2022","journal-title":"IEEE Access"},{"key":"B15","doi-asserted-by":"publisher","first-page":"528","DOI":"10.1016\/j.procs.2018.10.426","article-title":"Bornonet: Bangla handwritten characters recognition using convolutional neural network","volume":"143","author":"Rabby","year":"2018","journal-title":"Procedia Comput. Sci"},{"key":"B16","first-page":"190","article-title":"\u201cBangla handwritten basic character recognition using deep convolutional neural network,\u201d","volume-title":"2019 Joint 8th International Conference on Informatics, Electronics &Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision &Pattern Recognition (icIVPR)","author":"Saha","year":"2019"},{"key":"B17","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1007\/s10032-011-0148-6","article-title":"Cmaterdb1: a database of unconstrained handwritten Bangla and Bangla-English mixed script document image","volume":"15","author":"Sarkar","year":"2012","journal-title":"Int. J. Docum. Anal. Recogn"},{"key":"B18","doi-asserted-by":"publisher","first-page":"6845","DOI":"10.3390\/app11156845","article-title":"Bengalinet: a low-cost novel convolutional neural network for bengali handwritten characters recognition","volume":"11","author":"Sayeed","year":"2021","journal-title":"Appl. Sci"},{"key":"B19","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan","year":"2014","journal-title":"arXiv preprint arXiv:1409.1556"},{"key":"B20","unstructured":"Bengali language\n          \n          2024"},{"key":"B21","first-page":"783","article-title":"\u201cInception-v3 for flower classification,\u201d","volume-title":"2017 2nd International Conference on Image, Vision and Computing (ICIVC)","author":"Xia","year":"2017"}],"container-title":["Frontiers in Big Data"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2025.1682984\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T05:10:50Z","timestamp":1763097050000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdata.2025.1682984\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,14]]},"references-count":21,"alternative-id":["10.3389\/fdata.2025.1682984"],"URL":"https:\/\/doi.org\/10.3389\/fdata.2025.1682984","relation":{},"ISSN":["2624-909X"],"issn-type":[{"value":"2624-909X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,14]]},"article-number":"1682984"}}