{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:35:24Z","timestamp":1772120124385,"version":"3.50.1"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T00:00:00Z","timestamp":1750464000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T00:00:00Z","timestamp":1750464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Discov Artif Intell"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Optical Character Recognition (OCR) plays a vital role in automating data entry from handwritten forms into digital systems. However, a significant gap exists in the research on OCR techniques tailored for handwritten texts in complex languages such as Bangla. Challenges in Bangla script arise from the presence of modifiers, compound characters, and diacritic marks, making accurate recognition difficult. Our research introduces a scalable and effective OCR pipeline for Bangla handwritten documents that addresses these complexities. The proposed pipeline leverages the YOLO (You Only Look Once) model for character detection, accurately isolating base alphabets, consonant conjuncts, and characters with modifiers (matras). For character recognition, the pipeline utilizes the EfficientNet-B4 model, which demonstrated a recognition accuracy of 93.87% for grapheme roots, 98.22% for vowel diacritics, and 98.0% for consonant diacritics on publicly available datasets, combined and adapted for our use. 
Additionally, the system\u2019s resilience was enhanced using a Word2Vec-based spelling correction layer, reducing the Character Error Rate (CER) from 10.37% to 2.47%. Comparative evaluations on in-house data show that the proposed pipeline with spelling correction achieves the highest precision (0.9701) and lowest CER (0.0247), outperforming the Google Cloud Vision API\u2019s OCR. In contrast, the Vision API has the highest CER (0.1389) and lower precision (0.8220), highlighting the effectiveness of the proposed approach for Bangla OCR.<\/jats:p>","DOI":"10.1007\/s44163-025-00251-7","type":"journal-article","created":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T03:31:08Z","timestamp":1750476668000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A hybrid approach to Bangla handwritten OCR: combining YOLO and an advanced CNN"],"prefix":"10.1007","volume":"5","author":[{"given":"Aye T.","family":"Maung","sequence":"first","affiliation":[]},{"given":"Sumaiya","family":"Salekin","sequence":"additional","affiliation":[]},{"given":"Mohammad A.","family":"Haque","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,6,21]]},"reference":[{"key":"251_CR1","unstructured":"University of Washington: Bengali (Bangla) | Asian Languages & Literature | University of Washington. https:\/\/asian.washington.edu\/. Accessed 29 July 2024"},{"key":"251_CR2","unstructured":"Encyclopaedia Britannica: The World\u2019s 5 Most Commonly Used Writing Systems. https:\/\/www.britannica.com\/list\/the-worlds-5-most-commonly-used-writing-systems. Accessed 29 July 2024"},{"key":"251_CR3","unstructured":"Laws of Bangladesh: \u09ac\u09be\u0982\u09b2\u09be \u09ad\u09be\u09b7\u09be \u09aa\u09cd\u09b0\u099a\u09b2\u09a8 \u0986\u0987\u09a8, \u09e7\u09ef\u09ee\u09ed Accessed 28 August 2024. 
http:\/\/bdlaws.minlaw.gov.bd\/act-details-705.html"},{"key":"251_CR4","doi-asserted-by":"publisher","unstructured":"Raj R, Kos A. A comprehensive study of optical character recognition. In: 2022 29th International Conference on Mixed Design of Integrated Circuits and System (MIXDES), 2022;pp. 151\u2013154 . https:\/\/doi.org\/10.23919\/MIXDES55591.2022.9837974","DOI":"10.23919\/MIXDES55591.2022.9837974"},{"key":"251_CR5","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1007\/s10032-014-0222-y","volume":"17","author":"N Das","year":"2014","unstructured":"Das N, Acharya K, Sarkar R, Basu S, Kundu M, Nasipuri M. A benchmark image database of isolated Bangla handwritten compound characters. IJDAR. 2014;17:413\u201331. https:\/\/doi.org\/10.1007\/s10032-014-0222-y.","journal-title":"IJDAR"},{"key":"251_CR6","doi-asserted-by":"publisher","DOI":"10.7763\/IJMLC.2012.V2.137","author":"A Singh","year":"2012","unstructured":"Singh A, Bacchuwar K, Bhasin A. A survey of ocr applications. Int J Machine Learn Comput (IJMLC). 2012. https:\/\/doi.org\/10.7763\/IJMLC.2012.V2.137.","journal-title":"Int J Machine Learn Comput (IJMLC)."},{"key":"251_CR7","doi-asserted-by":"publisher","unstructured":"Wahid MF, Shahriar MF, Sobuj MSI. A classical approach to handcrafted feature extraction techniques for bangla handwritten digit recognition. In: 2021 International Conference on Electronics, Communications and Information Technology (ICECIT), 2021;pp. 1\u20134 . https:\/\/doi.org\/10.1109\/ICECIT54077.2021.9641406","DOI":"10.1109\/ICECIT54077.2021.9641406"},{"key":"251_CR8","doi-asserted-by":"crossref","unstructured":"Alom MZ, Sidike P, Hasan M, Taha TM, Asari VK. 
Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks 2018","DOI":"10.1155\/2018\/6747098"},{"key":"251_CR9","doi-asserted-by":"publisher","first-page":"9727","DOI":"10.1007\/s13369-021-06311-1","volume":"47","author":"M Tounsi","year":"2022","unstructured":"Tounsi M, Moalla I, Pal U, Alimi AM. Arabic and latin scene text recognition by combining handcrafted and deep-learned features. Arab J Sci Eng. 2022;47:9727\u201340. https:\/\/doi.org\/10.1007\/s13369-021-06311-1.","journal-title":"Arab J Sci Eng"},{"key":"251_CR10","doi-asserted-by":"publisher","unstructured":"Garain U, Mioulet L, Chaudhuri BB, Chatelain C, Paquet T. Unconstrained bengali handwriting recognition with recurrent models. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015;pp. 1056\u20131060 . https:\/\/doi.org\/10.1109\/ICDAR.2015.7333923","DOI":"10.1109\/ICDAR.2015.7333923"},{"key":"251_CR11","doi-asserted-by":"crossref","unstructured":"Farjana Yeasmin\u00a0Omee, MANB. Shiam Shabbir\u00a0Himel: A complete workflow for development of bangla ocr. International Journal of Computer Applications 21 (2011)","DOI":"10.5120\/2543-3483"},{"key":"251_CR12","doi-asserted-by":"crossref","unstructured":"Akter N, Hossain S, Islam MT, Sarwar H. An algorithm for segmenting modifiers from bangla text. 2008 11th International Conference on Computer and Information Technology, 2008;177\u2013182","DOI":"10.1109\/ICCITECHN.2008.4803049"},{"key":"251_CR13","first-page":"157","volume":"2","author":"S Ahmed","year":"2013","unstructured":"Ahmed S,  Enhancing the character segmentation accuracy of bangla ocr using bpnn. Int J Sci Res (IJSR). 2013;2:157\u201361.","journal-title":"Int J Sci Res (IJSR)"},{"key":"251_CR14","unstructured":"Bensefia A, Paquet T, Heutte L. Grapheme based writer verification. In: Proceedings of the 11th Conference of the International Graphonomics Society (IGS\u20192003), Scottsdale, Arizona, 2003;pp. 
274\u2013277"},{"key":"251_CR15","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1109\/FGCNS.2008.21","volume":"3","author":"H Jung","year":"2008","unstructured":"Jung H, Ha J-Y. A structural method on grapheme segmentation of hangul characters for ocr. Future Gen Commun Network Symposia Int Conf on. 2008;3:71\u20134. https:\/\/doi.org\/10.1109\/FGCNS.2008.21.","journal-title":"Future Gen Commun Network Symposia Int Conf on"},{"key":"251_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-86337-1_26","volume-title":"A large multi-target dataset of common bengali handwritten graphemes","author":"S Alam","year":"2021","unstructured":"Alam S, Reasat T, Sushmit AS, Siddique SM, Rahman F, Hasan M, Humayun AI. A large multi-target dataset of common bengali handwritten graphemes. Cham: Springer; 2021."},{"key":"251_CR17","first-page":"23","volume":"1","author":"K Roy","year":"2023","unstructured":"Roy K, Hossain MS, Saha PK, Rohan S, Ashrafi I, Rezwan IM, Rahman F, Hossain BM, Kabir A, Mohammed N. A multifaceted evaluation of representation of graphemes for practically effective bangla ocr. Int J Document Anal Recogn (IJDAR). 2023;1:23.","journal-title":"Int J Document Anal Recogn (IJDAR)"},{"key":"251_CR18","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1016\/j.dib.2017.03.035","volume":"12","author":"M Biswas","year":"2017","unstructured":"Biswas M, Islam R, Shom GK, Shopon M, Mohammed N, Momen S, Abedin A. Banglalekha-isolated: A multi-purpose comprehensive dataset of handwritten bangla isolated characters. Data in Brief. 2017;12:103\u20137. https:\/\/doi.org\/10.1016\/j.dib.2017.03.035.","journal-title":"Data in Brief"},{"key":"251_CR19","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.20186825.v1","author":"K Roy","year":"2022","unstructured":"Roy K, Hossain MS, Saha P, Rohan S, Rahman F, Ashrafi I, Rezwan IM, Hossain BMM, Kabir A, Mohammed N. Synthetic Printed Words and Test Protocols Data for Bangla OCR. 
Int J Document Anal Recog. 2022. https:\/\/doi.org\/10.6084\/m9.figshare.20186825.v1.","journal-title":"Int J Document Anal Recog"},{"key":"251_CR20","unstructured":"Group CR. CMATERdb: Pattern Recognition Database Repository. https:\/\/code.google.com\/archive\/p\/cmaterdb\/. Accessed: 2023-07-25.2015."},{"key":"251_CR21","doi-asserted-by":"publisher","unstructured":"Mridha Dr MF. BanglaWriting: A multi-purpose offline Bangla handwriting dataset. Mendeley. 2020. https:\/\/doi.org\/10.17632\/R43WKVDK4W.1","DOI":"10.17632\/R43WKVDK4W.1"},{"key":"251_CR22","doi-asserted-by":"publisher","unstructured":"Chowdhury A, HOSSEN MA, Baru A. BD_DB_64. Mendeley Data. 2023. https:\/\/doi.org\/10.17632\/zb5g5td4ns.1 . https:\/\/data.mendeley.com\/datasets\/zb5g5td4ns\/1","DOI":"10.17632\/zb5g5td4ns.1"},{"key":"251_CR23","unstructured":"Kamal M. Bengali Dictionary. https:\/\/github.com\/MinhasKamal\/BengaliDictionary. GitHub repository. 2024. https:\/\/github.com\/MinhasKamal\/BengaliDictionary"},{"key":"251_CR24","first-page":"2024","volume":"22","author":"M Faruk","year":"2024","unstructured":"Faruk M. Bengali Names vs Gender Dataset. Hugging Face Accessed. 2024;22:2024.","journal-title":"Hugging Face Accessed"},{"key":"251_CR25","unstructured":"Prothomalo: \u09aa\u09cd\u09b0\u09a5\u09ae \u0986\u09b2\u09cb | \u09ac\u09be\u0982\u09b2\u09be \u09a8\u09bf\u0989\u099c \u09aa\u09c7\u09aa\u09be\u09b0 Accessed 29 July 2024. https:\/\/www.prothomalo.com"},{"key":"251_CR26","doi-asserted-by":"publisher","unstructured":"Salekin S, Maung AT. BanglaWriting Words Dataset: A Collection of Isolated Word Images from the BanglaWriting Multi- Purpose Bangla Offline-handwriting Dataset (WoBW). https:\/\/doi.org\/10.5281\/zenodo.14163687 .","DOI":"10.5281\/zenodo.14163687"},{"key":"251_CR27","unstructured":"Roboflow: Sign in to Roboflow. https:\/\/app.roboflow.com\/. 
Accessed 7 July 2024"},{"key":"251_CR28","doi-asserted-by":"crossref","unstructured":"Karna N, Putra MAP, Rachmawati S, Abisado M, Sampedro G. Toward Accurate Fused Deposition Modeling 3D Printer Fault Detection Using Improved YOLOv8 With Hyperparameter Optimization - Scientific Figure on ResearchGate. https:\/\/www.researchgate.net\/figure\/The-improved-YOLOv8-network-architecture product penalty- @M-includes-an-additional-module-for-the-head_fig2_372207753. Accessed 21 June 2024","DOI":"10.1109\/ACCESS.2023.3293056"},{"issue":"11","key":"251_CR29","doi-asserted-by":"publisher","first-page":"16929","DOI":"10.1007\/s11042-022-13909-6","volume":"82","author":"P Rakshit","year":"2023","unstructured":"Rakshit P, Chatterjee S, Halder C, Sen S, Obaidullah SM, Roy K. Comparative study on the performance of the state-of-the-art CNN models for handwritten Bangla character recognition. Multi Tools Applic. 2023;82(11):16929\u201350. https:\/\/doi.org\/10.1007\/s11042-022-13909-6.","journal-title":"Multi Tools Applic"},{"key":"251_CR30","doi-asserted-by":"publisher","first-page":"165314","DOI":"10.1109\/ACCESS.2024.3469951","volume":"12","author":"Y Choi","year":"2024","unstructured":"Choi Y, Bae B, Hee Han T, Ahn J. Application of mask R-CNN and yolov8 algorithms for concrete crack detection. IEEE Access. 2024;12:165314\u201321. https:\/\/doi.org\/10.1109\/ACCESS.2024.3469951.","journal-title":"IEEE Access"},{"key":"251_CR31","doi-asserted-by":"crossref","unstructured":"Turnbull R, Mannix E. Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT and SimCLR. 2024. https:\/\/arxiv.org\/abs\/2401.12513","DOI":"10.1007\/s10032-024-00504-8"},{"key":"251_CR32","doi-asserted-by":"crossref","unstructured":"Baek Y, Lee B, Han D, Yun S, Lee H. Character Region Awareness for Text Detection (2019). 
https:\/\/arxiv.org\/abs\/1904.01941","DOI":"10.1109\/CVPR.2019.00959"},{"key":"251_CR33","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-63128-4_35","volume-title":"Borno Bangla handwritten character recognition using a multiclass convolutional neural network","author":"ASA Rabby","year":"2021","unstructured":"Rabby ASA, Islam MM, Hasan N, Nahar J, Rahman F. Borno Bangla handwritten character recognition using a multiclass convolutional neural network. Cham: Springer; 2021."},{"key":"251_CR34","unstructured":"Cloud G. Handwriting Recognition with the Vision API. https:\/\/cloud.google.com\/vision\/docs\/handwriting. Accessed: 15-11-2024"}],"container-title":["Discover Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00251-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44163-025-00251-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44163-025-00251-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T03:31:13Z","timestamp":1750476673000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44163-025-00251-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,21]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["251"],"URL":"https:\/\/doi.org\/10.1007\/s44163-025-00251-7","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-5149357\/v1","asserted-by":"object"}]},"ISSN":["2731-0809"],"issn-type":[{"value":"2731-0809","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,21]]},"assert
ion":[{"value":"25 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 March 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 June 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"119"}}