{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T14:16:00Z","timestamp":1771337760369,"version":"3.50.1"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,8,25]],"date-time":"2023-08-25T00:00:00Z","timestamp":1692921600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,25]],"date-time":"2023-08-25T00:00:00Z","timestamp":1692921600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100023561","name":"Ministerio de Universidades","doi-asserted-by":"publisher","award":["PID2019-109099RB-C41"],"award-info":[{"award-number":["PID2019-109099RB-C41"]}],"id":[{"id":"10.13039\/501100023561","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006368","name":"Universidad de las Palmas de Gran Canaria","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006368","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cogn Comput"],"published-print":{"date-parts":[[2024,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Script identification plays a vital role in applications that involve handwriting and document analysis within a multi-script and multi-lingual environment. Moreover, it exhibits a profound connection with human cognition. This paper provides a new database for benchmarking script identification algorithms, which contains both printed and handwritten documents collected from a wide variety of scripts, such as Arabic, Bengali (Bangla), Gujarati, Gurmukhi, Devanagari, Japanese, Kannada, Malayalam, Oriya, Roman, Tamil, Telugu, and Thai. The dataset consists of 1,135 documents scanned from local newspaper and handwritten letters as well as notes from different native writers. Further, these documents are segmented into lines and words, comprising a total of 13,979 and 86,655 lines and words, respectively, in the dataset. Easy-to-go benchmarks are proposed with handcrafted and deep learning methods. The benchmark includes results at the document, line, and word levels with printed and handwritten documents. Results of script identification independent of the document\/line\/word level and independent of the printed\/handwritten letters are also given. The new multi-lingual database is expected to create new script identifiers, present various challenges, including identifying handwritten and printed samples and serve as a foundation for future research in script identification based on the reported results of the three benchmarks.<\/jats:p>","DOI":"10.1007\/s12559-023-10193-w","type":"journal-article","created":{"date-parts":[[2023,8,25]],"date-time":"2023-08-25T07:02:08Z","timestamp":1692946928000},"page":"131-157","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["MDIW-13: a New Multi-Lingual and Multi-Script Database and Benchmark for Script Identification"],"prefix":"10.1007","volume":"16","author":[{"given":"Miguel A.","family":"Ferrer","sequence":"first","affiliation":[]},{"given":"Abhijit","family":"Das","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3878-3867","authenticated-orcid":false,"given":"Moises","family":"Diaz","sequence":"additional","affiliation":[]},{"given":"Aythami","family":"Morales","sequence":"additional","affiliation":[]},{"given":"Cristina","family":"Carmona-Duarte","sequence":"additional","affiliation":[]},{"given":"Umapada","family":"Pal","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,8,25]]},"reference":[{"key":"10193_CR1","doi-asserted-by":"publisher","first-page":"23255","DOI":"10.1007\/s11042-019-7620-6","volume":"78","author":"SR Narang","year":"2019","unstructured":"Narang SR, Jindal MK, Kumar M. Drop flow method: an iterative algorithm for complete segmentation of Devanagari ancient manuscripts. Multimed Tools Appl. 2019;78:23255\u201380.","journal-title":"Multimed Tools Appl"},{"issue":"12","key":"10193_CR2","doi-asserted-by":"publisher","first-page":"2142","DOI":"10.1109\/TPAMI.2010.30","volume":"32","author":"D Ghosh","year":"2010","unstructured":"Ghosh D, Dube T, Shivaprasad A. Script recognition-a review. IEEE Trans Pattern Anal Mach Intell. 2010;32(12):2142\u201361.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10193_CR3","first-page":"6546","volume":"5","author":"K Ubul","year":"2017","unstructured":"Ubul K, Tursun G, Aysa A, Impedovo D, Pirlo G, Yibulayin T. Script identification of multi-script documents: a survey. IEEE Access. 2017;5:6546\u201359.","journal-title":"IEEE Access"},{"issue":"10","key":"10193_CR4","doi-asserted-by":"publisher","first-page":"1856012","DOI":"10.1142\/S0218001418560128","volume":"32","author":"SM Obaidullah","year":"2018","unstructured":"Obaidullah SM, Santosh K, Das N, Halder C, Roy K. Handwritten Indic script identification in multi-script document images: a survey. Int J Pattern Recognit Artif Intell. 2018;32(10):1856012.","journal-title":"Int J Pattern Recognit Artif Intell"},{"key":"10193_CR5","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1007\/s13042-017-0702-8","volume":"10","author":"SM Obaidullah","year":"2019","unstructured":"Obaidullah SM, Santosh K, Halder C, Das N, Roy K. Automatic Indic script identification from handwritten documents: page, block, line and word-level approach. Int J Mach Learn Cybern. 2019;10:87\u2013106.","journal-title":"Int J Mach Learn Cybern"},{"key":"10193_CR6","doi-asserted-by":"crossref","unstructured":"Pal U, Chaudhuri B. Indian script character recognition: a survey. Pattern Recogni. 2004;37(9):1887\u20131899.","DOI":"10.1016\/j.patcog.2004.02.003"},{"key":"10193_CR7","doi-asserted-by":"crossref","unstructured":"Ferrer MA, Morales A, Pal U. Lbp based line-wise script identification. In: 2013 12th International Conference on Document Analysis and Recognition. IEEE; 2013. p. 369\u2013373.","DOI":"10.1109\/ICDAR.2013.81"},{"key":"10193_CR8","doi-asserted-by":"crossref","unstructured":"Ferrer MA, Morales A, Rodr\u00edguez N, Pal U. Multiple training-one test methodology for handwritten word-script identification. In: 2014 14th International Conference on Frontiers in Handwriting Recognition. IEEE; 2014. p. 754\u2013759.","DOI":"10.1109\/ICFHR.2014.132"},{"key":"10193_CR9","doi-asserted-by":"crossref","unstructured":"Sharma N, Pal U, Blumenstein MA, study on word-level multi-script identification from video frames. In,. international joint conference on neural networks (IJCNN). IEEE. 2014;2014:1827\u201333.","DOI":"10.1109\/IJCNN.2014.6889906"},{"key":"10193_CR10","doi-asserted-by":"publisher","first-page":"448","DOI":"10.1016\/j.patcog.2015.11.005","volume":"52","author":"B Shi","year":"2016","unstructured":"Shi B, Bai X, Yao C. Script identification in the wild via discriminative convolutional neural network. Pattern Recognit. 2016;52:448\u201358.","journal-title":"Pattern Recognit"},{"key":"10193_CR11","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.patcog.2017.01.032","volume":"67","author":"L Gomez","year":"2017","unstructured":"Gomez L, Nicolaou A, Karatzas D. Improving patch-based scene text script identification with ensembles of conjoined networks. Pattern Recognit. 2017;67:85\u201396.","journal-title":"Pattern Recognit"},{"key":"10193_CR12","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1016\/j.patrec.2017.09.016","volume":"100","author":"Z Feng","year":"2017","unstructured":"Feng Z, Yang Z, Jin L, Huang S, Sun J. Robust shared feature learning for script and handwritten\/machine-printed identification. Pattern Recognit Lett. 2017;100:6\u201313.","journal-title":"Pattern Recognit Lett"},{"key":"10193_CR13","doi-asserted-by":"crossref","unstructured":"Ahmad R, Naz S, Afzal MZ, Rashid SF, Liwicki M, Dengel A. Khatt: A deep learning benchmark on arabic script. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). vol.7. IEEE; 2017. p. 10\u201314.","DOI":"10.1109\/ICDAR.2017.358"},{"key":"10193_CR14","unstructured":"Tripathi R, Gill A, Tripati R. Recurrent neural networks based Indic word-wise script identification using character-wise training.\u00a02017. arXiv preprint arXiv:1709.03209."},{"key":"10193_CR15","doi-asserted-by":"publisher","first-page":"2829","DOI":"10.1007\/s00521-019-04111-1","volume":"32","author":"S Ukil","year":"2020","unstructured":"Ukil S, Ghosh S, Obaidullah SM, Santosh K, Roy K, Das N. Improved word-level handwritten indic script identification by integrating small convolutional neural networks. Neural Comput Appl. 2020;32:2829\u201344.","journal-title":"Neural Comput Appl"},{"issue":"5","key":"10193_CR16","doi-asserted-by":"publisher","first-page":"051214","DOI":"10.1117\/1.JEI.27.5.051214","volume":"27","author":"SM Obaidullah","year":"2018","unstructured":"Obaidullah SM, Bose A, Mukherjee H, Santosh K, Das N, Roy K. Extreme learning machine for handwritten Indic script identification in multiscript documents. J Electron Imaging. 2018;27(5):051214\u2013051214.","journal-title":"J Electron Imaging"},{"key":"10193_CR17","doi-asserted-by":"crossref","unstructured":"M\u00e4enp\u00e4\u00e4 T, Pietik\u00e4inen M. Texture analysis with local binary patterns. In: Handbook of pattern recognition and computer vision. World Scientific; 2005. p. 197\u2013216.","DOI":"10.1142\/9789812775320_0011"},{"key":"10193_CR18","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1016\/j.imavis.2017.08.004","volume":"66","author":"Y Serdouk","year":"2017","unstructured":"Serdouk Y, Nemmour H, Chibani Y. Handwritten signature verification using the quad-tree histogram of templates and a support vector-based artificial immune classification. Image Vis Comput. 2017;66:26\u201335.","journal-title":"Image Vis Comput"},{"key":"10193_CR19","doi-asserted-by":"crossref","unstructured":"Paris S, Glotin H. Pyramidal multi-level features for the robot vision@ icpr 2010 challenge. In: 2010 20th International Conference on Pattern Recognition. IEEE; 2010. p. 2949\u20132952.","DOI":"10.1109\/ICPR.2010.1143"},{"issue":"09","key":"10193_CR20","doi-asserted-by":"publisher","first-page":"1856011","DOI":"10.1142\/S0218001418560116","volume":"32","author":"C Halder","year":"2018","unstructured":"Halder C, Obaidullah SM, Santosh K, Roy K. Content independent writer identification on Bangla Script: a document level approach. Int J Pattern Recognit Artif Intell. 2018;32(09):1856011.","journal-title":"Int J Pattern Recognit Artif Intell"},{"key":"10193_CR21","doi-asserted-by":"crossref","unstructured":"Brunessaux S, Giroux P, Grilheres B, Manta M, Bodin M, Choukri K, The maurdor project: improving automatic processing of digital documents. In, et al. 11th IAPR International Workshop on Document Analysis Systems. IEEE. 2014;2014:349\u201354.","DOI":"10.1109\/DAS.2014.58"},{"key":"10193_CR22","doi-asserted-by":"crossref","unstructured":"Djeddi C, Gattal A, Souici-Meslati L, Siddiqi I, Chibani Y, ElAbed H. LAMIS-MSHD: a multi-script offline handwriting database. In: 2014 14th International Conference on Frontiers in Handwriting Recognition. IEEE; 2014. p. 93\u201397.","DOI":"10.1109\/ICFHR.2014.23"},{"key":"10193_CR23","doi-asserted-by":"crossref","unstructured":"AlMaadeed S, Ayouby W, Hassaine A, Aljaam JM. QUWI: an Arabic and English handwriting dataset for offline writer identification. In: 2012 International Conference on Frontiers in Handwriting Recognition. IEEE; 2012. p. 746\u2013751.","DOI":"10.1109\/ICFHR.2012.256"},{"key":"10193_CR24","doi-asserted-by":"crossref","unstructured":"Jaiem FK, Kanoun S, Khemakhem M, ElAbed H, Kardoun J. Database for Arabic printed text recognition research. In: Image Analysis and Processing\u2013ICIAP 2013: 17th International Conference, Naples, Italy, September 9-13, 2013. Proceedings, Part I 17. Springer; 2013. p. 251\u2013259.","DOI":"10.1007\/978-3-642-41181-6_26"},{"key":"10193_CR25","doi-asserted-by":"publisher","first-page":"8441","DOI":"10.1007\/s11042-017-4745-3","volume":"77","author":"PK Singh","year":"2018","unstructured":"Singh PK, Sarkar R, Das N, Basu S, Kundu M, Nasipuri M. Benchmark databases of handwritten Bangla-Roman and Devanagari-Roman mixed-script document images. Multimed Tools Appl. 2018;77:8441\u201373.","journal-title":"Multimed Tools Appl"},{"issue":"2","key":"10193_CR26","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1007\/s00371-020-01799-4","volume":"37","author":"S Inunganbi","year":"2021","unstructured":"Inunganbi S, Choudhary P, Manglem K. Meitei Mayek handwritten dataset: compilation, segmentation, and character recognition. Vis Comput. 2021;37(2):291\u2013305.","journal-title":"Vis Comput"},{"key":"10193_CR27","doi-asserted-by":"crossref","unstructured":"Shi B, Yao C, Zhang C, Guo X, Huang F, Bai X. Automatic script identification in the wild. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR). IEEE; 2015. p. 531\u2013535.","DOI":"10.1109\/ICDAR.2015.7333818"},{"key":"10193_CR28","doi-asserted-by":"publisher","first-page":"1643","DOI":"10.1007\/s11042-017-4373-y","volume":"77","author":"SM Obaidullah","year":"2018","unstructured":"Obaidullah SM, Halder C, Santosh K, Das N, Roy K. PHDIndic\\_11: page-level handwritten document image dataset of 11 official Indic scripts for script identification. Multimed Tools Appl. 2018;77:1643\u201378.","journal-title":"Multimed Tools Appl."},{"key":"10193_CR29","doi-asserted-by":"crossref","unstructured":"Obaidullah SM, Halder C, Das N, Roy K. A corpus of word-level offline handwritten numeral images from official indic scripts. In: Proceedings of the Second International Conference on Computer and Communication Technologies: IC3T 2015, Volume 1. Springer; 2016. p. 703\u2013711.","DOI":"10.1007\/978-81-322-2517-1_67"},{"issue":"04","key":"10193_CR30","doi-asserted-by":"publisher","first-page":"1253001","DOI":"10.1142\/S0218001412530011","volume":"26","author":"A Alaei","year":"2012","unstructured":"Alaei A, Pal U, Nagabhushan P. Dataset and ground truth for handwritten text in four different scripts. Int J Pattern Recognit Artif Intell. 2012;26(04):1253001.","journal-title":"Int J Pattern Recognit Artif Intell"},{"key":"10193_CR31","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1007\/s100320200071","volume":"5","author":"UV Marti","year":"2002","unstructured":"Marti UV, Bunke H. The IAM-database: an English sentence database for offline handwriting recognition. Int J Doc Anal Recognit. 2002;5:39\u201346.","journal-title":"Int J Doc Anal Recognit"},{"key":"10193_CR32","doi-asserted-by":"publisher","first-page":"1243","DOI":"10.1007\/s11042-014-2366-7","volume":"75","author":"CS Hung","year":"2016","unstructured":"Hung CS, Ruan SJ. Efficient adaptive thresholding algorithm for in-homogeneous document background removal. Multimed Tools Appl. 2016;75:1243\u201359.","journal-title":"Multimed Tools Appl"},{"issue":"3","key":"10193_CR33","doi-asserted-by":"publisher","first-page":"667","DOI":"10.1109\/TPAMI.2014.2343981","volume":"37","author":"MA Ferrer","year":"2014","unstructured":"Ferrer MA, Diaz-Cabrera M, Morales A. Static signature synthesis: A neuromotor inspired approach for biometrics. IEEE Trans Pattern Anal Mach Intell. 2014;37(3):667\u201380.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10193_CR34","doi-asserted-by":"crossref","unstructured":"Franke K, Rose S. Ink-deposition model: The relation of writing and ink deposition processes. In: Ninth International Workshop on Frontiers in Handwriting Recognition. IEEE; 2004. p. 173\u2013178.","DOI":"10.1109\/IWFHR.2004.59"},{"key":"10193_CR35","doi-asserted-by":"crossref","unstructured":"Suykens J, VanGestel T, DeBrabanter J, DeMoor B, Vandewalle J. Least squares support vector machines, World Scientific Publishing, Singapore. 2002.","DOI":"10.1142\/5089"},{"key":"10193_CR36","doi-asserted-by":"crossref","unstructured":"Chan CH, Kittler J, Messer K. Multi-scale local binary pattern histograms for face recognition. In: Advances in Biometrics: International Conference, ICB 2007, Seoul, Korea, August 27-29, 2007. Proceedings. Springer; 2007. p. 809\u2013818.","DOI":"10.1007\/978-3-540-74549-5_85"},{"key":"10193_CR37","unstructured":"Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. arXiv preprint arXiv:1409.1556."},{"key":"10193_CR38","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770\u2013778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"10193_CR39","doi-asserted-by":"crossref","unstructured":"Conneau A, Schwenk H, Barrault L, Lecun Y. Very deep convolutional networks for text classification. 2016. arXiv preprint arXiv:1606.01781.","DOI":"10.18653\/v1\/E17-1104"},{"key":"10193_CR40","doi-asserted-by":"crossref","unstructured":"Bakkali S, Ming Z, Coustaty M, Rusinol M. Visual and textual deep feature fusion for document image classification. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops; 2020. p. 562\u2013563.","DOI":"10.1109\/CVPRW50498.2020.00289"},{"issue":"2","key":"10193_CR41","first-page":"81","volume":"7","author":"S Obaidullah","year":"2017","unstructured":"Obaidullah S, Santosh K, Halder C, Das N, Roy K. Word-level multi-script Indic document image dataset and baseline results on script identification. Int J Comput Vis Image Process (IJCVIP). 2017;7(2):81\u201394.","journal-title":"Int J Comput Vis Image Process (IJCVIP)"},{"issue":"10","key":"10193_CR42","doi-asserted-by":"publisher","first-page":"1615","DOI":"10.1109\/TPAMI.2005.188","volume":"27","author":"K Mikolajczyk","year":"2005","unstructured":"Mikolajczyk K, Schmid C. A performance evaluation of local descriptors. IEEE Trans Pattern Anal Mach Intell. 2005;27(10):1615\u201330.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"03","key":"10193_CR43","doi-asserted-by":"publisher","first-page":"2140011","DOI":"10.1142\/S0219467821400118","volume":"22","author":"R Rani","year":"2022","unstructured":"Rani R, Dhir R, Kakkar D, Sharma N. Script identification for printed and handwritten Indian documents: An empirical study of different feature classifier combinations. Int J Image Graph. 2022;22(03):2140011.","journal-title":"Int J Image Graph"},{"issue":"2","key":"10193_CR44","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1007\/s10032-022-00394-8","volume":"25","author":"A Semma","year":"2022","unstructured":"Semma A, Hannad Y, Siddiqi I, Lazrak S, Kettani MEYE. Feature learning and encoding for multi-script writer identification. Int J Doc Anal Recognit (IJDAR). 2022;25(2):79\u201393.","journal-title":"Int J Doc Anal Recognit (IJDAR)"},{"key":"10193_CR45","first-page":"121","volume":"13","author":"M Ghosh","year":"2022","unstructured":"Ghosh M, Baidya G, Mukherjee H, Obaidullah SM, Roy K. A deep learning-based approach to single\/mixed script-type identification. Adv Comput Syst Secur. 2022;13:121\u201332.","journal-title":"Adv Comput Syst Secur"},{"key":"10193_CR46","doi-asserted-by":"crossref","unstructured":"Ghosh M, Obaidullah SM, Santosh K, Das N, Roy K. Artistic multi-character script identification using iterative isotropic dilation algorithm. In: Recent Trends in Image Processing and Pattern Recognition: Second International Conference, RTIP2R 2018, Solapur, India, December 21\u201322, 2018, Revised Selected Papers, Part III 2. Springer; 2019. p. 49\u201362.","DOI":"10.1007\/978-981-13-9187-3_5"}],"container-title":["Cognitive Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12559-023-10193-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s12559-023-10193-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s12559-023-10193-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T21:16:02Z","timestamp":1729977362000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s12559-023-10193-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,25]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1]]}},"alternative-id":["10193"],"URL":"https:\/\/doi.org\/10.1007\/s12559-023-10193-w","relation":{},"ISSN":["1866-9956","1866-9964"],"issn-type":[{"value":"1866-9956","type":"print"},{"value":"1866-9964","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,25]]},"assertion":[{"value":"27 April 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 August 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 August 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"This article does not contain any studies with human participants or animals performed by any of the authors.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics Approval"}},{"value":"The authors declare that they have no conflict of interest.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}]}}