{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:38:33Z","timestamp":1777696713974,"version":"3.51.4"},"reference-count":34,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDT"],"published-print":{"date-parts":[[2024,9,16]]},"abstract":"<jats:p>Determining the script of historical manuscripts is pivotal for understanding historical narratives, providing historians with vital insights into the past. In this study, our focus lies in developing an automated system for effectively identifying the script of historical documents using a deep learning approach. Leveraging the ClAMM dataset as the foundation for our system, we initiate the system with dataset preprocessing, employing two fundamental techniques: denoising through non-local means denoising and binarization using Canny-edge detection. These techniques prepare the document for keypoint detection facilitated by the Harris-corner detector, a feature-detection method. Subsequently, we cluster these keypoints utilizing the k-means algorithm and extract patches based on the identified features. The final step involves training these patches on deep learning models, with a comparative analysis between two architectures: Convolutional Neural Networks (CNN) and Vision Transformers (ViT). Given the absence of prior studies investigating the performance of vision transformers on historical manuscripts, our research fills this gap. The system undergoes a series of experiments to fine-tune its parameters for optimal performance. Our conclusive results demonstrate an average accuracy of 89.2 and 91.99% respectively of the CNN and ViT based proposed framework, surpassing the state of the art in historical script classification so far, and affirming the effectiveness of our automated script identification system.<\/jats:p>","DOI":"10.3233\/idt-240565","type":"journal-article","created":{"date-parts":[[2024,7,26]],"date-time":"2024-07-26T10:55:55Z","timestamp":1721991355000},"page":"2055-2078","source":"Crossref","is-referenced-by-count":4,"title":["HeritageScript: A cutting-edge approach to historical manuscript script classification with CNN and vision transformer architectures"],"prefix":"10.1177","volume":"18","author":[{"given":"Akram","family":"Bennour","sequence":"first","affiliation":[{"name":"Laboratory of mathematics, informatics and systems (LAMIS), Echahid Cheikh Larbi Tebessi University, Tebessa, Algeria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Merouane","family":"Boudraa","sequence":"additional","affiliation":[{"name":"Laboratory of mathematics, informatics and systems (LAMIS), Echahid Cheikh Larbi Tebessi University, Tebessa, Algeria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fahad","family":"Ghabban","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"key":"10.3233\/IDT-240565_ref1","doi-asserted-by":"crossref","unstructured":"Bischoff B. Latin palaeography: Antiquity and the middle ages. Cambridge University Press; 1990.","DOI":"10.1017\/CBO9780511809927"},{"key":"10.3233\/IDT-240565_ref2","unstructured":"Brown MP. Understanding illuminated manuscripts: A guide to technical terms. Getty Publications; 1994."},{"key":"10.3233\/IDT-240565_ref3","doi-asserted-by":"crossref","unstructured":"Parkes MB. Pause and effect: An introduction to the history of punctuation in the West. Routledge; 2016.","DOI":"10.4324\/9781315247243"},{"issue":"8","key":"10.3233\/IDT-240565_ref4","doi-asserted-by":"crossref","first-page":"1720","DOI":"10.1109\/TPAMI.2005.227","article-title":"Texture for script identification","volume":"27","author":"Busch","year":"2005","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10.3233\/IDT-240565_ref5","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1109\/IADCC.2010.5423028","article-title":"Script identification in a handwritten document image using texture features","author":"Hiremath","year":"2010","journal-title":"2010 IEEE 2nd International Advance Computing Conference (IACC)"},{"issue":"7553","key":"10.3233\/IDT-240565_ref6","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"issue":"2","key":"10.3233\/IDT-240565_ref7","first-page":"74","article-title":"Word-level script identification using texture based features","volume":"4","author":"Singh","year":"2015","journal-title":"Int J Syst Dyn Appl (IJSDA)"},{"key":"10.3233\/IDT-240565_ref8","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.forsciint.2019.05.014","article-title":"Handwriting based writer recognition using implicit shape codebook","volume":"301","author":"Bennour","year":"2019","journal-title":"Forensic Sci Int"},{"key":"10.3233\/IDT-240565_ref9","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16 \u00d7 16\u00a0words: Transformers for image recognition at scale. 2010. arXiv 2020. arXiv preprint arXiv2010.11929."},{"issue":"2","key":"10.3233\/IDT-240565_ref10","doi-asserted-by":"crossref","first-page":"734","DOI":"10.3390\/s23020734","article-title":"Convolutional neural networks or vision transformers: Who will win the race for action recognitions in visual data","volume":"23","author":"Moutik","year":"2023","journal-title":"Sensors"},{"key":"10.3233\/IDT-240565_ref11","doi-asserted-by":"publisher","first-page":"1940","DOI":"10.1007\/978-3-031-46335-8_11","article-title":"Combination of local features and deep learning to historical manuscripts dating","author":"Boudraa","year":"2024","journal-title":"Intelligent Systems and Pattern Recognition"},{"key":"10.3233\/IDT-240565_ref12","doi-asserted-by":"crossref","unstructured":"Cloppet F, Eglin V, Kieu VC, Stutzmann D, Vincent N. ICFHR2016 Competition on the classification of medieval handwritings in latin script. In: Proceedings of the International Conference on Frontiers in Handwriting Recognition (ICFHR); 2016.","DOI":"10.1109\/ICFHR.2016.0113"},{"key":"10.3233\/IDT-240565_ref13","doi-asserted-by":"crossref","first-page":"1371","DOI":"10.1109\/ICDAR.2017.224","article-title":"ICDAR2017 competition on the classification of medieval handwritings in latin script","author":"Cloppet","year":"2017","journal-title":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)"},{"key":"10.3233\/IDT-240565_ref14","doi-asserted-by":"crossref","unstructured":"Seuret M, Nicolaou A, Rodr\u00edguez-Salas D, Weichselbaumer N, Stutzmann D, Mayr M, et al. ICDAR 2021 competition on historical document classification. In: Document Analysis and Recognition\u2013ICDAR 2021: 16th International Conference, Lausanne, Switzerland, 2021 September 5\u201310.","DOI":"10.1007\/978-3-030-86337-1_41"},{"key":"10.3233\/IDT-240565_ref15","unstructured":"Ioffe S, Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv150203167 Cs, f\u00e9vr. 2015."},{"key":"10.3233\/IDT-240565_ref16","unstructured":"Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv14091556 Cs, step. 2014."},{"key":"10.3233\/IDT-240565_ref17","doi-asserted-by":"crossref","first-page":"1371","DOI":"10.1109\/ICDAR.2017.224","article-title":"Icdar2017 competition on the classification of medieval handwritings in latin script","author":"Cloppet","year":"2017","journal-title":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)"},{"key":"10.3233\/IDT-240565_ref18","first-page":"10","article-title":"Clustering of medieval scripts through computer image analysis: towards an evaluation protocol","author":"Stutzmann","year":"2015","journal-title":"Digital Medievalist Journal"},{"key":"10.3233\/IDT-240565_ref19","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1109\/ICDAR.2015.7333893","article-title":"Writer identification using VLAD encoded contour-Zernike moments","author":"Christlein","year":"2015","journal-title":"Document Analysis and Recognition (ICDAR), 2015 13th International Conference"},{"key":"10.3233\/IDT-240565_ref20","doi-asserted-by":"crossref","unstructured":"Bennour A, Boudraa M, Siddiqi I, et al. A deep learning framework for historical manuscripts writer identification using data-driven features. Multimed Tools Appl. 2024.","DOI":"10.1007\/s11042-024-18187-y"},{"key":"10.3233\/IDT-240565_ref21","doi-asserted-by":"crossref","first-page":"208","DOI":"10.5201\/ipol.2011.bcm_nlm","article-title":"Non-local means denoising","volume":"1","author":"Buades","year":"2011","journal-title":"Image Process Line"},{"key":"10.3233\/IDT-240565_ref22","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1007\/978-94-017-1699-4_3","article-title":"Anisotropic diffusion","author":"Perona","year":"1994","journal-title":"Geometry-Driven Diffusion in Computer Vision"},{"issue":"1","key":"10.3233\/IDT-240565_ref23","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1137\/0917016","article-title":"Iterative methods for total variation denoising","volume":"17","author":"Vogel","year":"1996","journal-title":"SIAM J Sci Comput"},{"issue":"1","key":"10.3233\/IDT-240565_ref24","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1109\/TSMC.1979.4310076","article-title":"A threshold selection method from gray-level histograms","volume":"9","author":"Otsu","year":"1979","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"10.3233\/IDT-240565_ref25","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1109\/TPAMI.1986.4767851","article-title":"A computational approach to edge detection","volume":"6","author":"Canny","year":"1986","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"2","key":"10.3233\/IDT-240565_ref26","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int J Comput Vis"},{"issue":"1","key":"10.3233\/IDT-240565_ref27","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1109\/TPAMI.2008.275","article-title":"Faster and better: A machine learning approach to corner detection","volume":"32","author":"Rosten","year":"2008","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"10.3233\/IDT-240565_ref28","first-page":"147","article-title":"A combined corner and edge detector","author":"Harris","year":"1988","journal-title":"Proceedings of the 4th Alvey Vision Conference"},{"key":"10.3233\/IDT-240565_ref29","doi-asserted-by":"crossref","unstructured":"Jin X, Han J. K-Means Clustering. In: Sammut C, Webb GI, editors. Encyclopedia of Machine Learning. Boston, MA: Springer; 2011.","DOI":"10.1007\/978-0-387-30164-8_425"},{"key":"10.3233\/IDT-240565_ref30","first-page":"770","article-title":"Deep residual learning for image recognition","author":"He","year":"2016","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"10.3233\/IDT-240565_ref31","doi-asserted-by":"crossref","first-page":"2261","DOI":"10.1109\/CVPR.2017.243","article-title":"Densely Connected Convolutional Networks","author":"Huang","year":"2017","journal-title":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)"},{"key":"10.3233\/IDT-240565_ref32","unstructured":"Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, et al. Swin transformer: Hierarchical vision transformer using shifted windows. arXiv preprint arXiv2103.14030."},{"key":"10.3233\/IDT-240565_ref33","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1016\/j.patcog.2016.10.005","article-title":"Writer identification using GMM supervectors and exemplar-SVMs","volume":"63","author":"Christlein","year":"2017","journal-title":"Pattern Recognit"},{"key":"10.3233\/IDT-240565_ref34","doi-asserted-by":"publisher","DOI":"10.1016\/j.dsp.2024.104477"}],"container-title":["Intelligent Decision Technologies"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDT-240565","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:20:45Z","timestamp":1777454445000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDT-240565"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,16]]},"references-count":34,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/idt-240565","relation":{},"ISSN":["1872-4981","1875-8843"],"issn-type":[{"value":"1872-4981","type":"print"},{"value":"1875-8843","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,16]]}}}