{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T22:45:43Z","timestamp":1776811543490,"version":"3.51.2"},"reference-count":11,"publisher":"European Society of Computational Methods in Sciences and Engineering","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JCM"],"published-print":{"date-parts":[[2022,9,5]]},"abstract":"<jats:p>An offline character dataset of Tibetan historical documents in Uchen font, THCU, is presented to facilitate research on Tibetan historical document recognition. The THCU dataset includes two subsets: THCU-M and THCU-S. THCU-M is annotated manually on original document images and includes 121214 character samples in 238 character categories. THCU-S is a simulated dataset whose samples are generated based on the idea of component combination. There are four subsets in THCU-S, in which the numbers of character categories are 7238, 2908, 562 and 245 respectively, and the numbers of samples in each category are 5000, 3000, 600 and 600 respectively. We also evaluate the THCU dataset using a CNN-based model to establish a baseline. 
The experiment shows that the performance of the model on the real data is greatly improved by adding the generated samples.<\/jats:p>","DOI":"10.3233\/jcm-226167","type":"journal-article","created":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T13:47:50Z","timestamp":1655819270000},"page":"1779-1794","source":"Crossref","is-referenced-by-count":3,"title":["Character recognition of Tibetan Historical document in Uchen font: Dataset and bench mark"],"prefix":"10.66113","volume":"22","author":[{"given":"Zhenjiang","family":"Li","sequence":"first","affiliation":[{"name":"School of Cyberspace Security, Gansu University of Political Science and Law, Lanzhou, Gansu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weilan","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yiqun","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu, China"},{"name":"School of Artificial Intelligence, Gansu University of Political Science and Law, Lanzhou, Gansu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qianxue","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Cyberspace Security, Gansu University of Political Science and Law, Lanzhou, Gansu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"55691","reference":[{"issue":"18","key":"10.3233\/JCM-226167_ref1","first-page":"5987","article-title":"A database for off-line handwritten Tibetan character recognition","volume":"9","author":"Huang","year":"2012","journal-title":"J Inf Comput 
Sci."},{"issue":"1","key":"10.3233\/JCM-226167_ref2","first-page":"27","article-title":"Wavelet transform and gradient direction based feature extraction method for off-line handwritten Tibetan letter recognition","volume":"30","author":"Heming","year":"2014","journal-title":"J Southeast Univ."},{"issue":"3","key":"10.3233\/JCM-226167_ref3","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1016\/j.ijleo.2013.07.101","article-title":"Sparse representation-based classification algorithm for optical Tibetan character recognition","volume":"125","author":"Huang","year":"2014","journal-title":"Optik-Int J Light Electr Opt."},{"issue":"1","key":"10.3233\/JCM-226167_ref6","doi-asserted-by":"crossref","first-page":"31","DOI":"10.5070\/H915130066","article-title":"Online unconstrained handwritten Tibetan character recognition using statistical recognition method","volume":"15","author":"Ma","year":"2016","journal-title":"Himalayan Linguist."},{"key":"10.3233\/JCM-226167_ref7","unstructured":"Wang WL, Lu XB, Cai ZQ, Shen WT, Fu J, Caike ZX. Online handwritten sample generated base on component combination for Tibetan-Sanskrit. J Chin Inf Process. 
2017; 31(5): 64-73."},{"issue":"10","key":"10.3233\/JCM-226167_ref8","doi-asserted-by":"crossref","first-page":"1953003","DOI":"10.1142\/S0218001419530033","article-title":"Online Tibetan handwriting recognition for large character set on new databases","volume":"33","author":"Wang","year":"2019","journal-title":"Int J Patt Recognit Artif Intell."},{"key":"10.3233\/JCM-226167_ref9","doi-asserted-by":"crossref","first-page":"52641","DOI":"10.1109\/ACCESS.2020.2975023","article-title":"Segmentation and recognition for historical Tibetan document images","volume":"8","author":"Ma","year":"2020","journal-title":"IEEE Access."},{"key":"10.3233\/JCM-226167_ref10","doi-asserted-by":"crossref","first-page":"154435","DOI":"10.1109\/ACCESS.2021.3128536","article-title":"Accurate fine-grained layout analysis for the historical Tibetan document based on the instance segmentation","volume":"9","author":"Zhao","year":"2021","journal-title":"IEEE Access."},{"key":"10.3233\/JCM-226167_ref11","doi-asserted-by":"crossref","first-page":"25376","DOI":"10.1109\/ACCESS.2022.3151886","article-title":"Character detection and segmentation of historical Uchen Tibetan documents in complex situations","volume":"10","author":"Zhang","year":"2022","journal-title":"IEEE Access."},{"issue":"5","key":"10.3233\/JCM-226167_ref14","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.jvcir.2019.01.021","article-title":"A novel method of text line segmentation for historical document image of the Uchen Tibetan","volume":"61","author":"Li","year":"2019","journal-title":"J Visual Commun Image Representation."},{"key":"10.3233\/JCM-226167_ref15","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1016\/j.patcog.2016.08.005","article-title":"Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark","volume":"61","author":"Zhang","year":"2017","journal-title":"Patt Recognit."}],"container-title":["Journal of Computational Methods in Sciences and 
Engineering"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JCM-226167","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T22:06:31Z","timestamp":1776809191000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JCM-226167"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,5]]},"references-count":11,"journal-issue":{"issue":"5"},"URL":"https:\/\/doi.org\/10.3233\/jcm-226167","relation":{},"ISSN":["1472-7978","1875-8983"],"issn-type":[{"value":"1472-7978","type":"print"},{"value":"1875-8983","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,5]]}}}