{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,8]],"date-time":"2026-07-08T15:53:23Z","timestamp":1783526003798,"version":"3.55.0"},"reference-count":51,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2019,1,2]],"date-time":"2019-01-02T00:00:00Z","timestamp":1546387200000},"content-version":"vor","delay-in-days":1,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Science and Technology Funding of China","award":["61772158"],"award-info":[{"award-number":["61772158"]}]},{"name":"Science and Technology Funding of China","award":["61472103"],"award-info":[{"award-number":["61472103"]}]},{"name":"Science and Technology Funding of China","award":["U1711265"],"award-info":[{"award-number":["U1711265"]}]},{"name":"Science and Technology Funding of China","award":["61772158"],"award-info":[{"award-number":["61772158"]}]},{"name":"Science and Technology Funding of China","award":["61472103"],"award-info":[{"award-number":["61472103"]}]},{"name":"Science and Technology Funding of China","award":["U1711265"],"award-info":[{"award-number":["U1711265"]}]},{"name":"Science and Technology Funding Key Program of China","award":["61772158"],"award-info":[{"award-number":["61772158"]}]},{"name":"Science and Technology Funding Key Program of China","award":["61472103"],"award-info":[{"award-number":["61472103"]}]},{"name":"Science and Technology Funding Key Program of China","award":["U1711265"],"award-info":[{"award-number":["U1711265"]}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2019,1]]},"abstract":"<jats:p>In this paper, we propose a novel deep model for unbalanced distribution Character Recognition by employing focal loss based connectionist temporal classification (CTC) function. Previous works utilize Traditional CTC to compute prediction losses. However, some datasets may consist of extremely unbalanced samples, such as Chinese. In other words, both training and testing sets contain large amounts of low\u2010frequent samples. The low\u2010frequent samples have very limited influence on the model during training. To solve this issue, we modify the traditional CTC by fusing focal loss with it and thus make the model attend to the low\u2010frequent samples at training stage. In order to demonstrate the advantage of the proposed method, we conduct experiments on two types of datasets: synthetic and real image sequence datasets. The results on both datasets demonstrate that the proposed focal CTC loss function achieves desired performance on unbalanced datasets. Specifically, our method outperforms traditional CTC by 3 to 9 percentages in accuracy on average.<\/jats:p>","DOI":"10.1155\/2019\/9345861","type":"journal-article","created":{"date-parts":[[2019,1,2]],"date-time":"2019-01-02T13:08:37Z","timestamp":1546434517000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Focal CTC Loss for Chinese Optical Character Recognition on Unbalanced Datasets"],"prefix":"10.1155","volume":"2019","author":[{"given":"Xinjie","family":"Feng","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3298-2574","authenticated-orcid":false,"given":"Hongxun","family":"Yao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5200-3420","authenticated-orcid":false,"given":"Shengping","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"311","published-online":{"date-parts":[[2019,1,2]]},"reference":[{"key":"e_1_2_10_1_2","volume-title":"Medical Imaging","author":"Zhang L.","year":"2019"},{"key":"e_1_2_10_2_2","volume-title":"Medical Imaging","author":"Zhang L.","year":"2019"},{"key":"e_1_2_10_3_2","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/1320780"},{"key":"e_1_2_10_4_2","doi-asserted-by":"crossref","unstructured":"ZhouZ. ShinJ. ZhangL. GuruduS. GotwayM. andLiangJ. Fine-tuning convolutional neural networks for biomedical image analysis: Actively and incrementally Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition CVPR 2017 July 2017 USA 4761\u20134772 2-s2.0-85040371714.","DOI":"10.1109\/CVPR.2017.506"},{"key":"e_1_2_10_5_2","doi-asserted-by":"crossref","unstructured":"ZhangL. YangF. Daniel ZhangY. andZhuY. J. Road crack detection using deep convolutional neural network Proceedings of the 23rd IEEE International Conference on Image Processing ICIP 2016 September 2016 Phoenix AZ USA 3708\u20133712 2-s2.0-85006722293.","DOI":"10.1109\/ICIP.2016.7533052"},{"key":"e_1_2_10_6_2","article-title":"Hedging deep features for visual tracking","author":"Qi Y.","year":"2018","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_2_10_7_2","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava N.","year":"2014","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_10_8_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1996.tb02080.x"},{"key":"e_1_2_10_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-20192-9"},{"key":"e_1_2_10_10_2","doi-asserted-by":"publisher","DOI":"10.1117\/1.2819119"},{"key":"e_1_2_10_11_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jco.2016.07.001"},{"key":"e_1_2_10_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jco.2016.05.003"},{"key":"e_1_2_10_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_10_14_2","unstructured":"KrizhevskyA. SutskeverI. andHintonG. E. Imagenet classification with deep convolutional neural networks Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS \u203212) December 2012 Lake Tahoe Nev USA 1097\u20131105 2-s2.0-84876231242."},{"key":"e_1_2_10_15_2","unstructured":"WangT. WuD. J. CoatesA. andNgA. Y. End-to-end text recognition with convolutional neural networks Proceedings of the 21st International Conference on Pattern Recognition (ICPR \u203212) November 2012 3304\u20133308 2-s2.0-84874562673."},{"key":"e_1_2_10_16_2","unstructured":"BissaccoA. CumminsM. NetzerY. andNevenH. PhotoOCR: Reading text in uncontrolled conditions Proceedings of the 2013 14th IEEE International Conference on Computer Vision ICCV 2013 December 2013 Australia 785\u2013792 2-s2.0-84898778744."},{"key":"e_1_2_10_17_2","unstructured":"DengY. KanervistoA. andRushA. M. What you get is what you see: a visual markup decompiler 2016 https:\/\/arxiv.org\/abs\/1609.04938v1."},{"key":"e_1_2_10_18_2","doi-asserted-by":"crossref","unstructured":"LeeC.-Y.andOsinderoS. Recursive recurrent nets with attention modeling for OCR in the wild Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition CVPR 2016 July 2016 USA 2231\u20132239 2-s2.0-84986277814.","DOI":"10.1109\/CVPR.2016.245"},{"key":"e_1_2_10_19_2","doi-asserted-by":"crossref","unstructured":"GravesA. Fern\u00e1ndezS. GomezF. andSchmidhuberJ. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural \u2032networks Proceedings of the ICML 2006: 23rd International Conference on Machine Learning June 2006 USA 369\u2013376 2-s2.0-33749259827.","DOI":"10.1145\/1143844.1143891"},{"key":"e_1_2_10_20_2","doi-asserted-by":"crossref","unstructured":"WangK. BabenkoB. andBelongieS. End-to-end scene text recognition Proceedings of the IEEE International Conference on Computer Vision (ICCV \u203211) November 2011 Barcelona Spain IEEE 1457\u20131464 https:\/\/doi.org\/10.1109\/iccv.2011.6126402 2-s2.0-84863057818.","DOI":"10.1109\/ICCV.2011.6126402"},{"key":"e_1_2_10_21_2","doi-asserted-by":"crossref","unstructured":"NeumannL.andMatasJ. Scene text localization and recognition with oriented stroke detection Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV \u203213) December 2013 97\u2013104 https:\/\/doi.org\/10.1109\/iccv.2013.19 2-s2.0-84898792558.","DOI":"10.1109\/ICCV.2013.19"},{"key":"e_1_2_10_22_2","doi-asserted-by":"crossref","unstructured":"LeeC.-Y. BhardwajA. DiW. JagadeeshV. andPiramuthuR. Region-based discriminative feature pooling for scene text recognition Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition CVPR 2014 June 2014 USA 4050\u20134057 2-s2.0-84911448414.","DOI":"10.1109\/CVPR.2014.516"},{"key":"e_1_2_10_23_2","doi-asserted-by":"crossref","unstructured":"BaiX. YaoC. andLiuW. A learned multi-scale representation for scene text recognition Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014 4042\u20134049.","DOI":"10.1109\/CVPR.2014.515"},{"key":"e_1_2_10_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2339814"},{"key":"e_1_2_10_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-014-0793-6"},{"key":"e_1_2_10_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2008.137"},{"key":"e_1_2_10_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0823-z"},{"key":"e_1_2_10_28_2","unstructured":"HannunA. CaseC. CasperJ. CatanzaroB. DiamosG. ElsenE. PrengerR. SatheeshS. SenguptaS. andCoatesA. Deep speech: Scaling up end-to-end speech recognition https:\/\/arxiv.org\/abs\/1412.5567."},{"key":"e_1_2_10_29_2","unstructured":"AmodeiD. AnanthanarayananS. AnubhaiR. BaiJ. BattenbergE. CaseC. CasperJ. CatanzaroB. ChengQ. andChenG. Deep speech 2: End-to-end speech recognition in english and mandarin International Conference on Machine Learning 2016 173\u2013182."},{"key":"e_1_2_10_30_2","unstructured":"GravesA.andSchmidhuberJ. Offline handwriting recognition with multidimensional recurrent neural networks Proceedings of the 22nd Annual Conference on Neural Information Processing Systems NIPS 2008 December 2008 Canada 545\u2013552 2-s2.0-71249112130."},{"key":"e_1_2_10_31_2","doi-asserted-by":"crossref","unstructured":"Ul-HasanA. AhmedS. B. RashidF. ShafaitF. andBreuelT. M. Offline printed urdu nastaleeq script recognition with bidirectional LSTM networks Proceedings of the 12th International Conference on Document Analysis and Recognition ICDAR 2013 August 2013 USA 1061\u20131065 2-s2.0-84889598113.","DOI":"10.1109\/ICDAR.2013.212"},{"key":"e_1_2_10_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2646371"},{"key":"e_1_2_10_33_2","doi-asserted-by":"crossref","unstructured":"ShiB. BaiX. andYaoC. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition CoRR abs\/1507.05717 https:\/\/arxiv.org\/abs\/1507.05717 https:\/\/doi.org\/10.1109\/TPAMI.2016.2646371 2-s2.0-85032274465.","DOI":"10.1109\/TPAMI.2016.2646371"},{"key":"e_1_2_10_34_2","doi-asserted-by":"crossref","unstructured":"BustaM. NeumannL. andMatasJ. Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework Proceedings of the 16th IEEE International Conference on Computer Vision ICCV 2017 October 2017 Italy 2223\u20132231 2-s2.0-85041908784.","DOI":"10.1109\/ICCV.2017.242"},{"key":"e_1_2_10_35_2","unstructured":"HeP. HuangW. QiaoY. LoyC. C. andTangX. Reading scene text in deep convolutional sequences Proceedings of the 30th AAAI Conference on Artificial Intelligence AAAI 2016 February 2016 USA 3501\u20133508 2-s2.0-85007271139."},{"key":"e_1_2_10_36_2","unstructured":"BahdanauD. ChoK. andBengioY. Neural machine translation by jointly learning to align and translate https:\/\/arxiv.org\/abs\/1409.0473."},{"key":"e_1_2_10_37_2","unstructured":"BaJ. MnihV. andKavukcuogluK. Multiple object recognition with visual attention https:\/\/arxiv.org\/abs\/1412.7755."},{"key":"e_1_2_10_38_2","unstructured":"XuK. BaJ. L. KirosR. ChoK. CourvilleA. SalakhutdinovR. ZemelR. S. andBengioY. Show attend and tell: Neural image caption generation with visual attention Proceedings of the 32nd International Conference on Machine Learning ICML 2015 July 2015 France 2048\u20132057 2-s2.0-84970002232."},{"key":"e_1_2_10_39_2","unstructured":"JaderbergM. SimonyanK. ZissermanA. andKavukcuogluK. Spatial transformer networks Proceedings of the 29th Annual Conference on Neural Information Processing Systems NIPS 2015 December 2015 Canada 2017\u20132025 2-s2.0-84965096967."},{"key":"e_1_2_10_40_2","doi-asserted-by":"crossref","unstructured":"LinT. GoyalP. GirshickR. HeK. andDollarP. Focal Loss for Dense Object Detection Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV) October 2017 Venice 2999\u20133007 https:\/\/doi.org\/10.1109\/ICCV.2017.324.","DOI":"10.1109\/ICCV.2017.324"},{"key":"e_1_2_10_41_2","doi-asserted-by":"crossref","unstructured":"LeeC.-Y.andOsinderoS. Recursive recurrent nets with attention modeling for ocr in the wild The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016.","DOI":"10.1109\/CVPR.2016.245"},{"key":"e_1_2_10_42_2","doi-asserted-by":"crossref","unstructured":"HeK. ZhangX. RenS. andSunJ. Deep residual learning for image recognition Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition CVPR 2016 July 2016 770\u2013778 2-s2.0-84986274465.","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_10_43_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_10_44_2","doi-asserted-by":"publisher","DOI":"10.1162\/153244303768966139"},{"key":"e_1_2_10_45_2","doi-asserted-by":"crossref","unstructured":"ZhouJ.andXuW. End-to-end learning of semantic role labeling using recurrent neural networks Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) July 2015 Beijing China 1127\u20131137 https:\/\/doi.org\/10.3115\/v1\/P15-1109.","DOI":"10.3115\/v1\/P15-1109"},{"key":"e_1_2_10_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/72.279181"},{"key":"e_1_2_10_47_2","unstructured":"PascanuR. MikolovT. andBengioY. On the difficulty of training recurrent neural networks Proceedings of the 30th International Conference on Machine Learning ICML 2013 June 2013 USA 2347\u20132355 2-s2.0-84897497795."},{"key":"e_1_2_10_48_2","doi-asserted-by":"crossref","unstructured":"DahlG. E. SainathT. N. andHintonG. E. Improving deep neural networks for LVCSR using rectified linear units and dropout Proceedings of the 38th IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP \u203213) May 2013 8609\u20138613 https:\/\/doi.org\/10.1109\/icassp.2013.6639346 2-s2.0-84890527827.","DOI":"10.1109\/ICASSP.2013.6639346"},{"key":"e_1_2_10_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-76153-9_28"},{"key":"e_1_2_10_50_2","doi-asserted-by":"crossref","unstructured":"ChenguangY. chinese_ocr 2018 https:\/\/github.com\/YCG09\/chinese_ocr https:\/\/doi.org\/10.1002\/cala.30776.","DOI":"10.1002\/cala.30776"},{"key":"e_1_2_10_51_2","unstructured":"SutskeverI. MartensJ. DahlG. andHintonG. On The Importance of Initialization And Momentum in Deep Learning International Conference on Machine Learning June 2013 1139\u20131147 2-s2.0-84897510162."}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2019\/9345861.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2019\/9345861.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2019\/9345861","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T14:29:03Z","timestamp":1723040943000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2019\/9345861"}},"subtitle":[],"editor":[{"given":"Li","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2019,1]]},"references-count":51,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1]]}},"alternative-id":["10.1155\/2019\/9345861"],"URL":"https:\/\/doi.org\/10.1155\/2019\/9345861","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"value":"1076-2787","type":"print"},{"value":"1099-0526","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,1]]},"assertion":[{"value":"2018-09-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-12-19","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-01-02","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"9345861"}}