{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:16:09Z","timestamp":1750220169368,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,12,1]],"date-time":"2022-12-01T00:00:00Z","timestamp":1669852800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,12]]},"DOI":"10.1145\/3568562.3568624","type":"proceedings-article","created":{"date-parts":[[2022,11,29]],"date-time":"2022-11-29T00:25:01Z","timestamp":1669681501000},"page":"329-335","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Improving text recognition by combining visual and linguistic features of text"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9467-4978","authenticated-orcid":false,"given":"Cong","family":"Tran","sequence":"first","affiliation":[{"name":"Posts and Telecommunications Institute of Technology, Viet Nam"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5175-8805","authenticated-orcid":false,"given":"Khanh","family":"Nguyen-Trong","sequence":"additional","affiliation":[{"name":"Posts and Telecommunications Institute of Technology, Viet Nam"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0973-0889","authenticated-orcid":false,"given":"Cuong","family":"Pham","sequence":"additional","affiliation":[{"name":"Posts and Telecommunications Institute of Technology, Viet Nam"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8924-4356","authenticated-orcid":false,"given":"Dat","family":"Tran-Anh","sequence":"additional","affiliation":[{"name":"Posts and Telecommunications Institute of Technology, Viet Nam"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6300-336X","authenticated-orcid":false,"given":"Tien","family":"Nguyen-Thi-Tan","sequence":"additional","affiliation":[{"name":"Thai Nguyen University of Medicine and Pharmacy, Viet Nam"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,12]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"DocFormer: End-to-End Transformer for Document Understanding. In 2021 IEEE\/CVF International Conference on Computer Vision (ICCV). 973\u2013983","author":"Appalaraju Srikar","year":"2021","unstructured":"Srikar Appalaraju , Bhavan Jasani , Bhargava\u00a0Urala Kota , Yusheng Xie , and R. Manmatha . 2021 . DocFormer: End-to-End Transformer for Document Understanding. In 2021 IEEE\/CVF International Conference on Computer Vision (ICCV). 973\u2013983 . https:\/\/doi.org\/10.1109\/ICCV48922. 2021 .00103 10.1109\/ICCV48922.2021.00103 Srikar Appalaraju, Bhavan Jasani, Bhargava\u00a0Urala Kota, Yusheng Xie, and R. Manmatha. 2021. DocFormer: End-to-End Transformer for Document Understanding. In 2021 IEEE\/CVF International Conference on Computer Vision (ICCV). 973\u2013983. https:\/\/doi.org\/10.1109\/ICCV48922.2021.00103"},{"key":"e_1_3_2_1_2_1","volume-title":"3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings(2015)","author":"Bahdanau Dzmitry","year":"2015","unstructured":"Dzmitry Bahdanau , Kyung\u00a0Hyun Cho , and Yoshua Bengio . 2015 . Neural machine translation by jointly learning to align and translate . 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings(2015) , 1\u201315. arxiv:1409.0473 Dzmitry Bahdanau, Kyung\u00a0Hyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings(2015), 1\u201315. arxiv:1409.0473"},{"unstructured":"Xiaoxue Chen Lianwen Jin Yuanzhi Zhu Canjie Luo and Tianwei Wang. 2020. Text Recognition in the Wild: A Survey. arxiv:2005.03492\u00a0[cs.CV]  Xiaoxue Chen Lianwen Jin Yuanzhi Zhu Canjie Luo and Tianwei Wang. 2020. Text Recognition in the Wild: A Survey. arxiv:2005.03492\u00a0[cs.CV]","key":"e_1_3_2_1_3_1"},{"unstructured":"Mengmeng Cui Wei Wang Jinjin Zhang and Liang Wang. 2021. Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition. arxiv:2106.06960\u00a0[cs.CV]  Mengmeng Cui Wei Wang Jinjin Zhang and Liang Wang. 2021. Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition. arxiv:2106.06960\u00a0[cs.CV]","key":"e_1_3_2_1_4_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_5_1","DOI":"10.1109\/ICARCV.2016.7838771"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_6_1","DOI":"10.1007\/s10032-021-00363-7"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_7_1","DOI":"10.1109\/NICS.2015.7302222"},{"unstructured":"Armand Joulin Edouard Grave Piotr Bojanowski Matthijs Douze H\u00e9rve J\u00e9gou and Tomas Mikolov. 2016. FastText.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651(2016).  Armand Joulin Edouard Grave Piotr Bojanowski Matthijs Douze H\u00e9rve J\u00e9gou and Tomas Mikolov. 2016. FastText.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651(2016).","key":"e_1_3_2_1_8_1"},{"key":"e_1_3_2_1_9_1","volume-title":"SelfDoc: Self-Supervised Document Representation Learning. In 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 5648\u20135656","author":"Li Peizhao","year":"2021","unstructured":"Peizhao Li , Jiuxiang Gu , Jason Kuen , Vlad\u00a0 I. Morariu , Handong Zhao , Rajiv Jain , Varun Manjunatha , and Hongfu Liu . 2021 . SelfDoc: Self-Supervised Document Representation Learning. In 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 5648\u20135656 . https:\/\/doi.org\/10.1109\/CVPR46437.2021.00560 10.1109\/CVPR46437.2021.00560 Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad\u00a0I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, and Hongfu Liu. 2021. SelfDoc: Self-Supervised Document Representation Learning. In 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 5648\u20135656. https:\/\/doi.org\/10.1109\/CVPR46437.2021.00560"},{"unstructured":"Yuliang* Liu Hao* Chen Chunhua Shen Tong He Lianwen Jin and Liangwei Wang. 2020. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network. arXiv preprint arXiv:2002.10200(2020).  Yuliang* Liu Hao* Chen Chunhua Shen Tong He Lianwen Jin and Liangwei Wang. 2020. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network. arXiv preprint arXiv:2002.10200(2020).","key":"e_1_3_2_1_10_1"},{"unstructured":"Vladimir Loginov. 2021. Why You Should Try the Real Data for the Scene Text Recognition. arxiv:2107.13938\u00a0[cs.CV]  Vladimir Loginov. 2021. Why You Should Try the Real Data for the Scene Text Recognition. arxiv:2107.13938\u00a0[cs.CV]","key":"e_1_3_2_1_11_1"},{"doi-asserted-by":"crossref","unstructured":"S. Manoharan. 2019. A SMART IMAGE PROCESSING ALGORITHM FOR TEXT RECOGNITION INFORMATION EXTRACTION AND VOCALIZATION FOR THE VISUALLY CHALLENGED.  S. Manoharan. 2019. A SMART IMAGE PROCESSING ALGORITHM FOR TEXT RECOGNITION INFORMATION EXTRACTION AND VOCALIZATION FOR THE VISUALLY CHALLENGED.","key":"e_1_3_2_1_12_1","DOI":"10.36548\/jiip.2019.1.004"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_13_1","DOI":"10.1016\/j.ipm.2018.06.001"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_14_1","DOI":"10.1109\/RIVF51545.2021.9642126"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_15_1","DOI":"10.1109\/CVPR46437.2021.00730"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_16_1","DOI":"10.14569\/IJACSA.2022.0130371"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_17_1","DOI":"10.1109\/JSEN.2021.3074642"},{"key":"e_1_3_2_1_18_1","article-title":"An end-to-end framework for the detection of mathematical expressions in scientific document images","volume":"39","author":"Phong Bui\u00a0Hai","year":"2022","unstructured":"Bui\u00a0Hai Phong , Thang\u00a0Manh Hoang , and Thi-Lan Le . 2022 . An end-to-end framework for the detection of mathematical expressions in scientific document images . Expert Syst. J. Knowl. Eng. 39 , 1 (2022). https:\/\/doi.org\/10.1111\/exsy.12800 10.1111\/exsy.12800 Bui\u00a0Hai Phong, Thang\u00a0Manh Hoang, and Thi-Lan Le. 2022. An end-to-end framework for the detection of mathematical expressions in scientific document images. Expert Syst. J. Knowl. Eng. 39, 1 (2022). https:\/\/doi.org\/10.1111\/exsy.12800","journal-title":"Expert Syst. J. Knowl. Eng."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_19_1","DOI":"10.1109\/ICCPCT.2017.8074355"},{"doi-asserted-by":"crossref","unstructured":"K. Shanmugam and B. Vanathi. 2019. Hardcopy Text Recognition and Vocalization for Visually Impaired and Illiterates in Bilingual Language. Springer International Publishing Cham 151-163. https:\/\/doi.org\/10.1007\/978-3-030-02674-5_11 10.1007\/978-3-030-02674-5_11","key":"#cr-split#-e_1_3_2_1_20_1.1","DOI":"10.1007\/978-3-030-02674-5_11"},{"doi-asserted-by":"crossref","unstructured":"K. Shanmugam and B. Vanathi. 2019. Hardcopy Text Recognition and Vocalization for Visually Impaired and Illiterates in Bilingual Language. Springer International Publishing Cham 151-163. https:\/\/doi.org\/10.1007\/978-3-030-02674-5_11","key":"#cr-split#-e_1_3_2_1_20_1.2","DOI":"10.1007\/978-3-030-02674-5_11"},{"doi-asserted-by":"crossref","unstructured":"Zhi Tian Weilin Huang Tong He Pan He and Yu Qiao. 2016. Detecting Text in Natural Image with Connectionist Text Proposal Network. arxiv:1609.03605\u00a0[cs.CV]  Zhi Tian Weilin Huang Tong He Pan He and Yu Qiao. 2016. Detecting Text in Natural Image with Connectionist Text Proposal Network. arxiv:1609.03605\u00a0[cs.CV]","key":"e_1_3_2_1_21_1","DOI":"10.1007\/978-3-319-46484-8_4"},{"key":"e_1_3_2_1_22_1","volume-title":"RIVF International Conference on Computing and Communication Technologies, RIVF 2021","author":"Tran Bao\u00a0Hieu","year":"2021","unstructured":"Bao\u00a0Hieu Tran , Duc\u00a0Viet Hoang , Nguyen\u00a0Manh Hiep , Pham Ngoc\u00a0Bao Anh , Hoang\u00a0Gia Bao , Nguyen\u00a0Duc Anh , Bui\u00a0Hai Phong , Thanh-Hung Nguyen , Phi\u00a0Le Nguyen , and Thi-Lan Le . 2021 . MC-OCR Challenge 2021: A Multi-modal Approach for Mobile-Captured Vietnamese Receipts Recognition . In RIVF International Conference on Computing and Communication Technologies, RIVF 2021 , Hanoi, Vietnam , August 19-21, 2021. IEEE, 1\u20136. https:\/\/doi.org\/10.1109\/RIVF51545.2021.9642088 10.1109\/RIVF51545.2021.9642088 Bao\u00a0Hieu Tran, Duc\u00a0Viet Hoang, Nguyen\u00a0Manh Hiep, Pham Ngoc\u00a0Bao Anh, Hoang\u00a0Gia Bao, Nguyen\u00a0Duc Anh, Bui\u00a0Hai Phong, Thanh-Hung Nguyen, Phi\u00a0Le Nguyen, and Thi-Lan Le. 2021. MC-OCR Challenge 2021: A Multi-modal Approach for Mobile-Captured Vietnamese Receipts Recognition. In RIVF International Conference on Computing and Communication Technologies, RIVF 2021, Hanoi, Vietnam, August 19-21, 2021. IEEE, 1\u20136. https:\/\/doi.org\/10.1109\/RIVF51545.2021.9642088"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_23_1","DOI":"10.1016\/j.pmcj.2022.101685"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_24_1","DOI":"10.1109\/ICCDW45521.2020.9318706"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_25_1","DOI":"10.1007\/s10032-021-00363-7"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_26_1","DOI":"10.1145\/3394486.3403172"},{"key":"e_1_3_2_1_27_1","volume-title":"PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks. (4","author":"Yu Wenwen","year":"2020","unstructured":"Wenwen Yu , Ning Lu , Xianbiao Qi , Ping Gong , and Rong Xiao . 2020 . PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks. (4 2020). http:\/\/arxiv.org\/abs\/2004.07464 Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, and Rong Xiao. 2020. PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks. (4 2020). http:\/\/arxiv.org\/abs\/2004.07464"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_28_1","DOI":"10.1007\/s11704-015-4488-0"}],"event":{"acronym":"SoICT 2022","name":"SoICT 2022: The 11th International Symposium on Information and Communication Technology","location":"Hanoi Vietnam"},"container-title":["The 11th International Symposium on Information and Communication Technology"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3568562.3568624","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3568562.3568624","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:39Z","timestamp":1750186839000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3568562.3568624"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12]]},"references-count":29,"alternative-id":["10.1145\/3568562.3568624","10.1145\/3568562"],"URL":"https:\/\/doi.org\/10.1145\/3568562.3568624","relation":{},"subject":[],"published":{"date-parts":[[2022,12]]},"assertion":[{"value":"2022-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}