{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T10:52:20Z","timestamp":1761648740120,"version":"3.41.0"},"reference-count":27,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2020,4,7]],"date-time":"2020-04-07T00:00:00Z","timestamp":1586217600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National key research and development plan project","award":["2019QY1800"],"award-info":[{"award-number":["2019QY1800"]}]},{"DOI":"10.13039\/501100005273","name":"Natural Science Foundation of Yunnan Province","doi-asserted-by":"crossref","award":["2018FB104"],"award-info":[{"award-number":["2018FB104"]}],"id":[{"id":"10.13039\/501100005273","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61732005, 61672271, 61761026, 61762056, and 61866020"],"award-info":[{"award-number":["61732005, 61672271, 61761026, 61762056, and 61866020"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Yunnan hightech industry development project","award":["201606"],"award-info":[{"award-number":["201606"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2020,5,31]]},"abstract":"<jats:p>In Neural Machine Translation (NMT), due to the limitations of the vocabulary, unknown words cannot be translated properly, which brings suboptimal performance of the translation system. For resource-scarce NMT that have small-scale training corpus, the effect is amplified. 
The traditional approach of amplifying the scale of the corpus is not applicable, because the parallel corpus is difficult to obtain in a resource-scarce setting; however, it is easy to obtain and utilize external knowledge, bilingual lexicon, and other resources. Therefore, we propose classification lexicon approach for processing unknown words in the Chinese-Vietnamese NMT task. Specifically, three types of unknown Chinese-Vietnamese words are classified and their corresponding classification lexicon are constructed by word alignment, Wikipedia extraction, and rule-based methods, respectively. After translation, the unknown words are restored by lexicon for post-processing. Experiment results on Chinese-Vietnamese, English-Vietnamese, and Mongolian-Chinese translations show that our approach significantly improves the accuracy and the performance of NMT especially in a resource-scarce setting.<\/jats:p>","DOI":"10.1145\/3373267","type":"journal-article","created":{"date-parts":[[2020,4,7]],"date-time":"2020-04-07T18:40:03Z","timestamp":1586284803000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Towards Integrated Classification Lexicon for Handling Unknown Words in Chinese-Vietnamese Neural Machine Translation"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4012-461X","authenticated-orcid":false,"given":"Wanjin","family":"Che","sequence":"first","affiliation":[{"name":"Faculty of Information Engineering and Automation, Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, China"}]},{"given":"Zhengtao","family":"Yu","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, 
China"}]},{"given":"Zhiqiang","family":"Yu","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, China"}]},{"given":"Yonghua","family":"Wen","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, China"}]},{"given":"Junjun","family":"Guo","sequence":"additional","affiliation":[{"name":"Faculty of Information Engineering and Automation, Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, China"}]}],"member":"320","published-online":{"date-parts":[[2020,4,7]]},"reference":[{"volume-title":"Proceedings of the EMNLP.","year":"2013","author":"Kalchbrenner Nal","key":"e_1_2_1_1_1"},{"volume-title":"Proceedings of the ICLR.","year":"2015","author":"Bahdanau Dzmitry","key":"e_1_2_1_2_1"},{"volume-title":"Le","year":"2014","author":"Sutskever Ilya","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_1_5_1","unstructured":"Y. Tang F. Meng Z. Lu et al. 2016. Neural machine translation with external phrase memory. arXiv eprint arXiv:1606.01792."},{"key":"e_1_2_1_6_1","unstructured":"J. Zhang and C. Zong. 2016. Bridging neural machine translation and bilingual lexicon. arxiv:1610.0272."},{"volume-title":"Proceedings of the IJCAI.","author":"Li X.","key":"e_1_2_1_7_1"},{"volume-title":"Proceedings of the IJCAI. 1629--1634","author":"Jiang L.","key":"e_1_2_1_8_1"},{"key":"e_1_2_1_9_1","first-page":"3","article-title":"2011. 
Using sublexical translations to handle the OOV problem in machine translation","volume":"10","author":"Huang C.","year":"2011","journal-title":"ACM Trans. Asian Lang. Inform."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1060"},{"volume-title":"Proceedings of the EAMT.","year":"2015","author":"Pinnis Marcis","key":"e_1_2_1_11_1"},{"volume-title":"Proceedings of the ACL. 322--327","year":"2012","author":"Stallard David","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1014"},{"volume-title":"Proceedings of the AISTATS. 246--252","year":"2005","author":"Morin Frederic","key":"e_1_2_1_14_1"},{"volume-title":"Proceedings of the NIPS. 1684--1692","author":"Hermann Karl Moritz","key":"e_1_2_1_15_1"},{"volume-title":"Proceedings of the ACL.","author":"Sennrich R.","key":"e_1_2_1_16_1"},{"volume-title":"Manning","year":"2016","author":"Luong Minh-Thang","key":"e_1_2_1_17_1"},{"volume-title":"Fonollosa","year":"2016","author":"Marta R.","key":"e_1_2_1_18_1"},{"volume-title":"Black","year":"2016","author":"Ling Wang","key":"e_1_2_1_19_1"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1160"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1002"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1001"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1207"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1179"},{"volume-title":"Proceedings of the AAAI.","year":"2017","author":"Zhang Meng","key":"e_1_2_1_25_1"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1062"},{"volume-title":"Proceedings of the ACL.","year":"2012","author":"Xiao Tong","key":"e_1_2_1_27_1"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information 
Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3373267","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3373267","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:32:58Z","timestamp":1750199578000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3373267"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,7]]},"references-count":27,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,5,31]]}},"alternative-id":["10.1145\/3373267"],"URL":"https:\/\/doi.org\/10.1145\/3373267","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2020,4,7]]},"assertion":[{"value":"2019-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-04-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}