{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T15:18:42Z","timestamp":1759331922309,"version":"3.41.0"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,3,15]],"date-time":"2021-03-15T00:00:00Z","timestamp":1615766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61762076"],"award-info":[{"award-number":["61762076"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Fundamental Research Funds for the Central Universities of Northwest MinZu University","award":["31920190114, 31920190112"],"award-info":[{"award-number":["31920190114, 31920190112"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2021,3,31]]},"abstract":"<jats:p>Subword segmentation plays an important role in Tibetan neural machine translation (NMT). The structure of Tibetan words consists of two levels. First, words consist of a sequence of syllables, and then a syllable consists of a sequence of characters. According to this special word structure, we propose two methods for Tibetan subword segmentation, namely syllable-based and character-based methods. The former generates subwords based on the Tibetan syllables, and the latter is based on Tibetan characters. In addition, we carry out experiments with these two subword segmentation methods on low-resource Tibetan-to-Chinese NMT, respectively. 
The experimental results show that both methods improve translation performance, with character-based subword segmentation achieving the better results. Overall, our proposed character-based subword segmentation is simpler and more effective. Moreover, it achieves better experimental results without requiring much attention to the linguistic features of Tibetan.<\/jats:p>","DOI":"10.1145\/3448216","type":"journal-article","created":{"date-parts":[[2021,3,15]],"date-time":"2021-03-15T18:52:49Z","timestamp":1615834369000},"page":"1-11","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Finding Better Subwords for Tibetan Neural Machine Translation"],"prefix":"10.1145","volume":"20","author":[{"given":"Yachao","family":"Li","sequence":"first","affiliation":[{"name":"China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jing","family":"Jiang","sequence":"additional","affiliation":[{"name":"China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jia","family":"Yangji","sequence":"additional","affiliation":[{"name":"China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ning","family":"Ma","sequence":"additional","affiliation":[{"name":"China\u2019s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,3,15]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Jamie Ryan Kiros, and Geoffrey E. 
Hinton","author":"Ba Jimmy Lei","year":"2016","unstructured":"Jimmy Lei Ba , Jamie Ryan Kiros, and Geoffrey E. Hinton . 2016 . Layer normalization. arXiv:1607.06450. Retrieved from https:\/\/arxiv.org\/abs\/1607.06450. Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. 2016. Layer normalization. arXiv:1607.06450. Retrieved from https:\/\/arxiv.org\/abs\/1607.06450."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of ICLR.","author":"Bahdanau Dzmitry","year":"2015","unstructured":"Dzmitry Bahdanau , Kyunghyun Cho , and Yoshua Bengio . 2015 . Neural machine translation by jointly learning to align and translate . In Proceedings of ICLR. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of ICLR."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1338"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.279181"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1008"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of SSST. 103\u2013111","author":"Cho Kyunghyun","year":"2014","unstructured":"Kyunghyun Cho , Bart van Merrienboer , Dzmitry Bahdanau , and Yoshua Bengio . 2014 . On the properties of neural machinetranslation: Encoder-decoder approaches . In Proceedings of SSST. 103\u2013111 . Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machinetranslation: Encoder-decoder approaches. In Proceedings of SSST. 103\u2013111."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1012"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_10_1","first-page":"64","article-title":"Segmenting and POS tagging classical tibetan using a memory-based tagger. 
Himalay","volume":"16","author":"Hill Nathan","year":"2017","unstructured":"Nathan Hill and Marieke Meelen . 2017 . Segmenting and POS tagging classical tibetan using a memory-based tagger. Himalay . Ling. 16 , 2 (2017), 64 -- 86 . DOI:https:\/\/doi.org\/10.5070\/H916234501 10.5070\/H916234501 Nathan Hill and Marieke Meelen. 2017. Segmenting and POS tagging classical tibetan using a memory-based tagger. Himalay. Ling. 16, 2 (2017), 64--86. DOI:https:\/\/doi.org\/10.5070\/H916234501","journal-title":"Ling."},{"key":"e_1_2_1_11_1","volume-title":"Salakhutdinov","author":"Hinton Geoffrey E.","year":"2012","unstructured":"Geoffrey E. Hinton , Nitish Srivastava , Alex Krizhevsky , Ilya Sutskever , and Ruslan R . Salakhutdinov . 2012 . Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580 (2012). Retrieved from https:\/\/arxiv.org\/abs\/1207.0580. Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580 (2012). Retrieved from https:\/\/arxiv.org\/abs\/1207.0580."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/IALP.2013.74"},{"key":"e_1_2_1_13_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba . 2015 . Adam : A method for stochastic optimization. In Proceedings of ICLR. Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of ICLR."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-4012"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1007"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of LoResMT. Association for Machine Translation in the Americas, 21\u201329","author":"Lai Wen","year":"2018","unstructured":"Wen Lai , Xiaobing Zhao , and Wei Bao . 2018 . 
Tibetan-chinese neural machine translation based on syllable segmentation . In Proceedings of LoResMT. Association for Machine Translation in the Americas, 21\u201329 . Wen Lai, Xiaobing Zhao, and Wei Bao. 2018. Tibetan-chinese neural machine translation based on syllable segmentation. In Proceedings of LoResMT. Association for Machine Translation in the Americas, 21\u201329."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1064"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of COLING. Association for Computational Linguistics, 3038\u20133048","author":"Li Yachao","year":"2018","unstructured":"Yachao Li , Junhui Li , and Min Zhang . 2018 . Adaptive weighting for neural machine translation . In Proceedings of COLING. Association for Computational Linguistics, 3038\u20133048 . Yachao Li, Junhui Li, and Min Zhang. 2018. Adaptive weighting for neural machine translation. In Proceedings of COLING. Association for Computational Linguistics, 3038\u20133048."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377851"},{"key":"e_1_2_1_20_1","first-page":"104","article-title":"Research on tibetan-chinese neural machine translation","volume":"31","author":"Li Yachao","year":"2017","unstructured":"Yachao Li , Deyi Xiong , Min Zhang , Jing Jiang , Ning Ma , and Jianmin Yin . 2017 . Research on tibetan-chinese neural machine translation . J. Chin. Inf. Process. 31 , 6 (2017), 104 \u2013 109 . Yachao Li, Deyi Xiong, Min Zhang, Jing Jiang, Ning Ma, and Jianmin Yin. 2017. Research on tibetan-chinese neural machine translation. J. Chin. Inf. Process. 31, 6 (2017), 104\u2013109.","journal-title":"J. Chin. Inf. Process."},{"key":"e_1_2_1_21_1","first-page":"52","article-title":"Research and implementation of tibetan automatic word segmentation based on conditional random field","volume":"27","author":"Li Yachao","year":"2013","unstructured":"Yachao Li , Jam Yangkyi , Chengqing Zong , and Hongzhi Yu . 2013 . 
Research and implementation of tibetan automatic word segmentation based on conditional random field . J. Chin. Inf. Process. 27 , 4 (2013), 52 \u2013 58 . Yachao Li, Jam Yangkyi, Chengqing Zong, and Hongzhi Yu. 2013. Research and implementation of tibetan automatic word segmentation based on conditional random field. J. Chin. Inf. Process. 27, 4 (2013), 52\u201358.","journal-title":"J. Chin. Inf. Process."},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. Institute of Digital Enhancement of Cognitive Processing, 168\u2013177","author":"Liu Huidan","year":"2011","unstructured":"Huidan Liu , Minghua Nuo , Longlong Ma , Jian Wu , and Yeping He . 2011 . Tibetan word segmentation as syllable tagging using conditional random field . In Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. Institute of Digital Enhancement of Cognitive Processing, 168\u2013177 . Huidan Liu, Minghua Nuo, Longlong Ma, Jian Wu, and Yeping He. 2011. Tibetan word segmentation as syllable tagging using conditional random field. In Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. Institute of Digital Enhancement of Cognitive Processing, 168\u2013177."},{"key":"e_1_2_1_23_1","first-page":"1412","volume-title":"Proceedings of the EMNLP. Association for Computational Linguistics, 1412\u20131421","author":"Luong Thang","year":"2015","unstructured":"Thang Luong , Hieu Pham , and Christopher D. Manning . 2015. Effective approaches to attention-based neural machine translation . In Proceedings of the EMNLP. Association for Computational Linguistics, 1412\u20131421 . DOI:https:\/\/doi.org\/10.18653\/v1\/D15-1166 Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proceedings of the EMNLP. Association for Computational Linguistics, 1412\u20131421. 
DOI:https:\/\/doi.org\/10.18653\/v1\/D15-1166"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1162"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1013"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1624"},{"key":"e_1_2_1_30_1","volume-title":"et\u00a0al","author":"Wu Yonghui","year":"2016","unstructured":"Yonghui Wu , Mike Schuster , Zhifeng Chen , et\u00a0al . 2016 . Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. In arXiv:1609.08144. Retrieved from https:\/\/arxiv.org\/abs\/1609.08144. Yonghui Wu, Mike Schuster, Zhifeng Chen, et\u00a0al. 2016. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. In arXiv:1609.08144. 
Retrieved from https:\/\/arxiv.org\/abs\/1609.08144."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01716-3_5"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00105"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3448216","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3448216","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:40Z","timestamp":1750193260000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3448216"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,15]]},"references-count":32,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,3,31]]}},"alternative-id":["10.1145\/3448216"],"URL":"https:\/\/doi.org\/10.1145\/3448216","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2021,3,15]]},"assertion":[{"value":"2020-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}