{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T07:33:10Z","timestamp":1757575990583,"version":"3.41.0"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,6,19]],"date-time":"2023-06-19T00:00:00Z","timestamp":1687132800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2020AAA0108004"],"award-info":[{"award-number":["2020AAA0108004"]}]},{"name":"General Program of the National Natural Science Foundation of China","award":["61976078"],"award-info":[{"award-number":["61976078"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p>Word alignment is the task of detecting translation equivalents between the words of a sentence pair. Although word alignment is no longer strictly necessary for neural machine translation, it is still useful in a wealth of applications, e.g., bilingual lexicon induction, constrained decoding, and so on. However, the most well-known word aligners are still Giza++ and fastAlign, both of which are implementations of traditional IBM models. To keep pace with the advances in NMT, there has been a surge of interest in replacing the IBM models with neural models. We follow this trend but aim to boost the performance of word alignment between Japanese and Chinese, which share a large portion of Chinese characters. Our key idea is to leverage these common Chinese characters in both languages as an indicator for inferring alignments; i.e., source and target words that share common Chinese characters are most likely to be aligned. 
Following this idea, we propose three methods that leverage common Chinese characters to boost mBERT-based word alignment: a reward factor, representation alignment, and contrastive training. Furthermore, we annotate and release a gold-standard dataset for Japanese-Chinese word alignment. Experiments on the dataset show that our methods outperform several strong baselines in terms of AER and verify the effectiveness of exploiting common Chinese characters.<\/jats:p>","DOI":"10.1145\/3594634","type":"journal-article","created":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T12:04:55Z","timestamp":1682510695000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Multilingual BERT-based Word Alignment By Incorporating Common Chinese Characters"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2257-178X","authenticated-orcid":false,"given":"Zezhong","family":"Li","sequence":"first","affiliation":[{"name":"Hefei University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9750-7032","authenticated-orcid":false,"given":"Xiao","family":"Sun","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4860-9184","authenticated-orcid":false,"given":"Fuji","family":"Ren","sequence":"additional","affiliation":[{"name":"University of Electronic Science and Technology of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1620-8490","authenticated-orcid":false,"given":"Jianjun","family":"Ma","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8860-7805","authenticated-orcid":false,"given":"Degen","family":"Huang","sequence":"additional","affiliation":[{"name":"Dalian University of Technology, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0783-5487","authenticated-orcid":false,"given":"Piao","family":"Shi","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, China"}]}],"member":"320","published-online":{"date-parts":[[2023,6,19]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"263","article-title":"The mathematics of statistical machine translation","volume":"19","author":"Brown Peter F.","year":"1993","unstructured":"Peter F. Brown, Vincent J. Della Pietra, and Stephen A. DellaPietra. 1993. The mathematics of statistical machine translation. Computational Linguistics 19, 2 (1993), 263\u2013311.","journal-title":"Computational Linguistics"},{"key":"e_1_3_2_3_2","unstructured":"Steven Cao Nikita Kitaev and Dan Klein. 2020. Multilingual alignment of contextual word representations. In Proceedings of ICLR (ICLR\u201920) . Addis Ababa."},{"key":"e_1_3_2_4_2","first-page":"4781","volume-title":"Proceedings of ACL\/IJCNLP (ACL\/IJCNLP.21)","volume":"1","author":"Chen Chi","year":"2021","unstructured":"Chi Chen, Maosong Sun, and Yang Liu. 2021. Mask-align: Self-supervised neural word alignment. In Proceedings of ACL\/IJCNLP (ACL\/IJCNLP.21), Vol. 1, 4781\u20134791."},{"key":"e_1_3_2_5_2","unstructured":"Ting Chen Simon Kornblith Mohammad Norouzi and Geoffrey E. Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of ICML (ICML\u201920) . Vol. 119 1597\u20131607."},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"Yun Chen Yang Liu Guanhua Chen Xin Jiang and Qun Liu. 2020. Accurate word alignment induction from neural machine translation. In Proceedings of EMNLP (EMNLP\u201920) . 
566\u2013576.","DOI":"10.18653\/v1\/2020.emnlp-main.42"},{"issue":"4","key":"e_1_3_2_7_2","first-page":"16:1\u201316:25","article-title":"Chinese-Japanese machine translation exploiting chinese characters","volume":"12","author":"Chu Chenhui","year":"2013","unstructured":"Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara, and Sadao Kurohashi. 2013. Chinese-Japanese machine translation exploiting chinese characters. ACM Trans. Asian Lang. Inf. Process. 12, 4 (2013), 16:1\u201316:25.","journal-title":"ACM Trans. Asian Lang. Inf. Process."},{"key":"e_1_3_2_8_2","unstructured":"Alexis Conneau and Guillaume Lample. 2019. Cross-lingual language model pretraining. In Proceedings of NIPS (NIPS\u201919) . Vol. 1."},{"key":"e_1_3_2_9_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT (NAACL-HLT\u201919) . Vol. 32 4171\u20134186."},{"key":"e_1_3_2_10_2","first-page":"2112","volume-title":"Proceedings of EACL (EACL\u201921)","author":"Dou Zi-Yi","year":"2021","unstructured":"Zi-Yi Dou and Graham Neubig. 2021. Word alignment by fine-tuning embeddings on parallel corpora. In Proceedings of EACL (EACL\u201921). 2112\u20132128."},{"key":"e_1_3_2_11_2","unstructured":"Chris Dyer Victor Chahuneau and Noah A. Smith. 2013. A simple fast and effective reparameterization of IBM model 2. In Proceedings of HLT-NAACL (HLT-NAACL\u201913) . 644\u2013648."},{"key":"e_1_3_2_12_2","article-title":"Representation degeneration problem in training natural language generation models","volume":"1907","author":"Gao Jun","year":"2019","unstructured":"Jun Gao, Di He, Xu Tan, Tao Qin, Liwei Wang, and Tieyan Liu. 2019. Representation degeneration problem in training natural language generation models. 
CoRR abs\/1907.12009 (2019).","journal-title":"CoRR"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","unstructured":"Sarthak Garg Stephan Peitz Udhyakumar Nallasamy and Matthias Paulik. 2019. Jointly learning to align and translate with transformer models. In Proceedings of EMNLP-IJCNLP (EMNLP-IJCNLP\u201919) . Vol. 1 4452\u20134461.","DOI":"10.18653\/v1\/D19-1453"},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","unstructured":"Matthias Huck Diana Dutka and Alexander Fraser. 2018. Cross-lingual annotation projection is effective for neural part-of-speech tagging. In Proceedings of the 6th Workshop on NLP for Similar Languages Varieties and Dialects (WSSVD\u201918) . Vol. 1.","DOI":"10.18653\/v1\/W19-1425"},{"key":"e_1_3_2_15_2","doi-asserted-by":"crossref","unstructured":"Saurabh Kulshreshtha Jos\u00e9 Luis Redondo Garc\u00eda and Ching-Yun Chang. 2020. Cross-lingual alignment methods for multilingual BERT: A comparative study. In Proceedings of EMNLP (EMNLP\u201920) . 933\u2013942.","DOI":"10.18653\/v1\/2020.findings-emnlp.83"},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"Zezhong Li Fuji Ren Xiao Sun Degen Huang and Piao Shi. 2023. Exploiting Japanese-Chinese cognates with shared private representations for neural machine translation. ACM Trans. Asian Lang. Inf. Process. 22 1 (2023) 28:1\u201328:12.","DOI":"10.1145\/3533429"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Masaaki Nagata Katsuki Chousa and Masaaki Nishino. 2020. A supervised word alignment method based on cross-language span prediction using multilingual BERT. In Proceedings of EMNLP (EMNLP\u201920) . 555\u2013565.","DOI":"10.18653\/v1\/2020.emnlp-main.41"},{"key":"e_1_3_2_18_2","volume-title":"Proceedings of LREC (LREC\u201916)","author":"Nakazawa Toshiaki","year":"2016","unstructured":"Toshiaki Nakazawa, Manabu Yaguchi, Kiyotaka Uchimoto, Masao Utiyama, Eiichiro Sumita, Sadao Kurohashi, and Hitoshi Isahara. 2016. 
ASPEC: Asian scientific paper excerpt corpus. In Proceedings of LREC (LREC\u201916)."},{"key":"e_1_3_2_19_2","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1162\/089120103321337421","article-title":"A systematic comparison of various statistical alignment models","volume":"29","author":"Och Franz Josef","year":"2003","unstructured":"Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics 29, 1 (2003), 19\u201351.","journal-title":"Computational Linguistics"},{"key":"e_1_3_2_20_2","unstructured":"Tsuyoshi Okita. 2012. Annotated corpora for word alignment between Japanese and English and its evaluation with MAP-based word aligner. In Proceedings of ELRA (ELRA\u201912) ."},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","unstructured":"Santanu Pal Sudip Kumar Naskar Mihaela Vela Qun Liu and Josef van Genabith. 2017. Neural automatic post-editing using prior alignment and reranking. In Proceedings of EACL (EACL\u201917) . Vol. 2 349\u2013355.","DOI":"10.18653\/v1\/E17-2056"},{"key":"e_1_3_2_22_2","first-page":"4996","volume-title":"Proceedings of ACL (ACL\u201919)","volume":"1","author":"Pires Telmo","year":"2019","unstructured":"Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual BERT? In Proceedings of ACL (ACL\u201919), Vol. 1, 4996\u20135001."},{"key":"e_1_3_2_23_2","doi-asserted-by":"crossref","unstructured":"Masoud Jalili Sabet Philipp Dufter Fran\u00e7ois Yvon and Hinrich Sch\u00fctze. 2020. SimAlign: High quality word alignments without parallel training data using static and contextualized embeddings. In Proceedings of EMNLP (EMNLP\u201920) . 1627\u20131643.","DOI":"10.18653\/v1\/2020.findings-emnlp.147"},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","unstructured":"Mike Schuster and Kaisuke Nakajima. 2012. Japanese and Korean voice search. In Proceedings of ICASSP (ICASSP\u201912) . 
5149\u20135152.","DOI":"10.1109\/ICASSP.2012.6289079"},{"key":"e_1_3_2_25_2","unstructured":"Jinsong Su Biao Zhang Deyi Xiong Ruochen Li and Jianmin Yin. 2016. Convolution-enhanced bilingual recursive neural network for bilingual semantic modeling. In Proceedings of COLING (COLING\u201916) . 3071\u20133081."},{"key":"e_1_3_2_26_2","unstructured":"Ilya Sutskever Oriol Vinyals and Quoc V. Le. 2016. Incorporating discrete translation lexicons into neural machine translation. In Proceedings of EMNLP (EMNLP\u201916) . 1557\u20131567."},{"key":"e_1_3_2_27_2","unstructured":"Haoran Xu Benjamin Van Durme and Kenton W. Murray. 2021. BERT mBERT or BiBERT? A study on contextualized embeddings for neural machine translation. In Proceedings of EMNLP (EMNLP\u201921) . 6663\u20136675."},{"key":"e_1_3_2_28_2","article-title":"Adding interpretable attention to neural translation models improves word alignment","volume":"1901","author":"Zenkel Thomas","year":"2019","unstructured":"Thomas Zenkel, Joern Wuebker, and John DeNero. 2019. Adding interpretable attention to neural translation models improves word alignment. CoRR abs\/1901.11359 (2019).","journal-title":"CoRR"},{"key":"e_1_3_2_29_2","doi-asserted-by":"crossref","unstructured":"Biao Zhang Deyi Xiong and Jinsong Su. 2017. BattRAE: Bidimensional attention-based recursive autoencoders for learning bilingual phrase embeddings. In Proceedings of AAAI (AAAI\u201917) . 3372\u20133378.","DOI":"10.1609\/aaai.v31i1.10969"},{"issue":"2","key":"e_1_3_2_30_2","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1109\/TCYB.2018.2868982","article-title":"Alignment-supervised bidimensional attention-based recursive autoencoders for bilingual phrase representation","volume":"50","author":"Zhang Biao","year":"2020","unstructured":"Biao Zhang, Deyi Xiong, Jinsong Su, and Yue Qin. 2020. Alignment-supervised bidimensional attention-based recursive autoencoders for bilingual phrase representation. IEEE Trans. Cybern. 
50, 2 (2020), 503\u2013513.","journal-title":"IEEE Trans. Cybern."},{"key":"e_1_3_2_31_2","first-page":"11712","volume-title":"Proceedings of AAAI (AAAI\u201922)","volume":"119","author":"Zhang Tong","year":"2022","unstructured":"Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, and Wen Zhao. 2022. Frequency-aware contrastive learning for neural machine translation. In Proceedings of AAAI (AAAI\u201922), Vol. 119, 11712\u201311720."},{"key":"e_1_3_2_32_2","unstructured":"Yujie Zhang Zhulong Wang Kiyotaka Uchimoto Qing Ma and Hitoshi Isahara. 2008. Word alignment annotation in a Japanese-Chinese parallel corpus. In Proceedings of LREC (LREC\u201908) ."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594634","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3594634","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:08Z","timestamp":1750183748000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,19]]},"references-count":31,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1145\/3594634"],"URL":"https:\/\/doi.org\/10.1145\/3594634","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2023,6,19]]},"assertion":[{"value":"2022-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-04-19","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-06-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}