{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:11:16Z","timestamp":1760710276503,"version":"3.41.0"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T00:00:00Z","timestamp":1590969600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2020,9,30]]},"abstract":"<jats:p>Recent work achieved remarkable results in training neural machine translation (NMT) systems in a fully unsupervised way, with new and dedicated architectures that only rely on monolingual corpora. However, previous work also showed that unsupervised statistical machine translation (USMT) performs better than unsupervised NMT (UNMT), especially for distant language pairs. To take advantage of the superiority of USMT over UNMT, and considering that SMT suffers from well-known limitations overcome by NMT, we propose to define UNMT as NMT trained with the supervision of synthetic parallel data generated by USMT. This way we can exploit USMT up to its limits while ultimately relying on full-fledged NMT models to generate translations. We show significant improvements in translation quality over previous work and also that further improvements can be obtained by alternatively and iteratively training USMT and UNMT. Without the need of a dedicated architecture for UNMT, our simple approach can straightforwardly benefit from any recent and future advances in supervised NMT. 
Our systems achieve a new state-of-the-art for unsupervised machine translation in all of our six translation tasks for five diverse language pairs, surpassing even supervised SMT or NMT in some tasks. Furthermore, our analysis shows how crucial the comparability between the monolingual corpora used for unsupervised training is in improving translation quality.<\/jats:p>","DOI":"10.1145\/3389790","type":"journal-article","created":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T10:14:13Z","timestamp":1591006453000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Iterative Training of Unsupervised Neural and Statistical Machine Translation Systems"],"prefix":"10.1145","volume":"19","author":[{"given":"Benjamin","family":"Marie","sequence":"first","affiliation":[{"name":"National Institute of Information and Communications Technology, Kyoto, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3799-8428","authenticated-orcid":false,"given":"Atsushi","family":"Fujita","sequence":"additional","affiliation":[{"name":"National Institute of Information and Communications Technology, Kyoto, Japan"}]}],"member":"320","published-online":{"date-parts":[[2020,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1073"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1399"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1019"},{"volume-title":"Proceedings of the 6th International Conference on Learning Representations. 
12","year":"2018","author":"Artetxe Mikel","key":"e_1_2_1_4_1"},{"volume-title":"Proceedings of the 4th Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)","author":"Barrault Lo\u00efc","key":"e_1_2_1_5_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"e_1_2_1_7_1","first-page":"18","volume-title":"Findings of the 2018 conference on machine translation (WMT18). In Proceedings of the 3rd Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, 272\u2013303","author":"Bojar Ond\u0159ej","year":"2018"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/555"},{"key":"e_1_2_1_9_1","first-page":"12","volume-title":"Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 427\u2013436","author":"Cherry Colin","year":"2012"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1119"},{"first-page":"13","volume-title":"Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 644\u2013648","author":"Dyer Chris","key":"e_1_2_1_11_1"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1045"},{"volume-title":"Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies. 260\u2013286","author":"Goto Isao","key":"e_1_2_1_13_1"},{"key":"e_1_2_1_14_1","first-page":"13","volume-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 
Association for Computational Linguistics, 690\u2013696","author":"Heafield Kenneth","year":"2013"},{"key":"e_1_2_1_15_1","first-page":"07","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association for Computational Linguistics, 144\u2013151","author":"Huang Liang","year":"2007"},{"volume-title":"Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)","author":"Johnson Howard","key":"e_1_2_1_16_1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-4020"},{"key":"e_1_2_1_18_1","first-page":"16","volume-title":"Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, 1147\u20131158","author":"Kajiwara Tomoyuki","year":"2016"},{"key":"e_1_2_1_19_1","first-page":"12","volume-title":"Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 130\u2013140","author":"Klementiev Alexandre","year":"2012"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/1557769.1557821"},{"key":"e_1_2_1_21_1","first-page":"11","volume-title":"Proceedings of the 6th Workshop on Statistical Machine Translation. Association for Computational Linguistics, 284\u2013293","author":"Lambert Patrik","year":"2011"},{"volume-title":"Cross-lingual language model pretraining. CoRR abs\/1901.07291","year":"2019","author":"Lample Guillaume","key":"e_1_2_1_22_1"},{"volume-title":"Proceedings of the 6th International Conference on Learning Representations. 
14","year":"2018","author":"Lample Guillaume","key":"e_1_2_1_23_1"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1549"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3168054"},{"volume-title":"Unsupervised neural machine translation initialized by unsupervised statistical machine translation. CoRR abs\/1810.12703","year":"2018","author":"Marie Benjamin","key":"e_1_2_1_26_1"},{"volume-title":"Advances in Neural Information Processing Systems 26","author":"Mikolov Tomas","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","first-page":"02","volume-title":"Proceedings of 40th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 295\u2013302","author":"Och Franz Josef","year":"2002"},{"key":"e_1_2_1_29_1","first-page":"02","volume-title":"Proceedings of 40th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 311\u2013318","author":"Papineni Kishore","year":"2002"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-6319"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.3301241"},{"volume-title":"Sharifah Mahani Aljunied, Luong Chi Mai, Vu Tat Thang, Nguyen Phuong Thai, Vichet Chea, Rapid Sun, Sethserey Sam, Sopheap Seng, Khin Mar Soe, Khin Thandar Nwet, Masao Utiyama, and Chenchen Ding.","year":"2016","author":"Riza Hammam","key":"e_1_2_1_32_1"},{"volume-title":"WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia. CoRR abs\/1907.05791","year":"2019","author":"Schwenk Holger","key":"e_1_2_1_33_1"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1009"},{"volume-title":"Proceedings of the 5th International Conference on Learning Representations. 
10","author":"Smith Samuel L.","key":"e_1_2_1_35_1"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1072"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1138"},{"key":"e_1_2_1_38_1","first-page":"07","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Association for Computational Linguistics, 25\u201332","author":"Ueffing Nicola","year":"2007"},{"volume-title":"Advances in Neural Information Processing Systems 30. Curran Associates","author":"Vaswani Ashish","key":"e_1_2_1_39_1"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1005"},{"key":"e_1_2_1_41_1","first-page":"12","volume-title":"Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 972\u2013983","author":"Zens Richard","year":"2012"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1166"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1176"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information 
Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3389790","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3389790","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:32Z","timestamp":1750200092000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3389790"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6]]},"references-count":43,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,9,30]]}},"alternative-id":["10.1145\/3389790"],"URL":"https:\/\/doi.org\/10.1145\/3389790","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2020,6]]},"assertion":[{"value":"2019-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}