{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T09:56:57Z","timestamp":1772877417824,"version":"3.50.1"},"reference-count":37,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2020,10,6]],"date-time":"2020-10-06T00:00:00Z","timestamp":1601942400000},"content-version":"vor","delay-in-days":554,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Existing approaches to neural machine translation (NMT) generate the target language sequence token-by-token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts which can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional\u2013neural machine translation (SB-NMT) that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both of the history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation does not only depend on its previously generated outputs, but also relies on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. 
Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49, and 1.04 BLEU points, respectively, and obtains the state-of-the-art performance on Chinese-English and English-German translation tasks.<\/jats:p>","DOI":"10.1162\/tacl_a_00256","type":"journal-article","created":{"date-parts":[[2019,4,16]],"date-time":"2019-04-16T19:33:41Z","timestamp":1555443221000},"page":"91-105","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":57,"title":["Synchronous Bidirectional Neural Machine Translation"],"prefix":"10.1162","volume":"7","author":[{"given":"Long","family":"Zhou","sequence":"first","affiliation":[{"name":"National Laboratory of Pattern Recognition, CASIA, Beijing, China"},{"name":"University of Chinese Academy of Sciences, Beijing, China. long.zhou@nlpr.ia.ac.cn"}]},{"given":"Jiajun","family":"Zhang","sequence":"additional","affiliation":[{"name":"National Laboratory of Pattern Recognition, CASIA, Beijing, China"},{"name":"University of Chinese Academy of Sciences, Beijing, China. jjzhang@nlpr.ia.ac.cn"}]},{"given":"Chengqing","family":"Zong","sequence":"additional","affiliation":[{"name":"National Laboratory of Pattern Recognition, CASIA, Beijing, China"},{"name":"University of Chinese Academy of Sciences, Beijing, China"},{"name":"CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China. cqzong@nlpr.ia.ac.cn"}]}],"member":"281","published-online":{"date-parts":[[2019,4,1]]},"reference":[{"key":"2021060823291618800_bib1","unstructured":"Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, and Yoshua Bengio. 2017. An actor-critic algorithm for sequence prediction. In Proceedings of ICLR 2017."},{"key":"2021060823291618800_bib2","unstructured":"Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. 
Neural machine translation by jointly learning to align and translate. In Proceedings of ICLR 2015."},{"key":"2021060823291618800_bib3","doi-asserted-by":"crossref","unstructured":"Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Zhifeng Chen, Yonghui Wu, and Macduff Hughes. 2018. The best of both worlds: Combining recent advances in neural machine translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 76\u201386. Association for Computational Linguistics.","DOI":"10.18653\/v1\/P18-1008"},{"key":"2021060823291618800_bib4","doi-asserted-by":"crossref","unstructured":"Yongchao Deng, Shanbo Cheng, Jun Lu, Kai Song, Jingang Wang, Shenglan Wu, Liang Yao, Guchun Zhang, Haibo Zhang, Pei Zhang, Changfeng Zhu, and Boxing Chen. 2018. Alibaba\u2019s neural machine translation systems for wmt18. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 368\u2013376. Association for Computational Linguistics.","DOI":"10.18653\/v1\/W18-6408"},{"key":"2021060823291618800_bib5","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805."},{"key":"2021060823291618800_bib6","doi-asserted-by":"crossref","unstructured":"Andrew Finch and Eiichiro Sumita. 2009. Bidirectional phrase-based statistical machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1124\u20131132. Association for Computational Linguistics.","DOI":"10.3115\/1699648.1699658"},{"key":"2021060823291618800_bib7","unstructured":"Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. Convolutional sequence to sequence learning. 
In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 1243\u20131252, International Convention Centre, Sydney, Australia. PMLR."},{"key":"2021060823291618800_bib8","unstructured":"Di He, Hanqing Lu, Yingce Xia, Tao Qin, Liwei Wang, and Tie-Yan Liu. 2017. Decoding with value networks for neural machine translation. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 178\u2013187. Curran Associates, Inc."},{"key":"2021060823291618800_bib9","unstructured":"Cong Duy Vu Hoang, Gholamreza Haffari, and Trevor Cohn. 2017. Towards decoding as continuous optimisation in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 146\u2013156. Association for Computational Linguistics."},{"key":"2021060823291618800_bib10","doi-asserted-by":"crossref","unstructured":"Philipp Koehn and Rebecca Knowles. 2017. Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, pages 28\u201339. Association for Computational Linguistics.","DOI":"10.18653\/v1\/W17-3204"},{"key":"2021060823291618800_bib11","unstructured":"Jiwei Li, Will Monroe, and Dan Jurafsky. 2017. Learning to decode for future success. arXiv preprint arXiv:1701.06549."},{"key":"2021060823291618800_bib12","unstructured":"Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, and Max Meng. 2018. Target foresight based attention for neural machine translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1380\u20131390. Association for Computational Linguistics."},{"key":"2021060823291618800_bib13","doi-asserted-by":"crossref","unstructured":"Lemao Liu, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 
2016. Agreement on target-bidirectional neural machine translation. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 411\u2013416. Association for Computational Linguistics.","DOI":"10.18653\/v1\/N16-1046"},{"key":"2021060823291618800_bib14","doi-asserted-by":"crossref","unstructured":"Yuchen Liu, Long Zhou, Yining Wang, Yang Zhao, Jiajun Zhang, and Chengqing Zong. 2018. A comparable study on model averaging, ensembling and reranking in nmt. In Natural Language Processing and Chinese Computing, pages 299\u2013308, Cham. Springer International Publishing.","DOI":"10.1007\/978-3-319-99501-4_26"},{"key":"2021060823291618800_bib15","doi-asserted-by":"crossref","unstructured":"Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1412\u20131421. Association for Computational Linguistics.","DOI":"10.18653\/v1\/D15-1166"},{"key":"2021060823291618800_bib16","unstructured":"Haitao Mi, Baskaran Sankaran, Zhiguo Wang, and Abe Ittycheriah. 2016. Coverage embedding models for neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 955\u2013960. Association for Computational Linguistics."},{"key":"2021060823291618800_bib17","unstructured":"Jan Niehues, Eunah Cho, Thanh-Le Ha, and Alex Waibel. 2016. Pre-translation for neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1828\u20131836. The COLING 2016 Organizing Committee."},{"key":"2021060823291618800_bib18","doi-asserted-by":"crossref","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. 
In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.","DOI":"10.3115\/1073083.1073135"},{"key":"2021060823291618800_bib19","doi-asserted-by":"crossref","unstructured":"Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, and Philip Williams. 2017. The university of edinburgh\u2019s neural mt systems for wmt17. In Proceedings of the Second Conference on Machine Translation, pages 389\u2013399. Association for Computational Linguistics.","DOI":"10.18653\/v1\/W17-4739"},{"key":"2021060823291618800_bib20","doi-asserted-by":"crossref","unstructured":"Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016a. Edinburgh neural machine translation systems for wmt 16. In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, pages 371\u2013376. Association for Computational Linguistics.","DOI":"10.18653\/v1\/W16-2323"},{"key":"2021060823291618800_bib21","doi-asserted-by":"crossref","unstructured":"Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016b. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715\u20131725. Association for Computational Linguistics.","DOI":"10.18653\/v1\/P16-1162"},{"key":"2021060823291618800_bib22","unstructured":"Dmitriy Serdyuk, Nan Rosemary Ke, Alessandro Sordoni, Adam Trischler, Chris Pal, and Yoshua Bengio. 2018. Twin networks: Matching the future for sequence generation. In International Conference on Learning Representations."},{"key":"2021060823291618800_bib23","doi-asserted-by":"crossref","unstructured":"Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum risk training for neural machine translation. 
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1683\u20131692. Association for Computational Linguistics.","DOI":"10.18653\/v1\/P16-1159"},{"key":"2021060823291618800_bib24","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 3104\u20133112. Curran Associates, Inc."},{"key":"2021060823291618800_bib25","unstructured":"Zhixing Tan, Boli Wang, Jinming Hu, Yidong Chen, and Xiaodong Shi. 2017. Xmu neural machine translation systems for wmt 17. In Proceedings of the Second Conference on Machine Translation, pages 400\u2013404. Association for Computational Linguistics."},{"key":"2021060823291618800_bib26","unstructured":"Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016. Modeling coverage for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 76\u201385. Association for Computational Linguistics."},{"key":"2021060823291618800_bib27","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5998\u20136008. Curran Associates, Inc."},{"key":"2021060823291618800_bib28","doi-asserted-by":"crossref","unstructured":"Taro Watanabe and Eiichiro Sumita. 2002. Bidirectional decoding for statistical machine translation. 
In COLING 2002: The 19th International Conference on Computational Linguistics.","DOI":"10.3115\/1072228.1072278"},{"key":"2021060823291618800_bib29","unstructured":"Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144."},{"key":"2021060823291618800_bib30","unstructured":"Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, and Tie-Yan Liu. 2017. Deliberation networks: Sequence generation beyond one-pass decoding. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 1784\u20131794. Curran Associates, Inc."},{"key":"2021060823291618800_bib31","doi-asserted-by":"crossref","unstructured":"Yilin Yang, Liang Huang, and Mingbo Ma. 2018. Breaking the beam search curse: A study of (re-)scoring methods and stopping criteria for neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3054\u20133059. Association for Computational Linguistics.","DOI":"10.18653\/v1\/D18-1342"},{"key":"2021060823291618800_bib32","unstructured":"Hui Zhang, Kristina Toutanova, Chris Quirk, and Jianfeng Gao. 2013. Beyond left-to-right: Multiple decomposition structures for smt. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 12\u201321. 
Association for Computational Linguistics."},{"key":"2021060823291618800_bib33","doi-asserted-by":"crossref","unstructured":"Jiajun Zhang and Chengqing Zong. 2015. Deep neural networks in machine translation: An overview. IEEE Intelligent Systems, 30(5):16\u201325.","DOI":"10.1109\/MIS.2015.69"},{"key":"2021060823291618800_bib34","doi-asserted-by":"crossref","unstructured":"Xiangwen Zhang, Jinsong Su, Yue Qin, Yang Liu, Rongrong Ji, and Hongji Wang. 2018. Asynchronous bidirectional decoding for neural machine translation. In Proceedings of AAAI 2018.","DOI":"10.1609\/aaai.v32i1.11984"},{"key":"2021060823291618800_bib35","doi-asserted-by":"crossref","unstructured":"Zaixiang Zheng, Hao Zhou, Shujian Huang, Lili Mou, Xinyu Dai, Jiajun Chen, and Zhaopeng Tu. 2018. Modeling past and future for neural machine translation. Transactions of the Association for Computational Linguistics, 6:145\u2013157.","DOI":"10.1162\/tacl_a_00011"},{"key":"2021060823291618800_bib36","doi-asserted-by":"crossref","unstructured":"Long Zhou, Wenpeng Hu, Jiajun Zhang, and Chengqing Zong. 2017a. Neural system combination for machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 378\u2013384. Association for Computational Linguistics.","DOI":"10.18653\/v1\/P17-2060"},{"key":"2021060823291618800_bib37","doi-asserted-by":"crossref","unstructured":"Long Zhou, Jiajun Zhang, and Chengqing Zong. 2017b. Look-ahead attention for generation in neural machine translation. In Natural Language Processing and Chinese Computing, pages 211\u2013223, Cham. 
Springer International Publishing.","DOI":"10.1007\/978-3-319-73618-1_18"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00256\/1923665\/tacl_a_00256.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00256\/1923665\/tacl_a_00256.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,16]],"date-time":"2024-07-16T23:27:08Z","timestamp":1721172428000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00256\/43504\/Synchronous-Bidirectional-Neural-Machine"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,1]]},"references-count":37,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00256","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,4]]},"published":{"date-parts":[[2019,4,1]]}}}