{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T16:25:39Z","timestamp":1776961539270,"version":"3.51.4"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2024,6,1]],"date-time":"2024-06-01T00:00:00Z","timestamp":1717200000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2024,6,1]],"date-time":"2024-06-01T00:00:00Z","timestamp":1717200000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int. J. Mach. Learn. & Cyber."],"published-print":{"date-parts":[[2025,10]]},"DOI":"10.1007\/s13042-024-02206-3","type":"journal-article","created":{"date-parts":[[2024,6,1]],"date-time":"2024-06-01T01:01:35Z","timestamp":1717203695000},"page":"7147-7161","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["DRA: dynamic routing attention for neural machine translation with low-resource 
languages"],"prefix":"10.1007","volume":"16","author":[{"given":"Zhenhan","family":"Wang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ran","family":"Song","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhengtao","family":"Yu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cunli","family":"Mao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shengxiang","family":"Gao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,1]]},"reference":[{"key":"2206_CR1","doi-asserted-by":"publisher","unstructured":"Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: NIPS\u201914: Proceedings of the 27th International Conference on Neural Information Processing Systems, pp 3104\u20133112 . https:\/\/doi.org\/10.5555\/2969033.2969173","DOI":"10.5555\/2969033.2969173"},{"key":"2206_CR2","unstructured":"Bahdanau D, Cho KH, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of the 3rd International Conference on Learning Representations"},{"key":"2206_CR3","doi-asserted-by":"publisher","unstructured":"Gehring J, Auli M, Grangier D, Yarats D, Dauphin YN (2017) Convolutional sequence to sequence learning. In: Proceedings of the 34th International Conference on Machine Learning, pp 1243\u20131252 . https:\/\/doi.org\/10.5555\/3305381.3305510","DOI":"10.5555\/3305381.3305510"},{"key":"2206_CR4","doi-asserted-by":"publisher","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 6000\u20136010 . 
https:\/\/doi.org\/10.5555\/3295222.3295349","DOI":"10.5555\/3295222.3295349"},{"key":"2206_CR5","unstructured":"Devlin J, Chang M.W, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 4171\u20134186. https:\/\/aclanthology.org\/N19-1423"},{"key":"2206_CR6","doi-asserted-by":"crossref","unstructured":"Wang Q, Li B, Xiao T, Zhu J, Li C, Wong D.F, Chao LS (2019) Learning deep transformer models for machine translation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 1810\u20131822 . https:\/\/aclanthology.org\/P19-1176\/","DOI":"10.18653\/v1\/P19-1176"},{"key":"2206_CR7","doi-asserted-by":"crossref","unstructured":"Wu L, Wang Y, Xia Y, Tian F, Gao F, Qin T, Lai J, Liu TY (2019) Depth growing for neural machine translation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 5558\u20135563 . https:\/\/aclanthology.org\/P19-1558\/","DOI":"10.18653\/v1\/P19-1558"},{"key":"2206_CR8","doi-asserted-by":"crossref","unstructured":"Bapna A, Chen M, Firat O, Cao Y, Wu Y (2018) Training deeper neural machine translation models with transparent attention. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 3028\u20133033 . https:\/\/aclanthology.org\/D18-1338\/","DOI":"10.18653\/v1\/D18-1338"},{"key":"2206_CR9","doi-asserted-by":"crossref","unstructured":"Gu S, Zhang J, Meng F, Feng Y, Xie W, Zhou J, Yu D (2020) Token-level adaptive training for neural machine translation. In: Proceedings of the 2020 conference on empirical methods in natural language processing, pp 1035\u20131046 . 
https:\/\/aclanthology.org\/2020.emnlp-main.76","DOI":"10.18653\/v1\/2020.emnlp-main.76"},{"key":"2206_CR10","doi-asserted-by":"publisher","unstructured":"Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems, pp 3111\u20133119 . https:\/\/doi.org\/10.5555\/2999792.2999959","DOI":"10.5555\/2999792.2999959"},{"key":"2206_CR11","doi-asserted-by":"crossref","unstructured":"Parikh A, T\u00e4ckstr\u00f6m O, Das D, Uszkoreit J (2016) A decomposable attention model for natural language inference. In: Proceedings of the 2016 conference on empirical methods in natural language processing, pp 2249\u20132255 . https:\/\/aclanthology.org\/D16-1244","DOI":"10.18653\/v1\/D16-1244"},{"key":"2206_CR12","doi-asserted-by":"publisher","DOI":"10.1145\/3660522","author":"P Hao","year":"2024","unstructured":"Hao P, Zhang J, Huang X, Hao Z, Li A, Yu Z, Yu PS (2024) Unsupervised social bot detection via structural information theory. ACM Trans Inf Syst. https:\/\/doi.org\/10.1145\/3660522","journal-title":"ACM Trans Inf Syst"},{"key":"2206_CR13","doi-asserted-by":"crossref","unstructured":"Zhang B, Xiong D, Su J (2020) Neural machine translation with deep attention. In: IEEE transactions on pattern analysis and machine intelligence, pp 154\u2013163. https:\/\/ieeexplore.ieee.org\/document\/8493282","DOI":"10.1109\/TPAMI.2018.2876404"},{"key":"2206_CR14","doi-asserted-by":"crossref","unstructured":"Zhang B, Xiong D, Xie J, Su J (2020) Neural machine translation with gru-gated attention model. In: IEEE transactions on neural networks and learning systems, pp 4688\u20134698 . 
https:\/\/api.semanticscholar.org\/CorpusID:209895328","DOI":"10.1109\/TNNLS.2019.2957276"},{"key":"2206_CR15","doi-asserted-by":"crossref","unstructured":"Zhang B, Xiong D, Su J (2018) Accelerating neural transformer via an average attention network. In: Proceedings of the 56th annual meeting of the association for computational linguistics, pp. 1789\u20131798 . https:\/\/aclanthology.org\/P18-1166.pdf","DOI":"10.18653\/v1\/P18-1166"},{"key":"2206_CR16","doi-asserted-by":"crossref","unstructured":"Zhang B, Xiong D, Ge Y, Yao J, Yue H, Su J (2022) Aan+: generalized average attention network for accelerating neural transformer. J Artif Intell 677\u2013708. https:\/\/www.jair.org\/index.php\/jair\/article\/view\/13896","DOI":"10.1613\/jair.1.13896"},{"key":"2206_CR17","doi-asserted-by":"crossref","unstructured":"Lin H, Meng F, Su J, Yin Y, Yin Y, Ge Y (2020) Dynamic context-guided capsule network for multimodal machine translation. In: Proceedings of the 28th ACM international conference on multimedia, pp. 1320\u20131329 . https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3413715","DOI":"10.1145\/3394171.3413715"},{"key":"2206_CR18","unstructured":"Shaw P, Uszkoreit J, Vaswani A (2020) Self-attention with relative position representations. In: Proceedings of the 2020 conference on empirical methods in natural language processing, pp 2249\u20132255 . https:\/\/aclanthology.org\/2020.emnlp-main.80"},{"key":"2206_CR19","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770\u2013778. https:\/\/ieeexplore.ieee.org\/document\/7780459","DOI":"10.1109\/CVPR.2016.90"},{"key":"2206_CR20","doi-asserted-by":"crossref","unstructured":"Sennrich R, Haddow B, Firat O, Birch A (2016) Neural machine translation of rare words with subword units. 
In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 1715\u20131725 . https:\/\/aclanthology.org\/P16-1162\/","DOI":"10.18653\/v1\/P16-1162"},{"key":"2206_CR21","doi-asserted-by":"crossref","unstructured":"Ott M, Edunov S, Baevski A, Fan A, Gross S, Ng N, Grangier D, Auli M (2019) fairseq: A fast, extensible toolkit for sequence modeling. In: Proceedings of the 2019 Conference of the North American chapter of the association for computational linguistics, pp 48\u201353 . https:\/\/aclanthology.org\/N19-4009.pdf","DOI":"10.18653\/v1\/N19-4009"},{"key":"2206_CR22","unstructured":"Kingma DP, Ba JL (2015) Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations"},{"key":"2206_CR23","doi-asserted-by":"crossref","unstructured":"McDonald C, Chiang D (2021) Syntax-based attention masking for neural machine translation. In: Proceedings of the fourth conference on machine translation, pp 47\u201352. https:\/\/aclanthology.org\/2021.naacl-srw.7","DOI":"10.18653\/v1\/2021.naacl-srw.7"},{"key":"2206_CR24","doi-asserted-by":"crossref","unstructured":"Zhang T, Ye W, Yang B, Zhang L, Ren X, Liu D, Sun J, Zhang S, Zhang H, Zhao W (2022) Frequency-aware contrastive learning for neural machine translation. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 11712\u201311720 . https:\/\/aaai.org\/papers\/11712-frequency-aware-contrastive-learning-for-neural-machine-translation\/","DOI":"10.1609\/aaai.v36i10.21426"},{"key":"2206_CR25","doi-asserted-by":"crossref","unstructured":"Papineni K, Roukos S, Ward T, Zhu W (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp. 311\u2013318. 
https:\/\/aclanthology.org\/P02-1040.pdf","DOI":"10.3115\/1073083.1073135"},{"key":"2206_CR26","doi-asserted-by":"crossref","unstructured":"Rei R, Stewart C, Farinha A.C, Lavie A (2020) Comet: A neural framework for mt evaluation. In: Proceedings of the 2020 conference on empirical methods in natural language processing, pp. 2685\u20132702 . https:\/\/aclanthology.org\/2020.emnlp-main.213.pdf","DOI":"10.18653\/v1\/2020.emnlp-main.213"},{"key":"2206_CR27","doi-asserted-by":"crossref","unstructured":"McDonald C, Chiang D (2021) Syntax-based attention masking for neural machine translation. In: Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: Student research workshop, pp 47\u201352 . https:\/\/aclanthology.org\/2021.naacl-srw.7","DOI":"10.18653\/v1\/2021.naacl-srw.7"},{"key":"2206_CR28","unstructured":"Shiv V.L, Quirk C (2019) Novel positional encodings to enable tree-based transformers. In: Proceedings of the 33rd international conference on neural information processing systems, pp. 12081\u201312091 . https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2019\/file\/6e0917469214d8fbd8c517dcdc6b8dcf-Paper.pdf"},{"key":"2206_CR29","doi-asserted-by":"crossref","unstructured":"Wu S, Zhang D, Yang N, Li M, Zhou M (2017) Sequence-to-dependency neural machine translation. In: Proceedings of the 55th annual meeting of the association for computational linguistics, pp 698\u2013707 . https:\/\/aclanthology.org\/P17-1065\/","DOI":"10.18653\/v1\/P17-1065"},{"key":"2206_CR30","doi-asserted-by":"crossref","unstructured":"Zhang M, Li Z, Fu G, Zhang M (2019) Syntax-enhanced neural machine translation with syntax-aware word representations. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics, pp 1151\u20131161 . 
https:\/\/aclanthology.org\/N19-1118.pdf","DOI":"10.18653\/v1\/N19-1118"},{"key":"2206_CR31","doi-asserted-by":"crossref","unstructured":"Omote Y, Tamura A, Ninomiya T (2019) Dependency-based relative positional encoding for transformer nmt. In: Proceedings of the international conference on recent advances in natural language processing, pp 854\u2013861. https:\/\/acl-bg.org\/proceedings\/2019\/RANLP%202019\/pdf\/RANLP099.pdf","DOI":"10.26615\/978-954-452-056-4_099"},{"key":"2206_CR32","doi-asserted-by":"crossref","unstructured":"Luong MT, Manning CD (2016) Achieving open vocabulary neural machine translation with hybrid word-character models. In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 1054\u20131063. https:\/\/aclanthology.org\/P16-1100","DOI":"10.18653\/v1\/P16-1100"},{"key":"2206_CR33","doi-asserted-by":"crossref","unstructured":"Lee J, Cho K, Hofmann T (2017) Fully character-level neural machine translation without explicit segmentation. In: Transactions of the association for computational linguistics, pp. 365\u2013378 (2017). https:\/\/aclanthology.org\/Q17-1026\/","DOI":"10.1162\/tacl_a_00067"},{"key":"2206_CR34","doi-asserted-by":"crossref","unstructured":"Gowda T, May J (2020) Finding the optimal vocabulary size for neural machine translation. In: Findings of the association for computational linguistics: EMNLP 2020, pp 3955\u20133964. https:\/\/aclanthology.org\/2020.findings-emnlp.352","DOI":"10.18653\/v1\/2020.findings-emnlp.352"},{"key":"2206_CR35","doi-asserted-by":"crossref","unstructured":"Xu J, Zhou H, Gan C, Zheng Z, Li L (2021) Vocabulary learning via optimal transport for neural machine translation. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, pp 7361\u20137373 . 
https:\/\/aclanthology.org\/2021.acl-long.571","DOI":"10.18653\/v1\/2021.acl-long.571"},{"key":"2206_CR36","doi-asserted-by":"crossref","unstructured":"Zhang T, Zhang L, Ye W, Li B, Sun J, Zhu X, Zhao W, Zhang S (2021) Point, disambiguate and copy: Incorporating bilingual dictionaries for neural machine translation. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, pp 3970\u20133979 . https:\/\/aclanthology.org\/2021.acl-long.307","DOI":"10.18653\/v1\/2021.acl-long.307"},{"key":"2206_CR37","doi-asserted-by":"crossref","unstructured":"Zhang S, Liu Y, Meng F, Chen Y, Xu J, Liu J, Zhou J (2022) Conditional bilingual mutual information based adaptive training for neural machine translation. In: Proceedings of the 60th annual meeting of the association for computational linguistics, pp 2377\u20132389. https:\/\/aclanthology.org\/2022.acl-long.169\/","DOI":"10.18653\/v1\/2022.acl-long.169"}],"container-title":["International Journal of Machine Learning and 
Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-024-02206-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13042-024-02206-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-024-02206-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T16:57:03Z","timestamp":1760547423000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13042-024-02206-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,1]]},"references-count":37,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["2206"],"URL":"https:\/\/doi.org\/10.1007\/s13042-024-02206-3","relation":{},"ISSN":["1868-8071","1868-808X"],"issn-type":[{"value":"1868-8071","type":"print"},{"value":"1868-808X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,1]]},"assertion":[{"value":"26 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 May 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 June 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}