{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T15:35:17Z","timestamp":1759160117334,"version":"3.41.2"},"reference-count":35,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,10,8]],"date-time":"2021-10-08T00:00:00Z","timestamp":1633651200000},"content-version":"vor","delay-in-days":280,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>Powerful deep learning approach frees us from feature engineering in many artificial intelligence tasks. The approach is able to extract efficient representations from the input data, if the data are large enough. Unfortunately, it is not always possible to collect large and quality data. For tasks in low\u2010resource contexts, such as the Russian\u2009\u27f6\u2009Vietnamese machine translation, insights into the data can compensate for their humble size. In this study of modelling Russian\u2009\u27f6\u2009Vietnamese translation, we leverage the input Russian words by decomposing them into not only features but also subfeatures. First, we break down a Russian word into a set of linguistic features: part\u2010of\u2010speech, morphology, dependency labels, and lemma. Second, the lemma feature is further divided into subfeatures labelled with tags corresponding to their positions in the lemma. Being consistent with the source side, Vietnamese target sentences are represented as sequences of subtokens. Sublemma\u2010based neural machine translation proves itself in our experiments on Russian\u2010Vietnamese bilingual data collected from TED talks. Experiment results reveal that the proposed model outperforms the best available Russian\u2009\u27f6\u2009Vietnamese model by 0.97 BLEU. In addition, automatic machine judgment on the experiment results is verified by human judgment. The proposed sublemma\u2010based model provides an alternative to existing models when we build translation systems from an inflectionally rich language, such as Russian, Czech, or Bulgarian, in low\u2010resource contexts.<\/jats:p>","DOI":"10.1155\/2021\/5935958","type":"journal-article","created":{"date-parts":[[2021,10,11]],"date-time":"2021-10-11T02:56:15Z","timestamp":1633920975000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Sublemma\u2010Based Neural Machine Translation"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7796-4778","authenticated-orcid":false,"given":"Thien","family":"Nguyen","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5607-1938","authenticated-orcid":false,"given":"Huu","family":"Nguyen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0793-9590","authenticated-orcid":false,"given":"Phuoc","family":"Tran","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2021,10,8]]},"reference":[{"key":"e_1_2_9_1_2","doi-asserted-by":"crossref","unstructured":"ChoK. MerrienboerB. GulcehreC. BahdanauD. BougaresF. SchwenkH. andBengioY. Learning phrase representations using RNN encoder-decoder for statistical machine translation Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014) October 2014 Doha Qatar 1724\u20131734.","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_9_2_2","doi-asserted-by":"crossref","unstructured":"LuongM.-T. PhamH. andManningC. D. Effective approaches to attention-based neural machine translation Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing September 2015 Lisbon Portugal 1412\u20131421.","DOI":"10.18653\/v1\/D15-1166"},{"key":"e_1_2_9_3_2","unstructured":"GehringJ. AuliM. GrangierD. YaratsD. andDauphinY. N. Convolutional sequence to sequence learning Proceedings of the International Conference on Machine Learning January 2017 Ho Chi Minh City Vietnam 1243\u20131252."},{"key":"e_1_2_9_4_2","unstructured":"VaswaniA. ShazeerN. ParmarN. UszkoreitJ. JonesL. GomezA. N. KaiserL. andPolosukhinI. Attention is all you need Proceedings of the Advances in neural information processing systems December 2017 Long Beach CA USA 5998\u20136008."},{"key":"e_1_2_9_5_2","doi-asserted-by":"crossref","unstructured":"GargS. PeitzS. NallasamyU. andPaulikM. Jointly learning to align and translate with transformer models Proceedings of the EMNLP-IJCNLP 2019-2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing November 2020 Hong Kong China 4453\u20134462 https:\/\/doi.org\/10.18653\/v1\/d19-1453.","DOI":"10.18653\/v1\/D19-1453"},{"key":"e_1_2_9_6_2","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/8859452"},{"key":"e_1_2_9_7_2","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/4795187"},{"key":"e_1_2_9_8_2","doi-asserted-by":"crossref","unstructured":"ReimersN.andGurevychI. Making monolingual sentence embeddings multilingual using knowledge distillation Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) November 2020 4512\u20134525.","DOI":"10.18653\/v1\/2020.emnlp-main.365"},{"key":"e_1_2_9_9_2","unstructured":"DingS. RenduchintalaA. andDuhK. A call for prudent choice of subword merge operations in neural machine translation Proceedings of the Machine Translation Summit XVII August 2019 Dublin Ireland 204\u2013213."},{"key":"e_1_2_9_10_2","doi-asserted-by":"crossref","unstructured":"WuY.andZhaoH. Finding better subword segmentation for neural machine translation Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data October 2018 Changsha China Springer 53\u201364.","DOI":"10.1007\/978-3-030-01716-3_5"},{"key":"e_1_2_9_11_2","doi-asserted-by":"crossref","unstructured":"WangC. ChoK. andGuJ. Neural machine translation with byte-level subwords Proceedings of the AAAI Conference on Artificial Intelligence February 2020 New York NY USA 9154\u20139160.","DOI":"10.1609\/aaai.v34i05.6451"},{"key":"e_1_2_9_12_2","doi-asserted-by":"crossref","unstructured":"PinnisM. Kri\u0161lauksR. DeksneD. andMiksT. Neural machine translation for morphologically rich languages with improved sub-word units and synthetic data Proceedings of the International Conference on Text Speech and Dialogue August 2017 Prague Czech Republic 237\u2013245.","DOI":"10.1007\/978-3-319-64206-2_27"},{"key":"e_1_2_9_13_2","doi-asserted-by":"crossref","unstructured":"DeguchiH. UtiyamaM. TamuraA. NinomiyaT. andSumitaE. Bilingual subword segmentation for neural machine translation Proceedings of the 28th International Conference on Computational Linguistics September 2020 Barcelona Spain 4287\u20134297.","DOI":"10.18653\/v1\/2020.coling-main.378"},{"key":"e_1_2_9_14_2","doi-asserted-by":"crossref","unstructured":"SennrichR. HaddowB. andBirchA. Neural machine translation of rare words with subword units Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics August 2016 Berlin Germany 1715\u20131725.","DOI":"10.18653\/v1\/P16-1162"},{"key":"e_1_2_9_15_2","unstructured":"HuetS. ManishinaE. andLef\u00e8vreF. Factored machine translation systems for Russian-English 2013."},{"key":"e_1_2_9_16_2","doi-asserted-by":"crossref","unstructured":"BirchA. OsborneM. andKoehnP. CCG supertags in factored statistical machine translation Proceedings of the second workshop on Statistical Machine Translation June 2007 Prague Czech Republic 9\u201316.","DOI":"10.3115\/1626355.1626357"},{"key":"e_1_2_9_17_2","unstructured":"KoehnP.andHoangH. Factored translation models Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) June 2007 Prague Czech Republic 868\u2013876."},{"key":"e_1_2_9_18_2","doi-asserted-by":"crossref","unstructured":"WangY. WangL. ZengX. WongD. F. ChaoL. S. andLuY. Factored statistical machine translation for grammatical error correction Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task June 2014 Baltimore MD USA 83\u201390.","DOI":"10.3115\/v1\/W14-1711"},{"key":"e_1_2_9_19_2","doi-asserted-by":"crossref","unstructured":"SennrichR.andHaddowB. Linguistic input features improve neural machine translation 2016 http:\/\/arxiv.org\/abs\/1606.02892 https:\/\/doi.org\/10.18653\/v1\/w16-2209.","DOI":"10.18653\/v1\/W16-2209"},{"key":"e_1_2_9_20_2","doi-asserted-by":"crossref","unstructured":"KudoT.andRichardsonJ. SentencePiece: a simple and language independent subword tokenizer and detokenizer for Neural Text Processing Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations November 2018 Brussels Belgium 66\u201371.","DOI":"10.18653\/v1\/D18-2012"},{"key":"e_1_2_9_21_2","doi-asserted-by":"publisher","DOI":"10.1155\/2016\/9821608"},{"key":"e_1_2_9_22_2","doi-asserted-by":"crossref","unstructured":"NguyenT. NguyenH. andTranP. Exploring neural machine translation on the Russian-Vietnamese language pair Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing June 2021 Sendai Japan 393\u2013400.","DOI":"10.1007\/978-981-33-6757-9_49"},{"key":"e_1_2_9_23_2","doi-asserted-by":"crossref","unstructured":"QiP. ZhangY. ZhangY. BoltonJ. andManningC. D. Stanza: a {Python} natural language processing toolkit for many human languages 2020 https:\/\/nlp.stanford.edu\/pubs\/qi2020stanza.pdf.","DOI":"10.18653\/v1\/2020.acl-demos.14"},{"key":"e_1_2_9_24_2","unstructured":"NivreJ. de MarneffeM.-C. GinterF. GoldbergY. HajicJ. ManningC. D. McDonaldR. PetrovS. PyysaloS. SilveiraN. TsarfatyR. andZemanD. Universal dependencies v1: A multilingual treebank collection Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916) May 2016 Portoro\u017e Slovenia 1659\u20131666."},{"key":"e_1_2_9_25_2","first-page":"4585","article-title":"Universal Stanford dependencies: A cross-linguistic typology","volume":"14","author":"De Marneffe M.-C.","year":"2014","journal-title":"LREC"},{"key":"e_1_2_9_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2988237"},{"key":"e_1_2_9_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3133323"},{"key":"e_1_2_9_28_2","doi-asserted-by":"publisher","DOI":"10.1155\/2021\/5515407"},{"key":"e_1_2_9_29_2","unstructured":"KleinG. KimY. DengY. NguyenV. SenellartJ. andRushA. M. OpenNMT: neural machine translation toolkit Proceedings of the 13th Conference of the Association for Machine Translation in the Americas March 2018 Boston MA USA 177\u2013184."},{"key":"e_1_2_9_30_2","unstructured":"KleinG. HernandezF. NguyenV. andSenellartJ. The OpenNMT neural machine translation toolkit: 2020 edition Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (AMTA 2020) October 2020 Orlando FL USA 102\u2013109."},{"key":"e_1_2_9_31_2","doi-asserted-by":"crossref","unstructured":"FreitagM.andAl-OnaizanY. Beam search strategies for neural machine translation Proceedings of the First Workshop on Neural Machine Translation July 2017 Melbourne Australia 56\u201360.","DOI":"10.18653\/v1\/W17-3207"},{"key":"e_1_2_9_32_2","unstructured":"M\u00fcllerR. KornblithS. andHintonG. E. When does label smoothing help? Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 NeurIPS 2019 December 2019 Vancouver Canada 4696\u20134705 https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/f1748d6b0fd9d439f71450117eba2725-Abstract.html."},{"key":"e_1_2_9_33_2","unstructured":"KingmaD. P.andBaJ. Adam: {A} method for stochastic optimization Proceedings of the 3rd International Conference on Learning Representations {ICLR} 2015 May 2015 San Diego CA USA http:\/\/arxiv.org\/abs\/1412.6980."},{"key":"e_1_2_9_34_2","doi-asserted-by":"crossref","unstructured":"PapineniK. RoukosS. WardT. andZhuW.-J. BLEU: a method for automatic evaluation of machine translation Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics July 2002 Stroudsburg PA USA 311\u2013318.","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_9_35_2","doi-asserted-by":"crossref","unstructured":"KoehnP. HoangH. BirchA. andCallison-BurchC. Moses: Open source toolkit for statistical machine translation Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions June 2007 Stroudsburg PA USA 177\u2013180.","DOI":"10.3115\/1557769.1557821"}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/5935958.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2021\/5935958.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/5935958","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,9]],"date-time":"2024-08-09T21:28:14Z","timestamp":1723238894000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/5935958"}},"subtitle":[],"editor":[{"given":"Shahzad","family":"Sarfraz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/5935958"],"URL":"https:\/\/doi.org\/10.1155\/2021\/5935958","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"type":"print","value":"1076-2787"},{"type":"electronic","value":"1099-0526"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2021-05-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-09-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"5935958"}}