{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T17:16:32Z","timestamp":1771002992822,"version":"3.50.1"},"reference-count":26,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2025,5,3]],"date-time":"2025-05-03T00:00:00Z","timestamp":1746230400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Computational Methods in Sciences and Engineering"],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:p>Machine translation (MT) for underrepresented languages, such as Xhosa, presents significant challenges due to limited linguistic resources and the complex nature of these languages. This research proposes a novel approach to improve MT accuracy for Xhosa-to-English translation using Adaptive Gradient Boosted Bidirectional Encoder Representations from Transformers (AdaGrad-BBERT). This method combines the powerful capabilities of BERT with adaptive gradient boosting, enhancing contextual understanding and overall translation accuracy. The system integrates a series of preprocessing steps to optimize model performance. Initially, the dataset undergoes text cleaning, including the removal of noise, normalization of punctuation, and correction of spelling inconsistencies. Tokenization uses BERT\u2019s word-piece model, effectively handling rare and out-of-vocabulary words. Part-of-speech tagging and dependency parsing are applied to capture syntactic relationships specific to Xhosa, which has distinct grammatical structures compared to English. Pre-trained BERT embeddings are employed to generate rich, context-sensitive representations of Xhosa words, ensuring more accurate translations. 
The encoder-decoder architecture with an attention mechanism is fine-tuned using the AdaGrad-BBERT optimization technique. Translation quality is evaluated using the BLEU score, and model performance is assessed over multiple training epochs. Simulations are conducted using the TensorFlow framework to train and evaluate the model on the Xhosa-English dataset, with results demonstrating significant improvements in translation accuracy: the BLEU score reached 0.896. The proposed system highlights the potential of AdaGrad-BBERT in bridging the gap in MT for underrepresented languages, offering a scalable solution for enhancing MT. This solution holds promising applications in education, cross-cultural communication, and digital inclusion.<\/jats:p>","DOI":"10.1177\/14727978251337995","type":"journal-article","created":{"date-parts":[[2025,5,3]],"date-time":"2025-05-03T10:48:38Z","timestamp":1746269318000},"page":"4523-4538","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving machine translation accuracy for underrepresented languages in linguistic research using transformer models"],"prefix":"10.1177","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-5134-7402","authenticated-orcid":false,"given":"Yuanyuan","family":"Liu","sequence":"first","affiliation":[{"name":"Xi\u2019an Kedagaoxin University"}]}],"member":"179","published-online":{"date-parts":[[2025,5,3]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1093\/biosci\/biac062"},{"issue":"1","key":"e_1_3_4_3_2","first-page":"6","article-title":"Exclusion of the non-English-speaking world from the scientific literature: recommendations for change for addiction journals and publishers","volume":"40","author":"Bahji A","year":"2023","unstructured":"Bahji A, Acion L, Laslett AM, et al. 
Exclusion of the non-English-speaking world from the scientific literature: recommendations for change for addiction journals and publishers. Nordisk Alkohol Nark 2023; 40(1): 6\u201313.","journal-title":"Nordisk Alkohol Nark"},{"key":"e_1_3_4_4_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.3002184"},{"key":"e_1_3_4_5_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-021-98019-3"},{"key":"e_1_3_4_6_2","doi-asserted-by":"publisher","DOI":"10.1080\/10508406.2024.2346915"},{"key":"e_1_3_4_7_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-51090-4"},{"key":"e_1_3_4_8_2","doi-asserted-by":"publisher","DOI":"10.3390\/s22113995"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3348410"},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3169897"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3310244"},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2021-052315"},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3077350"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.3390\/electronics14020243"},{"issue":"4","key":"e_1_3_4_15_2","first-page":"353","article-title":"Comparative analysis of transformer models for sentiment analysis in low-resource languages","volume":"15","author":"Aliyu Y","year":"2024","unstructured":"Aliyu Y, Sarlan A, Danyaro KU, et al. Comparative analysis of transformer models for sentiment analysis in low-resource languages. 
Int J Adv Comput Sci Appl 2024; 15(4): 353\u2013364.","journal-title":"Int J Adv Comput Sci Appl"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2022.995667"},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3326104"},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3308818"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.3390\/app14051707"},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.3390\/app132312673"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.3390\/app13179530"},{"key":"e_1_3_4_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.122088"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.122417"},{"key":"e_1_3_4_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.122412"},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.121734"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.116819"},{"issue":"1","key":"e_1_3_4_27_2","first-page":"675","article-title":"English-Chinese translation quality assessment based on phrase statistical machine translation decoding algorithm","volume":"1","author":"Li J","year":"2024","unstructured":"Li J. English-Chinese translation quality assessment based on phrase statistical machine translation decoding algorithm. 
International Journal of Maritime Engineering 2024; 1(1): 675\u2013688.","journal-title":"International Journal of Maritime Engineering"}],"container-title":["Journal of Computational Methods in Sciences and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14727978251337995","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14727978251337995","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14727978251337995","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T16:31:11Z","timestamp":1771000271000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14727978251337995"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,3]]},"references-count":26,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["10.1177\/14727978251337995"],"URL":"https:\/\/doi.org\/10.1177\/14727978251337995","relation":{},"ISSN":["1472-7978","1875-8983"],"issn-type":[{"value":"1472-7978","type":"print"},{"value":"1875-8983","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,3]]}}}