{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T17:18:44Z","timestamp":1771003124823,"version":"3.50.1"},"reference-count":31,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Computational Methods in Sciences and Engineering"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:p>The voice recognition of non-native tongue English learners is an important challenge in the field of speech recognition. Existing technology still has obvious defects when dealing with non-native tongue pronunciation. This study combined BERT (Bidirectional Encoder Representations from Transformers) and CNN-HMM models and introduced an attention mechanism to improve the accuracy of voice recognition of non-native tongue English learners. It used the pre-trained BERT model to extract the context of the voice signal and used the CNN (Convolutional Neural Network) for local feature extraction, and used the Hidden Markov model (HMM) to make a sequence model building model to capture ability of key features. The experimental results show that the accuracy rate of voice recognition of the BERT-CNN-HMM model in this article reaches 88.9% under normal speed, which is significantly better than 78.5% of the traditional HMM model. Under different noise levels, the accuracy of the article\u2019s model in low noise, medium noise, and high noise environments is 86.9%, 80.5%, and 72.8%, respectively, which are higher than other comparative models. 
The model maintained an accuracy above 83.8% when processing different accents, showing strong adaptability and generality. The experimental results demonstrate that the proposed model can significantly improve the accuracy of voice recognition for non-native English learners and provides a new direction for further research in the field of voice recognition.<\/jats:p>","DOI":"10.1177\/14727978251321955","type":"journal-article","created":{"date-parts":[[2025,3,12]],"date-time":"2025-03-12T02:06:22Z","timestamp":1741745182000},"page":"944-962","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Apply BERT optimization algorithm to improve the accuracy of speech recognition for non-native English learners"],"prefix":"10.1177","volume":"25","author":[{"given":"Zhenli","family":"Tao","sequence":"first","affiliation":[{"name":"Wuhan University of Bioengineering"}]}],"member":"179","published-online":{"date-parts":[[2025,3,11]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1080\/15472450.2019.1646132"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2020.3037031"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2911077"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.15858\/engtea.78.1.202303.3"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.3233\/JIFS-189388"},{"issue":"1","key":"e_1_3_2_7_2","first-page":"31","article-title":"Investigating the efficacy of using online resources for production training in learning non-native vowel contrasts","volume":"11","author":"Aljohani A","year":"2023","unstructured":"Aljohani A, Alshangiti W. Investigating the efficacy of using online resources for production training in learning non-native vowel contrasts. 
Int J Engl Lang Educ 2023; 11(1): 31\u201352.","journal-title":"Int J Engl Lang Educ"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1080\/09588221.2020.1839504"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1080\/15475441.2022.2107522"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3488380"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/bxac013"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-022-00290-6"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-021-02460-w"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-021-11771-6"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2022.3186162"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2020.2992900"},{"issue":"2","key":"e_1_3_2_17_2","first-page":"73","article-title":"Students\u2019 perceptions on the use of tongue twisters in learning pronunciation","volume":"1","author":"Amar LN","year":"2019","unstructured":"Amar LN, Mu\u2019in F, Asmi R. Students\u2019 perceptions on the use of tongue twisters in learning pronunciation. Ling Educ 2019; 1(2): 73\u201392.","journal-title":"Ling Educ"},{"issue":"5","key":"e_1_3_2_18_2","first-page":"8128","article-title":"A review for reduction of noise by wavelet transform in audio signals","volume":"6","author":"Thu LN","year":"2019","unstructured":"Thu LN, Win A, Oo HN. A review for reduction of noise by wavelet transform in audio signals. International Research Journal of Engineering and Technology 2019; 6(5): 8128\u20138131.","journal-title":"International Research Journal of Engineering and Technology"},{"issue":"3","key":"e_1_3_2_19_2","first-page":"653","article-title":"Speech quality enhancement through noise cancellation using an adaptive algorithm","volume":"49","author":"Kapoor J","year":"2022","unstructured":"Kapoor J, Pathak A, Rai M, et al. 
Speech quality enhancement through noise cancellation using an adaptive algorithm. IAENG Int J Comput Sci 2022; 49(3): 653\u2013665.","journal-title":"IAENG Int J Comput Sci"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSEN.2020.3009112"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1121\/10.0013874"},{"issue":"4","key":"e_1_3_2_22_2","first-page":"49","article-title":"Analysis of digital voice features extraction methods","volume":"1","author":"Shayeb JNI","year":"2019","unstructured":"Shayeb JNI, Alqadi Z, Nader J. Analysis of digital voice features extraction methods. Int J Educ Res 2019; 1(4): 49\u201355.","journal-title":"Int J Educ Res"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-021-10781-8"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-021-09958-2"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btab133"},{"issue":"2","key":"e_1_3_2_26_2","doi-asserted-by":"crossref","first-page":"13","DOI":"10.34257\/GJCSTDVOL19IS2PG13","article-title":"Classification of image using convolutional neural network (CNN)","volume":"19","author":"Hossain MA","year":"2019","unstructured":"Hossain MA, Alam Sajib MS. Classification of image using convolutional neural network (CNN). 
Global J Comput Sci Technol 2019; 19(2): 13\u201314.","journal-title":"Global J Comput Sci Technol"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3030418"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1111\/ele.13610"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1080\/03610918.2019.1586926"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1049\/sil2.12129"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1080\/0952813X.2020.1744198"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.spinee.2021.02.007"}],"container-title":["Journal of Computational Methods in Sciences and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14727978251321955","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14727978251321955","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14727978251321955","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T16:31:29Z","timestamp":1771000289000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14727978251321955"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["10.1177\/14727978251321955"],"URL":"https:\/\/doi.org\/10.1177\/14727978251321955","relation":{},"ISSN":["1472-7978","1875-8983"],"issn-type":[{"value":"1472-7978","type":"print"},{"value":"1875-8983","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1]]}}}