{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:48:29Z","timestamp":1777704509423,"version":"3.51.4"},"reference-count":21,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2020,3,30]],"date-time":"2020-03-30T00:00:00Z","timestamp":1585526400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2020,5,29]]},"abstract":"<jats:p>Speech analysis for extracting attributes such as the speaker, gender, accent and like has been a field of great interest and has been widely studied. The paper presents a novel architecture for accent identification by using a cascade of two deep-learning architecture. We design and test our proposed architecture on common voice dataset. The architecture consists of a cascade of Convolutional Neural Network (CNN) and Convolutional Recurrent Neural Network (CRNN). It is trained on Mel-spectrogram of the audios. We consider five of the most popular English accents groups namely India, Australia, US, England, Canada in this study. The proposed model has an accuracy of 78.48% using CNN and 83.21% using CRNN.<\/jats:p>","DOI":"10.3233\/jifs-179715","type":"journal-article","created":{"date-parts":[[2020,3,31]],"date-time":"2020-03-31T15:07:19Z","timestamp":1585667239000},"page":"6347-6352","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["Foreign accent classification using deep neural nets"],"prefix":"10.1177","volume":"38","author":[{"given":"Utkarsh","family":"Singh","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, National Institute of Technology Silchar, Silchar, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Akshay","family":"Gupta","sequence":"additional","affiliation":[{"name":"Electronics and Instrumentation Engineering, National Institute of Technology Silchar, Silchar, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dipjyoti","family":"Bisharad","sequence":"additional","affiliation":[{"name":"Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wasim","family":"Arif","sequence":"additional","affiliation":[{"name":"Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,3,30]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/AUTOID.2005.10"},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","unstructured":"PedersenC. and DiederichJ. Accent Classification Using Support Vector Machines in 6th IEEE\/ACIS International Conference on Computer and Information Science (ICIS 2007) no. July. IEEE (2007) pp 444\u2013449.","DOI":"10.1109\/ICIS.2007.47"},{"key":"e_1_3_2_4_2","doi-asserted-by":"crossref","unstructured":"NajafianM. SafaviS. WeberP. and RussellM. Identification of British English regional accents using fusion of i-vector and multi-accent phonotactic systems in Proceedings of Odyssey - The Speaker and Language Recognition Workshop no. June (2016) pp 132\u2013139.","DOI":"10.21437\/Odyssey.2016-19"},{"key":"e_1_3_2_5_2","doi-asserted-by":"crossref","unstructured":"Torres-CarrasquilloP.A. SturimD. ReynoldsD.A. McCreeA. Eigen-channel compensation and discriminatively trained Gaussian mixture models for dialect and accent recognition in Proceedings of the Annual Conference of the International Speech Communication Association INTERSPEECH no. 1 (2008) pp 723\u2013726.","DOI":"10.21437\/Interspeech.2008-226"},{"key":"e_1_3_2_6_2","first-page":"338","article-title":"Improved Vector Quantization Approach for Discrete HMM Speech Recognition System","volume":"4","author":"Debyeche M.","year":"2007","unstructured":"DebyecheM., HatonJ.P. and HouacineA., Improved Vector Quantization Approach for Discrete HMM Speech Recognition System, Int Arab J Inf Technol4.4 (2007), 338\u2013344.","journal-title":"Int Arab J Inf Technol"},{"key":"e_1_3_2_7_2","unstructured":"ChuA. LaiP. and LeD. Accent Classification of Non-Native English Speakers 1\u20138 (2017) (unpublished)"},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"ChanM.V. FengX. HeinenJ.A. and NiederjohnR.J. Classification of SpeechAccentswithNeuralNetworks in Proc. ICNN\u201994 International Conference on Neural Networks (1994) pp. 4483\u20134486.","DOI":"10.1109\/ICNN.1994.374994"},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","unstructured":"KaoC.-C. WangW. SunM. and WangC. RCRNN: Region-based Convolutional Recurrent Neural Network for Audio Event Detection 8 (2018) 1\u20135.","DOI":"10.21437\/Interspeech.2018-2323"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2015.7301268"},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","unstructured":"BartzC. HeroldT. YangH. and MeinelC. Language Identification Using Deep Convolutional Recurrent Neural Networks in Neural Information Processing Springer International Publishing (2017) pp 880\u2013890.","DOI":"10.1007\/978-3-319-70136-3_93"},{"issue":"1","key":"e_1_3_2_12_2","article-title":"Convolutional Recurrent Neural Networks for Music Classification, ICASSP","volume":"136","author":"Choi K.","year":"2016","unstructured":"ChoiK., FazekasG., SandlerM. and ChoK., Convolutional Recurrent Neural Networks for Music Classification, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 136(1) 2392\u20132396, 9, (2016).","journal-title":"IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings"},{"key":"e_1_3_2_13_2","unstructured":"VoglR. DorferM. WidmerG. and KneesP. Drum Transcription via Joint Beat and Drum Modeling using Convolutional Recurrent Neural Networks in Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference (2017) pp. 150\u2013157."},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","unstructured":"NajafianM. SafaviS. HansenJ.H. and RussellM. Improving speech recognition using limited accent diverse British English training data with deep neural networks in IEEE International Workshop on Machine Learning for Signal Processing MLSP vol. 2016-November IEEE 9 (2016) pp 1\u20136.","DOI":"10.1109\/MLSP.2016.7738854"},{"key":"e_1_3_2_15_2","unstructured":"TrevinoA. Accent Classification using Neural Networks. OpenStax CNX. 15 Dec (2005)."},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"JiaoY. TuM. BerishaV. and LissJ. Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short-Term Features in Proceedings of the Annual Conference of the International Speech Communication Association INTERSPEECH vol. 0812-Sept no. September 9 (2016) pp. 2388\u20132392.","DOI":"10.21437\/Interspeech.2016-1148"},{"key":"e_1_3_2_17_2","unstructured":"HanY. and LeeK. Acoustic scene classification using convolutional neural network and multiple-width frequency-delta data augmentation arXiv preprint arXiv:1607.02383. 2016 Jul 8."},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","unstructured":"HumphreyD. BrookC. and MacDonaldA. Exposing audio data to the web: an API and prototype Proceedings of the 19th international conference on World Wide Web. ACM (2010).","DOI":"10.1145\/1772690.1772932"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.31578\/hum.v3i2.277"},{"key":"e_1_3_2_20_2","unstructured":"EnsslinA. GoorimoortheeT. CarletonS. BulitkoV. and HernandezS.P. Deep Learning for Speech Accent Detection in Videogames in Thirteenth Artificial Intelligence and Interactive Digital Entertainment Conference 2017 Sep 19."},{"key":"e_1_3_2_21_2","unstructured":"StefanD. New-Dialect Formation in Canada: Evidence from the English Modal Auxiliaries Amsterdam and Philadelphia: John Benjamins (2008)."},{"key":"e_1_3_2_22_2","unstructured":"YanQ. VaseghiS. RentzosD. HoC.-H. and TurajlicE. Analysis of acoustic correlates of British Australian and American accents in 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721). IEEE pp. 345\u2013350."}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179715","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-179715","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179715","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:41:22Z","timestamp":1777455682000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-179715"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,30]]},"references-count":21,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,5,29]]}},"alternative-id":["10.3233\/JIFS-179715"],"URL":"https:\/\/doi.org\/10.3233\/jifs-179715","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,30]]}}}