{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T20:02:38Z","timestamp":1780776158285,"version":"3.54.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,6,25]],"date-time":"2024-06-25T00:00:00Z","timestamp":1719273600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,25]],"date-time":"2024-06-25T00:00:00Z","timestamp":1719273600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"NTNU Norwegian University of Science and Technology"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J AUDIO SPEECH MUSIC PROC."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Dysarthria is a speech disorder that affects the ability to communicate due to articulation difficulties. This research proposes a novel method for automatic dysarthria detection (ADD) and automatic dysarthria severity level assessment (ADSLA) by using a variable continuous wavelet transform (CWT) layered convolutional neural network (CNN) model. To determine their efficiency, the proposed model is assessed using two distinct corpora, TORGO and UA-Speech, comprising both dysarthria patients and healthy subject speech signals. The research study explores the effectiveness of CWT-layered CNN models that employ different wavelets such as Amor, Morse, and Bump. The study aims to analyze the models\u2019 performance without the need for feature extraction, which could provide deeper insights into the effectiveness of the models in processing complex data. Also, raw waveform modeling preserves the original signal\u2019s integrity and nuance, making it ideal for applications like speech recognition, signal processing, and image processing. Extensive analysis and experimentation have revealed that the Amor wavelet surpasses the Morse and Bump wavelets in accurately representing signal characteristics. The Amor wavelet outperforms the others in terms of signal reconstruction fidelity, noise suppression capabilities, and feature extraction accuracy. The proposed CWT-layered CNN model emphasizes the importance of selecting the appropriate wavelet for signal-processing tasks. The Amor wavelet is a reliable and precise choice for applications. The UA-Speech dataset is crucial for more accurate dysarthria classification. Advanced deep learning techniques can simplify early intervention measures and expedite the diagnosis process.<\/jats:p>","DOI":"10.1186\/s13636-024-00357-3","type":"journal-article","created":{"date-parts":[[2024,6,25]],"date-time":"2024-06-25T19:00:30Z","timestamp":1719342030000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["Automatic dysarthria detection and severity level assessment using CWT-layered CNN model"],"prefix":"10.1186","volume":"2024","author":[{"given":"Shaik","family":"Sajiha","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kodali","family":"Radha","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dhulipalla","family":"Venkata\u00a0Rao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nammi","family":"Sneha","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Suryanarayana","family":"Gunnam","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9743-1701","authenticated-orcid":false,"given":"Durga Prasad","family":"Bavirisetti","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,6,25]]},"reference":[{"issue":"3","key":"357_CR1","doi-asserted-by":"publisher","first-page":"1323","DOI":"10.1007\/s00415-022-11464-6","volume":"270","author":"MJ Vansteensel","year":"2023","unstructured":"M.J. Vansteensel, E. Klein, G. van Thiel, M. Gaytant, Z. Simmons, J.R. Wolpaw, T.M. Vaughan, Towards clinical application of implantable brain-computer interfaces for people with late-stage ALS: Medical and ethical considerations. J. Neurol. 270(3), 1323\u20131336 (2023)","journal-title":"J. Neurol."},{"key":"357_CR2","doi-asserted-by":"crossref","unstructured":"S.M. Shabber, M. Bansal, K. Radha, in 2023 International Conference on Electrical, Electronics, Communication and Computers (ELEXCOM). Machine learning-assisted diagnosis of speech disorders: A review of dysarthric speech (IEEE, Roorkee, India, 2023), pp. 1\u20136","DOI":"10.1109\/ELEXCOM58812.2023.10370116"},{"key":"357_CR3","doi-asserted-by":"crossref","unstructured":"S.M. Shabber, M. Bansal, K. Radha, in 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT). A review and classification of amyotrophic lateral sclerosis with speech as a biomarker (IEEE, Delhi, India, 2023), pp. 1\u20137","DOI":"10.1109\/ICCCNT56998.2023.10308048"},{"issue":"3","key":"357_CR4","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1111\/1460-6984.12715","volume":"57","author":"M Carl","year":"2022","unstructured":"M. Carl, E.S. Levy, M. Icht, Speech treatment for hebrew-speaking adolescents and young adults with developmental dysarthria: A comparison of mSIT and Beatalk. Int. J. Lang. Commun. Disord. 57(3), 660\u2013679 (2022)","journal-title":"Int. J. Lang. Commun. Disord."},{"key":"357_CR5","unstructured":"V. Mendoza Ramos, The added value of speech technology in clinical care of patients with dysarthria. Ph.D. thesis, University of Antwerp (2022)"},{"key":"357_CR6","doi-asserted-by":"crossref","unstructured":"Z. Yue, E. Loweimi, H. Christensen, J. Barker, Z. Cvetkovic, in INTERSPEECH. Dysarthric speech recognition from raw waveform with parametric CNNs. (IEEE, Incheon, Korea, 2022), pp. 31\u201335","DOI":"10.21437\/Interspeech.2022-163"},{"issue":"4","key":"357_CR7","first-page":"791","volume":"9","author":"N Tavabi","year":"2022","unstructured":"N. Tavabi, D. St\u00fcck, A. Signorini, C. Karjadi, T. Al Hanai, M. Sandoval, C. Lemke, J. Glass, S. Hardy, M. Lavallee et al., Cognitive digital biomarkers from automated transcription of spoken language. J. Prev. Alzheimer Dis. 9(4), 791\u2013800 (2022)","journal-title":"J. Prev. Alzheimer Dis."},{"issue":"3","key":"357_CR8","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1007\/s10772-023-10039-8","volume":"26","author":"K Radha","year":"2023","unstructured":"K. Radha, M. Bansal, Towards modeling raw speech in gender identification of children using sincNet over ERB scale. Int. J. Speech Technol. 26(3), 651\u2013663 (2023)","journal-title":"Int. J. Speech Technol."},{"key":"357_CR9","doi-asserted-by":"crossref","unstructured":"J. Millet, N. Zeghidour, in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Learning to detect dysarthria from raw speech (IEEE, Brighton, UK, 2019), pp. 5831\u20135835","DOI":"10.1109\/ICASSP.2019.8682324"},{"key":"357_CR10","doi-asserted-by":"crossref","unstructured":"S. Sajiha, K. Radha, D.V. Rao, V. Akhila, N. Sneha, in 2024 National Conference on Communications (NCC). Dysarthria diagnosis and dysarthric speaker identification using raw speech model (IEEE, Chennai, India, 2024)","DOI":"10.1109\/NCC60321.2024.10485694"},{"issue":"10","key":"357_CR11","doi-asserted-by":"publisher","first-page":"6228","DOI":"10.1007\/s00034-023-02399-y","volume":"42","author":"K Radha","year":"2023","unstructured":"K. Radha, M. Bansal, Feature fusion and ablation analysis in gender identification of preschool children from spontaneous speech. Circ. Syst. Signal Process. 42(10), 6228\u20136252 (2023)","journal-title":"Circ. Syst. Signal Process."},{"issue":"10","key":"357_CR12","doi-asserted-by":"publisher","first-page":"1490","DOI":"10.3390\/e24101490","volume":"24","author":"K Radha","year":"2022","unstructured":"K. Radha, M. Bansal, Audio augmentation for non-native children\u2019s speech recognition through discriminative learning. Entropy 24(10), 1490 (2022)","journal-title":"Entropy"},{"key":"357_CR13","doi-asserted-by":"crossref","unstructured":"K. Radha, M. Bansal, S.M. Shabber, in 2022 2nd International Conference on Artificial Intelligence and Signal Processing (AISP). Accent classification of native and non-native children using harmonic pitch (IEEE, Amaravati, India, 2022), pp. 1\u20136","DOI":"10.1109\/AISP53593.2022.9760588"},{"key":"357_CR14","doi-asserted-by":"crossref","unstructured":"K. Radha, M. Bansal, R. Sharma, in 2023 10th International Conference on Signal Processing and Integrated Networks (SPIN). Whitening transformation of i-vectors in closed-set speaker verification of children (IEEE, Noida, India, 2023), pp. 243\u2013248","DOI":"10.1109\/SPIN57001.2023.10116604"},{"issue":"7","key":"357_CR15","doi-asserted-by":"publisher","first-page":"1483","DOI":"10.1016\/j.ymssp.2005.09.012","volume":"20","author":"AK Jardine","year":"2006","unstructured":"A.K. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mech. Syst. Signal Process. 20(7), 1483\u20131510 (2006)","journal-title":"Mech. Syst. Signal Process."},{"key":"357_CR16","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1109\/RBME.2020.3006860","volume":"14","author":"S Latif","year":"2020","unstructured":"S. Latif, J. Qadir, A. Qayyum, M. Usama, S. Younis, Speech technology for healthcare: Opportunities, challenges, and state of the art. IEEE Rev. Biomed. Eng. 14, 342\u2013356 (2020)","journal-title":"IEEE Rev. Biomed. Eng."},{"key":"357_CR17","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1016\/B978-0-444-52901-5.00022-8","volume":"110","author":"P Enderby","year":"2013","unstructured":"P. Enderby, Disorders of communication: Dysarthria. Handb. Clin. Neurol. 110, 273\u2013281 (2013)","journal-title":"Handb. Clin. Neurol."},{"key":"357_CR18","doi-asserted-by":"crossref","unstructured":"S.K. Maharana, A. Illa, R. Mannem, Y. Belur, P. Shetty, V.P. Kumar, S. Vengalil, K. Polavarapu, N. Atchayaram, P.K. Ghosh, in ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Acoustic-to-articulatory inversion for dysarthric speech by using cross-corpus acoustic-articulatory data (IEEE, Toronto, Ontario, Canada, 2021), pp. 6458\u20136462","DOI":"10.1109\/ICASSP39728.2021.9413625"},{"key":"357_CR19","doi-asserted-by":"crossref","unstructured":"B. Suhas, D. Patel, N.R. Koluguri, Y. Belur, P. Reddy, A. Nalini, R. Yadav, D. Gope, P.K. Ghosh, in INTERSPEECH. Comparison of speech tasks and recording devices for voice based automatic classification of healthy subjects and patients with amyotrophic lateral sclerosis. (IEEE, Graz, Austria, 2019), pp. 4564\u20134568","DOI":"10.21437\/Interspeech.2019-1285"},{"issue":"5","key":"357_CR20","doi-asserted-by":"publisher","first-page":"S46","DOI":"10.1044\/jshr.3905.s46","volume":"39","author":"KM Yorkston","year":"1996","unstructured":"K.M. Yorkston, Treatment efficacy: Dysarthria. J. Speech Lang. Hear. Res. 39(5), S46\u2013S57 (1996)","journal-title":"J. Speech Lang. Hear. Res."},{"issue":"2","key":"357_CR21","doi-asserted-by":"publisher","first-page":"390","DOI":"10.1109\/JSTSP.2019.2949912","volume":"14","author":"H Chandrashekar","year":"2019","unstructured":"H. Chandrashekar, V. Karjigi, N. Sreedevi, Spectro-temporal representation of speech for intelligibility assessment of dysarthria. IEEE J. Sel. Top. Signal Process. 14(2), 390\u2013399 (2019)","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"357_CR22","doi-asserted-by":"crossref","unstructured":"A. Hernandez, E.J. Yeo, S. Kim, M. Chung, in INTERSPEECH. Dysarthria detection and severity assessment using rhythm-based metrics. (IEEE, Shanghai, China, 2020), pp. 2897\u20132901","DOI":"10.21437\/Interspeech.2020-2354"},{"key":"357_CR23","doi-asserted-by":"publisher","unstructured":"K. Radha, M. Bansal, V.R. Dulipalla, Variable STFT layered CNN model for automated dysarthria detection and severity assessment using raw speech. Circ. Syst. Signal Process. 43, 3261\u20133278 (2024).\u00a0https:\/\/doi.org\/10.1007\/s00034-024-02611-7","DOI":"10.1007\/s00034-024-02611-7"},{"key":"357_CR24","doi-asserted-by":"publisher","first-page":"67745","DOI":"10.1109\/ACCESS.2020.2986171","volume":"8","author":"N Narendra","year":"2020","unstructured":"N. Narendra, P. Alku, Glottal source information for pathological voice detection. IEEE Access 8, 67745\u201367755 (2020)","journal-title":"IEEE Access"},{"key":"357_CR25","doi-asserted-by":"crossref","unstructured":"A. Kachhi, A. Therattil, P. Gupta, H.A. Patil, in International Conference on Speech and Computer. Continuous wavelet transform for severity-level classification of dysarthria (Springer, Gurugram, India, 2022), pp. 312\u2013324","DOI":"10.1007\/978-3-031-20980-2_27"},{"key":"357_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.specom.2022.12.004","volume":"147","author":"AA Joshy","year":"2023","unstructured":"A.A. Joshy, R. Rajan, Dysarthria severity classification using multi-head attention and multi-task learning. Speech Commun. 147, 1\u201311 (2023)","journal-title":"Speech Commun."},{"key":"357_CR27","doi-asserted-by":"crossref","unstructured":"C. Divakar, R. Harsha, K. Radha, D.V. Rao, N. Madhavi, T. Bharadwaj, in 2024 14th International Conference on Cloud Computing, Data Science & Engineering (Confluence). Explainable AI for CNN-LSTM network in PCG-based valvular heart disease diagnosis (IEEE, Noida, India, 2024), pp. 92\u201397","DOI":"10.1109\/Confluence60223.2024.10463207"},{"key":"357_CR28","doi-asserted-by":"crossref","unstructured":"K. Radha, D.V. Rao, K.V.K. Sai, R.T. Krishna, A. Muneera, in 2024 International Conference on Green Energy, Computing and Sustainable Technology (GECOST). Detecting autism spectrum disorder from raw speech in children using STFT layered CNN model (IEEE, Miri, Sarawak, Malaysia, 2024), pp. 437\u2013441","DOI":"10.1109\/GECOST60902.2024.10474705"},{"key":"357_CR29","doi-asserted-by":"publisher","unstructured":"K. Radha, M. Bansal, R. Sharma, Raw waveform-based custom scalogram CRNN in cardiac abnormality diagnosis. IEEE Access. 12, 13986\u201314004 (2024). https:\/\/doi.org\/10.1109\/ACCESS.2024.3356075","DOI":"10.1109\/ACCESS.2024.3356075"},{"key":"357_CR30","doi-asserted-by":"crossref","unstructured":"C. Bhat, B. Vachhani, S.K. Kopparapu, in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Automatic assessment of dysarthria severity level using audio descriptors (IEEE, New Orleans, USA, 2017), pp. 5070\u20135074","DOI":"10.1109\/ICASSP.2017.7953122"},{"key":"357_CR31","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1109\/LSP.2021.3050362","volume":"28","author":"J Fritsch","year":"2021","unstructured":"J. Fritsch, M. Magimai-Doss, Utterance verification-based dysarthric speech intelligibility assessment using phonetic posterior features. IEEE Signal Process. Lett. 28, 224\u2013228 (2021)","journal-title":"IEEE Signal Process. Lett."},{"key":"357_CR32","doi-asserted-by":"publisher","unstructured":"D. Korzekwa, R. Barra-Chicote, B. Kostek, T. Drugman, M. Lajszczak, in INTERSPEECH. Interpretable deep learning model for the detection and reconstruction of dysarthric speech. (IEEE, Graz, Austria, 2019), pp. 3890\u20133894. https:\/\/doi.org\/10.21437\/Interspeech.2019-1206","DOI":"10.21437\/Interspeech.2019-1206"},{"key":"357_CR33","doi-asserted-by":"crossref","unstructured":"P. Gupta, P.K. Chodingala, H.A. Patil, in 2022 30th European Signal Processing Conference (EUSIPCO). Morlet wavelet-based voice liveness detection using convolutional neural network (IEEE, Belgrade, Serbia, 2022), pp. 100\u2013104","DOI":"10.23919\/EUSIPCO55093.2022.9909835"},{"key":"357_CR34","unstructured":"P. Gupta, S. Gupta, H. Patil, in 9th International Conference on Pattern Recognition and Machine Intelligence. Voice liveness detection using bump wavelet with CNN (Springer, Kolkata, India, 2021)"},{"key":"357_CR35","doi-asserted-by":"publisher","first-page":"107661","DOI":"10.1016\/j.engappai.2023.107661","volume":"131","author":"K Radha","year":"2024","unstructured":"K. Radha, M. Bansal, R.B. Pachori, Speech and speaker recognition using raw waveform modeling for adult and children\u2019s speech: A comprehensive review. Eng. Appl. Artif. Intell. 131, 107661 (2024)","journal-title":"Eng. Appl. Artif. Intell."},{"key":"357_CR36","doi-asserted-by":"publisher","first-page":"103069","DOI":"10.1016\/j.specom.2024.103069","volume":"159","author":"K Radha","year":"2024","unstructured":"K. Radha, M. Bansal, R.B. Pachori, Automatic speaker and age identification of children from raw speech using sincNet over ERB scale. Speech Commun. 159, 103069 (2024)","journal-title":"Speech Commun."},{"key":"357_CR37","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1007\/s10579-011-9145-0","volume":"46","author":"F Rudzicz","year":"2012","unstructured":"F. Rudzicz, A.K. Namasivayam, T. Wolff, The TORGO database of acoustic and articulatory speech from speakers with dysarthria. Lang. Resour. Eval. 46, 523\u2013541 (2012)","journal-title":"Lang. Resour. Eval."},{"key":"357_CR38","doi-asserted-by":"crossref","unstructured":"H. Kim, M. Hasegawa-Johnson, A. Perlman, J.R. Gunderson, T.S. Huang, K.L. Watkin, S. Frame, in INTERSPEECH. Dysarthric speech database for universal access research, vol. 2008. (IEEE, Incheon, Korea, 2008), pp. 1741\u20131744","DOI":"10.21437\/Interspeech.2008-480"},{"key":"357_CR39","doi-asserted-by":"crossref","unstructured":"D.H. Shih, C.H. Liao, T.W. Wu, X.Y. Xu, M.H. Shih, Dysarthria speech detection using convolutional neural networks with gated recurrent unit. Healthcare 10(10), 1956 (2022)","DOI":"10.3390\/healthcare10101956"}],"container-title":["EURASIP Journal on Audio, Speech, and Music Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13636-024-00357-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13636-024-00357-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13636-024-00357-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,22]],"date-time":"2024-11-22T18:04:35Z","timestamp":1732298675000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmp-eurasipjournals.springeropen.com\/articles\/10.1186\/s13636-024-00357-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,25]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["357"],"URL":"https:\/\/doi.org\/10.1186\/s13636-024-00357-3","relation":{},"ISSN":["1687-4722"],"issn-type":[{"value":"1687-4722","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,25]]},"assertion":[{"value":"8 April 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 June 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"33"}}