{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T06:37:55Z","timestamp":1777963075448,"version":"3.51.4"},"reference-count":29,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2022,2,17]],"date-time":"2022-02-17T00:00:00Z","timestamp":1645056000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Chulalongkorn University Technology Center","award":["N\/A"],"award-info":[{"award-number":["N\/A"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The Montreal cognitive assessment (MoCA), a widely accepted screening tool for identifying patients with mild cognitive impairment (MCI), includes a language fluency test of verbal functioning; its scores are based on the number of unique correct words produced by the test taker. However, it is possible that unique words may be counted differently for various languages. This study focuses on Thai as a language that differs from English in terms of word combinations. We applied various automatic speech recognition (ASR) techniques to develop an assisted scoring system for the MoCA language fluency test with Thai language support. This was a challenge because Thai is a low-resource language for which domain-specific data are not publicly available, especially speech data from patients with MCIs. Furthermore, the great variety of pronunciation, intonation, tone, and accent of the patients, all of which might differ from healthy controls, bring more complexity to the model. We propose a hybrid time delay neural network hidden Markov model (TDNN-HMM) architecture for acoustic model training to create our ASR system that is robust to environmental noise and to the variation of voice quality impacted by MCI. The LOTUS Thai speech corpus was incorporated into the training set to improve the model\u2019s generalization. A preprocessing algorithm was implemented to reduce the background noise and improve the overall data quality before feeding data into the TDNN-HMM system for automatic word detection and language fluency score calculation. The results show that the TDNN-HMM model in combination with data augmentation using lattice-free maximum mutual information (LF-MMI) objective function provides a word error rate (WER) of 30.77%. To our knowledge, this is the first study to develop an ASR with Thai language support to automate the scoring system of MoCA\u2019s language fluency assessment.<\/jats:p>","DOI":"10.3390\/s22041583","type":"journal-article","created":{"date-parts":[[2022,2,17]],"date-time":"2022-02-17T20:26:41Z","timestamp":1645129601000},"page":"1583","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Using Automatic Speech Recognition to Assess Thai Speech Language Fluency in the Montreal Cognitive Assessment (MoCA)"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0800-6349","authenticated-orcid":false,"given":"Pimarn","family":"Kantithammakorn","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Proadpran","family":"Punyabukkana","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ploy N.","family":"Pratanwanich","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand"},{"name":"Chula Intelligent and Complex Systems Research Unit, Chulalongkorn University, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Solaphat","family":"Hemrungrojn","sequence":"additional","affiliation":[{"name":"Department of Psychiatry, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand"},{"name":"Cognitive Fitness and Biopsychological Technology Research Unit, Chulalongkorn University, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chaipat","family":"Chunharas","sequence":"additional","affiliation":[{"name":"Cognitive Clinical & Computational Neuroscience Research Unit, Department of Internal Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand"},{"name":"Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4007-1124","authenticated-orcid":false,"given":"Dittaya","family":"Wanvarie","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,2,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1093\/bmb\/ldp033","article-title":"Age-associated cognitive decline","volume":"92","author":"Deary","year":"2009","journal-title":"Br. Med. Bull."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.4103\/tcmj.tcmj_18_17","article-title":"The borderland between normal aging and dementia","volume":"29","author":"Lo","year":"2017","journal-title":"Tzu Chi Med. J."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1016\/j.amjmed.2018.01.022","article-title":"Dementia","volume":"131","author":"Gale","year":"2018","journal-title":"Am. J. Med."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1177\/1715163517690745","article-title":"Dementia","volume":"150","author":"Duong","year":"2017","journal-title":"Can. Pharm. J."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1007\/s11920-012-0291-x","article-title":"Mild Cognitive Impairment in Older Adults","volume":"14","author":"Geda","year":"2012","journal-title":"Curr. Psychiatry Rep."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1985","DOI":"10.1001\/archneur.58.12.1985","article-title":"Current Concepts in Mild Cognitive Impairment","volume":"58","author":"Petersen","year":"2001","journal-title":"Arch. Neurol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1016\/0022-3956(75)90026-6","article-title":"Mini-mental state. A practical method for grading the cognitive state of patients for the clinician","volume":"12","author":"Folstein","year":"1975","journal-title":"J. Psychiatr. Res."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1111\/j.1532-5415.2005.53221.x","article-title":"The Montreal Cognitive Assessment, MoCA: A Brief Screening Tool for Mild Cognitive Impairment","volume":"53","author":"Nasreddine","year":"2005","journal-title":"J. Am. Geriatr. Soc."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1080\/13554794.2011.608365","article-title":"A comparison of screening tools for the assessment of Mild Cognitive Impairment: Preliminary findings","volume":"18","author":"Ahmed","year":"2012","journal-title":"Neurocase"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.specom.2006.10.004","article-title":"Thai speech processing technology: A review","volume":"49","author":"Wutiwiwatchai","year":"2007","journal-title":"Speech Commun."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1109\/MAHC.2009.5","article-title":"Computers and the Thai Language","volume":"31","author":"Koanantakool","year":"2009","journal-title":"IEEE Ann. Hist. Comput."},{"key":"ref_12","unstructured":"Chaiwongsai, J., Chiracharit, W., Chamnongthai, K., and Miyanaga, Y. (2009, January 8\u201311). An architecture of HMM-based isolated-word speech recognition with tone detection function. Proceedings of the 2008 International Symposium on Intelligent Signal Processing and Communications Systems, Bangkok, Thailand."},{"key":"ref_13","first-page":"112","article-title":"Automatic speech analysis for the assessment of patients with predementia and Alzheimer\u2019s disease","volume":"1","author":"Satt","year":"2015","journal-title":"Alzheimer\u2019s Dementia: Diagn. Assess. Dis. Monit."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhou, L., Fraser, K.C., and Rudzicz, F. (2016, January 8\u201312). Speech Recognition in Alzheimer\u2019s Disease and in its Assessment. Proceedings of the Interspeech 2016, San Francisco, CA, USA.","DOI":"10.21437\/Interspeech.2016-1228"},{"key":"ref_15","unstructured":"Povey, D., Boulianne, G., Burget, L., Motlicek, P., and Schwarz, P. (2011, January 11\u201315). The Kaldi Speech Recognition. Proceedings of the IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, Waikoloa, HI, USA. Available online: http:\/\/kaldi.sf.net\/."},{"key":"ref_16","first-page":"772","article-title":"What do verbal fluency tasks measure? Predictors of verbal fluency performance in older adults","volume":"5","author":"Eshao","year":"2014","journal-title":"Front. Psychol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.specom.2015.09.010","article-title":"Using automatic speech recognition to assess spoken responses to cognitive tests of semantic verbal fluency","volume":"75","author":"Pakhomov","year":"2015","journal-title":"Speech Commun."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Tr\u00f6ger, J., Linz, N., K\u00f6nig, A., Robert, P., and Alexandersson, J. (2018, January 21\u201324). Telephone-based Dementia Screening I. Proceedings of the 12th EAI International Conference on Pervasive Computing Technologies for Healthcare, New York, NY, USA.","DOI":"10.1145\/3240925.3240943"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lauraitis, A., Maskeli\u016bnas, R., Dama\u0161evi\u010dius, R., and Krilavi\u010dius, T. (2020). A Mobile Application for Smart Computer-Aided Self-Administered Testing of Cognition, Speech, and Motor Impairment. Sensors, 20.","DOI":"10.3390\/s20113236"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"13254","DOI":"10.1016\/j.eswa.2011.04.142","article-title":"Phoneme and tonal accent recognition for Thai speech","volume":"38","author":"Chansareewittaya","year":"2011","journal-title":"Expert Syst. Appl."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hu, X., Saiko, M., and Hori, C. (2014, January 9\u201312). Incorporating tone features to convolutional neural network to improve Mandarin\/Thai speech recognition. Proceedings of the Signal and Information Processing Association Annual Summit and Conference (APSIPA), Kuala Lumpur, Malaysia.","DOI":"10.1109\/APSIPA.2014.7041576"},{"key":"ref_22","unstructured":"(2022, January 01). RNNoise\u2014Recurrent Neural Network for Audio. Available online: https:\/\/github.com\/xiph\/rnnoise."},{"key":"ref_23","unstructured":"Kasuriya, S., Sornlertlamvanich, V., Cotsomrong, P., Kanokphara, S., and Thatphithakkul, N. (2003, January 1\u20133). Thai Speech Corpus for Speech Recognition. Proceedings of the Oriental COCOSDA, Singapore."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Juang, B.H., and Rabiner, L.R. (2005). Automatic Speech Recognition\u2014A Brief History of the Technology Development, Atlanta Rutgers University and the University of California.","DOI":"10.1016\/B0-08-044854-2\/00906-8"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/MSP.2012.2205597","article-title":"Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups","volume":"29","author":"Hinton","year":"2012","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ghahremani, P., Baba Ali, B., Povey, D., Riedhammer, K., Trmal, J., and Khudanpur, S. (2014, January 4\u20139). A pitch extraction algorithm tuned for automatic speech recognition. Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy.","DOI":"10.1109\/ICASSP.2014.6854049"},{"key":"ref_27","unstructured":"Anastasakos, T., McDonough, J., and Makhoul, J. (1997, January 21\u201324). Speaker adaptive training: A maximum likelihood approach to speaker normalization. Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1109\/29.21701","article-title":"Phoneme recognition using time-delay neural networks","volume":"37","author":"Waibel","year":"1989","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Hadian, H., Sameti, H., Povey, D., and Khudanpur, S. (2018, January 2\u20136). End-to-end Speech Recognition Using Lattice-free MMI. Proceedings of the 19th Annual Conference of the International Speech Communication Association, Hyderabad, India.","DOI":"10.21437\/Interspeech.2018-1423"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/4\/1583\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:21:50Z","timestamp":1760134910000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/4\/1583"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,17]]},"references-count":29,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,2]]}},"alternative-id":["s22041583"],"URL":"https:\/\/doi.org\/10.3390\/s22041583","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,17]]}}}