{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T07:11:33Z","timestamp":1768893093454,"version":"3.49.0"},"reference-count":66,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T00:00:00Z","timestamp":1760054400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Science and Higher Education of the Republic of Kazakhstan","award":["IRN AP AP23488624"],"award-info":[{"award-number":["IRN AP AP23488624"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>With the rapid development of artificial intelligence and machine learning technologies, automatic speech recognition (ASR) and text-to-speech (TTS) have become key components of the digital transformation of society. The Kazakh language, as a representative of the Turkic language family, remains a low-resource language with limited audio corpora, language models, and high-quality speech synthesis systems. This study provides a comprehensive analysis of existing speech recognition and synthesis models, emphasizing their applicability and adaptation to the Kazakh language. Special attention is given to linguistic and technical barriers, including the agglutinative structure, rich vowel system, and phonemic variability. Both open-source and commercial solutions were evaluated, including Whisper, GPT-4 Transcribe, ElevenLabs, OpenAI TTS, Voiser, KazakhTTS2, and TurkicTTS. Speech recognition systems were assessed using BLEU, WER, TER, chrF, and COMET, while speech synthesis was evaluated with MCD, PESQ, STOI, and DNSMOS, thus covering both lexical\u2013semantic and acoustic\u2013perceptual characteristics. 
The results demonstrate that, for speech-to-text (STT), the strongest performance was achieved by Soyle on domain-specific data (BLEU 74.93, WER 18.61), while Voiser showed balanced accuracy (WER 40.65\u201337.11, chrF 80.88\u201384.51) and GPT-4 Transcribe achieved robust semantic preservation (COMET up to 1.02). In contrast, Whisper performed weakest (WER 77.10, BLEU 13.22), requiring further adaptation for Kazakh. For text-to-speech (TTS), KazakhTTS2 delivered the most natural perceptual quality (DNSMOS 8.79\u20138.96), while OpenAI TTS achieved the best spectral accuracy (MCD 123.44\u2013117.11, PESQ 1.14). TurkicTTS offered reliable intelligibility (STOI 0.15, PESQ 1.16), and ElevenLabs produced natural but less spectrally accurate speech.<\/jats:p>","DOI":"10.3390\/info16100879","type":"journal-article","created":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:50:16Z","timestamp":1760107816000},"page":"879","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Speech Recognition and Synthesis Models and Platforms for the Kazakh Language"],"prefix":"10.3390","volume":"16","author":[{"given":"Aidana","family":"Karibayeva","sequence":"first","affiliation":[{"name":"Information Systems Department, Faculty of Information Technology and Artificial Intelligence, Farabi University, Almaty 050040, Kazakhstan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8768-0349","authenticated-orcid":false,"given":"Vladislav","family":"Karyukin","sequence":"additional","affiliation":[{"name":"Information Systems Department, Faculty of Information Technology and Artificial Intelligence, Farabi University, Almaty 050040, Kazakhstan"}]},{"given":"Balzhan","family":"Abduali","sequence":"additional","affiliation":[{"name":"Information Systems Department, Faculty of Information Technology and Artificial Intelligence, Farabi University, Almaty 050040, 
Kazakhstan"}]},{"given":"Dina","family":"Amirova","sequence":"additional","affiliation":[{"name":"Information Systems Department, Faculty of Information Technology and Artificial Intelligence, Farabi University, Almaty 050040, Kazakhstan"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Vacher, M., Aman, F., Rossato, S., and Portet, F. (2015). Development of Automatic Speech Recognition Techniques for Elderly Home Support: Applications and Challenges. Lecture Notes in Computer Science, Proceedings of the International Conference on Human Aspects of IT for the Aged Population, Springer.","DOI":"10.1007\/978-3-319-20913-5_32"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Bekarystankyzy, A., Mamyrbayev, O., Mendes, M., Fazylzhanova, A., and Assam, M. (2024). Multilingual end-to-end ASR for low-resource Turkic languages with common alphabets. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-64848-1"},{"key":"ref_3","first-page":"741","article-title":"Inferring the Complete Set of Kazakh Endings as a Language Resource","volume":"Volume 1287","author":"Hernes","year":"2020","journal-title":"Advances in Computational Collective Intelligence: Proceedings of the 12th International Conference, ICCCI 2020, Da Nang, Vietnam, 30 November\u20133 December 2020"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1832403","DOI":"10.1080\/23311916.2020.1856500","article-title":"Morphological Segmentation Method for Turkic Language Neural Machine Translation","volume":"7","author":"Tukeyev","year":"2020","journal-title":"Cogent Eng."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1007\/978-3-030-88081-1_48","article-title":"Universal Programs for Stemming, Segmentation, Morphological Analysis of Turkic Words","volume":"Volume 12876","author":"Nguyen","year":"2021","journal-title":"Computational Collective Intelligence: Proceedings of the 
International Conference (ICCCI 2021), Rhodes, Greece, 29 September\u20131 October 2021"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Tukeyev, U., Gabdullina, N., Karipbayeva, N., Abdurakhmonova, N., Balabekova, T., and Karibayeva, A. (2024, January 15\u201317). Computational Model of Morphology and Stemming of Uzbek Words on Complete Set of Endings. Proceedings of the 2024 IEEE 3rd International Conference on Problems of Informatics, Electronics and Radio Engineering (PIERE), Novosibirsk, Russia.","DOI":"10.1109\/PIERE62470.2024.10805062"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kadyrbek, N., Mansurova, M., Shomanov, A., and Makharova, G. (2023). The Development of a Kazakh Speech Recognition Model Using a Convolutional Neural Network with Fixed Character Level Filters. Big Data Cogn. Comput., 7.","DOI":"10.3390\/bdcc7030132"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yeshpanov, R., Mussakhojayeva, S., and Khassanov, Y. (2023, January 20\u201324). Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration. Proceedings of the INTERSPEECH, 2023, Dublin, Ireland.","DOI":"10.21437\/Interspeech.2023-249"},{"key":"ref_9","unstructured":"Mussakhojayeva, S., Janaliyeva, A., Mirzakhmetov, A., Khassanov, Y., and Varol, H.A. (September, January 30). KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset. Proceedings of the INTERSPEECH, Brno, Czechia."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"5880","DOI":"10.30534\/ijatcse\/2020\/249942020","article-title":"Development of Automatic Speech Recognition for Kazakh Language Using Transfer Learning","volume":"9","author":"Kuanyshbay","year":"2020","journal-title":"Int. J. Adv. Trends Comput. Sci. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Orken, M., Dina, O., Keylan, A., Tolganay, T., and Mohamed, O. (2022). A study of transformer-based end-to-end speech recognition system for Kazakh language. Sci. 
Rep., 12.","DOI":"10.1038\/s41598-022-12260-y"},{"key":"ref_12","first-page":"201","article-title":"Automatic Speech Recognition: A Survey of Deep Learning Techniques and Approaches","volume":"7","author":"Ahlawat","year":"2025","journal-title":"Int. J. Cogn. Comput. Eng."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Rosenberg, A., Zhang, Y., Ramabhadran, B., Jia, Y., Moreno, P., Wu, Y., and Wu, Z. (2019, January 14\u201318). Speech recognition with augmented synthesized speech. Proceedings of the 2019 IEEE Automatic Speech Recognition and Understanding Workshop, Singapore.","DOI":"10.1109\/ASRU46091.2019.9003990"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhang, C., Li, B., Sainath, T., Strohman, T., Mavandadi, S., Chang, S.-Y., and Haghani, P. (2022, January 18\u201322). Streaming end-to-end multilingual speech recognition with joint language identification. Proceedings of the INTERSPEECH, 2022, Incheon, Korea.","DOI":"10.21437\/Interspeech.2022-11249"},{"key":"ref_15","unstructured":"Zhang, Y., Han, W., Qin, J., Wang, Y., Bapna, A., Chen, Z., Chen, N., Li, B., Axelrod, V., and Wang, G. (2023). Google USM: Scaling automatic speech recognition beyond 100 languages. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1186\/s13636-024-00349-3","article-title":"Exploration of Whisper Fine-Tuning Strategies for Low-Resource ASR","volume":"2024","author":"Liu","year":"2024","journal-title":"EURASIP J. Audio Speech Music Process."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Metze, F., Gandhe, A., Miao, Y., Sheikh, Z., Wang, Y., Xu, D., Zhang, H., Kim, J., Lane, I., and Lee, W.K. (2015, January 19\u201324). Semi-supervised training in low-resource ASR and KWS. 
Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia.","DOI":"10.1109\/ICASSP.2015.7178862"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Du, W., Maimaitiyiming, Y., Nijat, M., Li, L., Hamdulla, A., and Wang, D. (2023). Automatic Speech Recognition for Uyghur, Kazakh, and Kyrgyz: An Overview. Appl. Sci., 13.","DOI":"10.3390\/app13010326"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Mukhamadiyev, A., Mukhiddinov, M., Khujayarov, I., Ochilov, M., and Cho, J. (2023). Development of Language Models for Continuous Uzbek Speech Recognition System. Sensors, 23.","DOI":"10.3390\/s23031145"},{"key":"ref_20","unstructured":"Veitsman, Y., and Hartmann, M. (2025, January 19\u201320). Recent Advancements and Challenges of Turkic Central Asian Language Processing. Proceedings of the First Workshop on Language Models for Low-Resource Languages, Abu Dhabi, United Arab Emirates."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Oyucu, S. (2023). A Novel End-to-End Turkish Text-to-Speech (TTS) System via Deep Learning. Electronics, 12.","DOI":"10.3390\/electronics12081900"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Polat, H., Turan, A.K., Ko\u00e7ak, C., and Ula\u015f, H.B. (2024). Implementation of a Whisper Architecture-Based Turkish ASR System and Evaluation of Fine-Tuning with LoRA Adapter. Electronics, 13.","DOI":"10.3390\/electronics13214227"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Musaev, M., Mussakhojayeva, S., Khujayorov, I., Khassanov, Y., Ochilov, M., and Atakan Varol, H. (2021). USC: An open-source Uzbek speech corpus and initial speech recognition experiments. Speech and Computer, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-030-87802-3_40"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Mussakhojayeva, S., Dauletbek, K., Yeshpanov, R., and Varol, H.A. (2023). 
Multilingual Speech Recognition for Turkic Languages. Information, 14.","DOI":"10.3390\/info14020074"},{"key":"ref_25","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention Is All You Need. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gulati, A., Qin, J., Chiu, C.C., Parmar, N., Zhang, Y., Yu, J., Han, W., Wang, S., Zhang, Z., and Wu, Y. (2020, January 25\u201329). Conformer: Convolution-augmented Transformer for Speech Recognition. Proceedings of the INTERSPEECH, Shanghai, China.","DOI":"10.21437\/Interspeech.2020-3015"},{"key":"ref_27","unstructured":"Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C., and Sutskever, I. (2022). Robust Speech Recognition via Large-Scale Weak Supervision. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Watanabe, S., Hori, T., Karita, S., Hayashi, T., Nishitoba, J., Unno, Y., Soplin, N.E., Heymann, J., Wiesner, M., and Chen, N. (2018). ESPnet: End-to-End Speech Processing Toolkit. arXiv.","DOI":"10.21437\/Interspeech.2018-1456"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzm\u00e1n, F., Grave, E., Ott, M., Zettlemoyer, L., and Stoyanov, V. (2020, January 19). Unsupervised Cross-lingual Representation Learning at Scale. Proceedings of the ACL, Online.","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"ref_30","unstructured":"Hu, E.J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv."},{"key":"ref_31","unstructured":"(2025, June 10). ESPnet Toolkit. Available online: https:\/\/github.com\/espnet\/espnet."},{"key":"ref_32","unstructured":"Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motl\u00ed\u010dek, P., Qian, Y., and Schwarz, P. (2011, January 11\u201315). 
The Kaldi speech recognition toolkit. Proceedings of the ASRU, Hilton Waikoloa Village Resort, Waikoloa, HI, USA."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020, January 16\u201320). Transformers: State-of-the-art Natural Language Processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Shen, J., Pang, R., Weiss, R.J., Schuster, M., Jaitly, N., Yang, Z., Chen, Z., Zhang, Y., Wang, Y., and Skerry-Ryan, R. (2018, January 15\u201320). Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada.","DOI":"10.1109\/ICASSP.2018.8461368"},{"key":"ref_35","unstructured":"Ren, Y., Hu, C., Tan, X., Qin, T., Zhao, S., Zhao, Z., and Liu, T.Y. (2020). FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. arXiv."},{"key":"ref_36","unstructured":"Kong, J., Kim, J., and Bae, J. (2020). HiFi-GAN: Generative Adversarial Network for Efficient and High Fidelity Speech Synthesis. arXiv."},{"key":"ref_37","first-page":"62","article-title":"Kazakh Speech and Recognition Methods: Error Analysis and Improvement Prospects","volume":"20","author":"Karabaliyev","year":"2024","journal-title":"Sci. J. Astana IT Univ."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Rakhimova, D., Duisenbekkyzy, Z., and Adali, E. (2025). Investigation of ASR Models for Low-Resource Kazakh Child Speech: Corpus Development, Model Adaptation, and Evaluation. Appl. 
Sci., 15.","DOI":"10.3390\/app15168989"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Khassanov, Y., Mussakhojayeva, S., Mirzakhmetov, A., Adiyev, A., Nurpeiissov, M., and Varol, H.A. (2021, January 19\u201323). A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL, Online.","DOI":"10.18653\/v1\/2021.eacl-main.58"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Kozhirbayev, Z., and Islamgozhayev, T. (2023). Cascade Speech Translation for the Kazakh Language. Appl. Sci., 13.","DOI":"10.3390\/app13158900"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1016\/j.procs.2023.12.219","article-title":"Speech recognition for Kazakh language: A research paper","volume":"231","author":"Kapyshev","year":"2024","journal-title":"Procedia Comput. Sci."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Mussakhojayeva, S., Gilmullin, R., Khakimov, B., Galimov, M., Orel, D., Abilbekov, A., and Varol, H.A. (2024, January 19\u201322). Noise-Robust Multilingual Speech Recognition and the Tatar Speech Corpus. Proceedings of the 2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Osaka, Japan.","DOI":"10.1109\/ICAIIC60209.2024.10463419"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Mussakhojayeva, S., Khassanov, Y., and Varol, H.A. (2022, January 18\u201322). KSC2: An industrial-scale open-source Kazakh speech corpus. Proceedings of the INTERSPEECH, Incheon, Korea.","DOI":"10.21437\/Interspeech.2022-421"},{"key":"ref_44","unstructured":"(2025, June 10). Common Voice. Available online: https:\/\/commonvoice.mozilla.org\/ru\/datasets."},{"key":"ref_45","unstructured":"(2025, June 10). KazakhTTS. 
Available online: https:\/\/github.com\/IS2AI\/Kazakh_TTS."},{"key":"ref_46","unstructured":"(2025, June 10). Kazakh Speech Corpus. Available online: https:\/\/www.openslr.org\/102\/."},{"key":"ref_47","unstructured":"(2025, June 10). Kazakh Speech Dataset. Available online: https:\/\/www.openslr.org\/140\/."},{"key":"ref_48","unstructured":"(2025, September 10). ISSAI. Available online: https:\/\/github.com\/IS2AI\/."},{"key":"ref_49","unstructured":"(2025, June 02). Whisper. Available online: https:\/\/github.com\/openai\/whisper."},{"key":"ref_50","unstructured":"(2025, July 02). GPT-4o-transcribe (OpenAI). Available online: https:\/\/platform.openai.com\/docs\/models\/gpt-4o-transcribe."},{"key":"ref_51","unstructured":"(2025, June 02). Soyle. Available online: https:\/\/github.com\/IS2AI\/Soyle."},{"key":"ref_52","unstructured":"(2025, June 20). ElevenLabs Scribe. Available online: https:\/\/elevenlabs.io\/docs\/capabilities\/speech-to-text."},{"key":"ref_53","unstructured":"(2025, June 30). Voiser. Available online: https:\/\/voiser.net\/."},{"key":"ref_54","unstructured":"(2025, June 10). MMS (Massively Multilingual Speech). Available online: https:\/\/github.com\/facebookresearch\/fairseq\/tree\/main\/examples\/mms."},{"key":"ref_55","unstructured":"(2025, June 12). TurkicTTS. Available online: https:\/\/github.com\/IS2AI\/TurkicTTS."},{"key":"ref_56","unstructured":"(2025, July 02). ElevenLabs TTS. Available online: https:\/\/elevenlabs.io\/docs\/capabilities\/text-to-speech."},{"key":"ref_57","unstructured":"(2025, June 30). OpenAI TTS. Available online: https:\/\/platform.openai.com\/docs\/guides\/text-to-speech."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Papineni, K., Roukos, S., Ward, T., and Zhu, W.J. (2002, January 6\u201312). BLEU: A Method for Automatic Evaluation of Machine Translation. 
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.","DOI":"10.3115\/1073083.1073135"},{"key":"ref_59","unstructured":"Gillick, L., and Cox, S. (1989, January 23\u201326). Some Statistical Issues in the Comparison of Speech Recognition Algorithms. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Glasgow, UK."},{"key":"ref_60","unstructured":"Snover, M., Dorr, B., Schwartz, R., Micciulla, L., and Makhoul, J. (2006, January 8\u201312). A Study of Translation Edit Rate with Targeted Human Annotation. Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, Cambridge, MA, USA. Available online: https:\/\/aclanthology.org\/2006.amta-papers.25\/."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Popovi\u0107, M. (2015, January 17\u201318). chrF: Character n-gram F-score for automatic MT evaluation. Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal.","DOI":"10.18653\/v1\/W15-3049"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Rei, R., Farinha, A.C., and Martins, A.F.T. (2020, January 16\u201320). COMET: A Neural Framework for MT Evaluation. Proceedings of the EMNLP, Online.","DOI":"10.18653\/v1\/2020.emnlp-main.213"},{"key":"ref_63","unstructured":"Kubichek, R. (1993, January 19\u201321). Mel-cepstral distance measure for objective speech quality assessment. Proceedings of the IEEE Pacific Rim Conference on Communications Computers and Signal Processing, Victoria, BC, Canada."},{"key":"ref_64","unstructured":"Rix, A.W., Beerends, J.G., Hollier, M.P., and Hekstra, A.P. (2001, January 7\u201311). Perceptual evaluation of speech quality (PESQ). Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 
01CH37221), Salt Lake City, UT, USA."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"2125","DOI":"10.1109\/TASL.2011.2114881","article-title":"An Algorithm for Intelligibility Prediction of Time\u2013Frequency Weighted Noisy Speech","volume":"19","author":"Taal","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Reddy, C.K., Gopal, V., and Cutler, R. (2020, January 6\u201311). DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors. Proceedings of the ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9414878"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/879\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:19:53Z","timestamp":1760109593000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/879"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,10]]},"references-count":66,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["info16100879"],"URL":"https:\/\/doi.org\/10.3390\/info16100879","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,10]]}}}