{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T21:06:13Z","timestamp":1777583173929,"version":"3.51.4"},"reference-count":53,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2022,6,20]],"date-time":"2022-06-20T00:00:00Z","timestamp":1655683200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect the intelligibility, automatic recognition, or other attributes involved in speech technologies and telecommunications, among others. In such applications, it is essential to provide methods to enhance the signals to allow the understanding of the messages or adequate processing of the speech. For this purpose, during the past few decades, several techniques have been proposed and implemented for the abundance of possible conditions and applications. Recently, those methods based on deep learning seem to outperform previous proposals even on real-time processing. Among the new explorations found in the literature, the hybrid approaches have been presented as a possibility to extend the capacity of individual methods, and therefore increase their capacity for the applications. In this paper, we evaluate a hybrid approach that combines both deep learning and wavelet transformation. The extensive experimentation performed to select the proper wavelets and the training of neural networks allowed us to assess whether the hybrid approach is of benefit or not for the speech enhancement task under several types and levels of noise, providing relevant information for future implementations.<\/jats:p>","DOI":"10.3390\/computation10060102","type":"journal-article","created":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T01:43:27Z","timestamp":1655775807000},"page":"102","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3313-8324","authenticated-orcid":false,"given":"Michelle","family":"Guti\u00e9rrez-Mu\u00f1oz","sequence":"first","affiliation":[{"name":"Electrical Engineering Department, University of Costa Rica, San Jos\u00e9 11501-2060, Costa Rica"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6833-9938","authenticated-orcid":false,"given":"Marvin","family":"Coto-Jim\u00e9nez","sequence":"additional","affiliation":[{"name":"Electrical Engineering Department, University of Costa Rica, San Jos\u00e9 11501-2060, Costa Rica"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,6,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"012027","DOI":"10.1088\/1742-6596\/1627\/1\/012027","article-title":"Research on Speech Signal Denoising Algorithm Based on Wavelet Analysis","volume":"1627","author":"Tan","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Krishna, G., Tran, C., Yu, J., and Tewfik, A.H. (2019, January 12\u201317). Speech recognition with no speech or with noisy speech. Proceedings of the ICASSP 2019\u20142019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK.","DOI":"10.1109\/ICASSP.2019.8683453"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Meyer, B.T., Mallidi, S.H., Martinez, A.M.C., Pay\u00e1-Vay\u00e1, G., Kayser, H., and Hermansky, H. (2016, January 13\u201316). Performance monitoring for automatic speech recognition in noisy multi-channel environments. Proceedings of the 2016 IEEE Spoken Language Technology Workshop (SLT). IEEE, San Diego, CA, USA.","DOI":"10.1109\/SLT.2016.7846244"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Coto-Jimenez, M., Goddard-Close, J., Di Persia, L., and Rufiner, H.L. (2018, January 18\u201320). Hybrid speech enhancement with wiener filters and deep LSTM denoising autoencoders. Proceedings of the 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), San Carlos, Costa Rica.","DOI":"10.1109\/IWOBI.2018.8464132"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.bspc.2018.09.010","article-title":"Multi-objective learning based speech enhancement method to increase speech quality and intelligibility for hearing aid device users","volume":"48","author":"Lai","year":"2019","journal-title":"Biomed. Signal Process. Control"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Park, G., Cho, W., Kim, K.S., and Lee, S. (2020). Speech Enhancement for Hearing Aids with Deep Learning on Environmental Noises. Appl. Sci., 10.","DOI":"10.3390\/app10176077"},{"key":"ref_7","unstructured":"Kulkarni, D.S., Deshmukh, R.R., and Shrishrimal, P.P. (2016). A review of speech signal enhancement techniques. Int. J. Comput. Appl., 139."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chaudhari, A., and Dhonde, S. (2015, January 8\u201310). A review on speech enhancement techniques. Proceedings of the 2015 International Conference on Pervasive Computing (ICPC), Pune, India.","DOI":"10.1109\/PERVASIVE.2015.7087096"},{"key":"ref_9","unstructured":"Benesty, J., Makino, S., and Chen, J. (2005). Speech Enhancement, Springer Science & Business Media."},{"key":"ref_10","first-page":"1","article-title":"Different approaches of spectral subtraction method for enhancing the speech signal in noisy environments","volume":"2","author":"Fukane","year":"2011","journal-title":"Int. J. Sci. Eng. Res."},{"key":"ref_11","unstructured":"Evans, N.W., Mason, J.S., Liu, W.M., and Fauve, B. (2006, January 14\u201319). An assessment on the fundamental limitations of spectral subtraction. Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, Toulouse, France."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Liu, D., Smaragdis, P., and Kim, M. (2014, January 14\u201318). Experiments on deep learning for speech denoising. Proceedings of the Fifteenth Annual Conference of the International Speech Communication Association, Singapore.","DOI":"10.21437\/Interspeech.2014-574"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"982","DOI":"10.1109\/TASLP.2015.2416653","article-title":"Learning spectral mapping for speech dereverberation and denoising","volume":"23","author":"Han","year":"2015","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Coto-Jim\u00e9nez, M. (2018, January 22\u201327). Robustness of LSTM neural networks for the enhancement of spectral parameters in noisy speech signals. Proceedings of the Mexican International Conference on Artificial Intelligence, Guadalajara, Mexico.","DOI":"10.1007\/978-3-030-04497-8_19"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1007\/s10772-018-9516-7","article-title":"Study on processing of wavelet speech denoising in speech recognition system","volume":"21","author":"Zhong","year":"2018","journal-title":"Int. J. Speech Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1051","DOI":"10.1007\/s10772-019-09645-2","article-title":"A review of supervised learning algorithms for single channel speech enhancement","volume":"22","author":"Saleem","year":"2019","journal-title":"Int. J. Speech Technol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.specom.2020.04.002","article-title":"A review of multi-objective deep learning speech denoising methods","volume":"122","author":"Azarang","year":"2020","journal-title":"Speech Commun."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1016\/j.dsp.2012.06.011","article-title":"Wavelet based speech presence probability estimator for speech enhancement","volume":"22","author":"Lun","year":"2012","journal-title":"Digit. Signal Process."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Balaji, V., Sathiya Priya, J., Dinesh Kumar, J., and Karthi, S. (2021). Radial basis function neural network based speech enhancement system using SLANTLET transform through hybrid vector wiener filter. Inventive Communication and Computational Technologies, Springer.","DOI":"10.1007\/978-981-15-7345-3_61"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1007\/s10772-021-09830-2","article-title":"Performance measurement of a hybrid speech enhancement technique","volume":"24","author":"Bahadur","year":"2021","journal-title":"Int. J. Speech Technol."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lun, D.P.K., and Hsung, T.C. (June, January 30). Improved wavelet based a-priori SNR estimation for speech enhancement. Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, Paris, France.","DOI":"10.1109\/ISCAS.2010.5537182"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1620","DOI":"10.1016\/j.specom.2006.06.004","article-title":"Wavelet speech enhancement based on time\u2013scale adaptation","volume":"48","author":"Bahoura","year":"2006","journal-title":"Speech Commun."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.csl.2015.06.001","article-title":"Speech enhancement based on wavelet packet of an improved principal component analysis","volume":"35","author":"Bouzid","year":"2016","journal-title":"Comput. Speech Lang."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1504\/IJCVR.2019.098801","article-title":"Use of radial basis function network with discrete wavelet transform for speech enhancement","volume":"9","author":"Ram","year":"2019","journal-title":"Int. J. Comput. Vis. Robot."},{"key":"ref_25","first-page":"2","article-title":"Denoising speech signals by wavelet transform","volume":"6","author":"Mihov","year":"2009","journal-title":"Annu. J. Electron."},{"key":"ref_26","unstructured":"Chui, C.K. (2016). An Introduction to Wavelets, Elsevier."},{"key":"ref_27","first-page":"83","article-title":"Studies on implementation of Harr and Daubechies wavelet for denoising of speech signal","volume":"4","author":"Chavan","year":"2010","journal-title":"Int. J. Circuits Syst. Signal Process."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Priyadarshani, N., Marsland, S., Castro, I., and Punchihewa, A. (2016). Birdsong denoising using wavelets. PLoS ONE, 11.","DOI":"10.1371\/journal.pone.0146790"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Al-Qazzaz, N.K., Ali, S., Ahmad, S.A., Islam, M.S., and Ariff, M.I. (2014, January 8\u201310). Selection of mother wavelets thresholding methods in denoising multi-channel EEG signals during working memory task. Proceedings of the 2014 IEEE conference on biomedical engineering and sciences (IECBES), Miri, Sarawak, Malaysia.","DOI":"10.1109\/IECBES.2014.7047488"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1109\/MCAS.2009.932556","article-title":"A short introduction to wavelets and their applications","volume":"9","author":"Gargour","year":"2009","journal-title":"IEEE Circuits Syst. Mag."},{"key":"ref_31","unstructured":"Mallat, S. (2008). A Wavelet Tour of Signal Processing: The Sparse Way, Academic Press."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/5992.841791","article-title":"The what, how, and why of wavelet shrinkage denoising","volume":"2","author":"Taswell","year":"2000","journal-title":"Comput. Sci. Eng."},{"key":"ref_33","unstructured":"Donoho, D., and Johnstone, I. (1992). Ideal Spatial Adaptation via Wavelet Shrinkage. Biometrika. To Appear, Department of Statistics, Stanford University. Technical Report, Also Tech. Report."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1109\/18.382009","article-title":"De-noising by soft-thresholding","volume":"41","author":"Donoho","year":"1995","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_35","unstructured":"Xiu-min, Z., and Gui-tao, C. (2009, January 13\u201314). A novel de-noising method for heart sound signal using improved thresholding function in wavelet domain. Proceedings of the 2009 International Conference on Future BioMedical Information Engineering (FBIE), Sanya, China."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Oktar, M.A., Nibouche, M., and Baltaci, Y. (2016, January 16\u201319). Denoising speech by notch filter and wavelet thresholding in real time. Proceedings of the 2016 24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Turkey.","DOI":"10.1109\/SIU.2016.7495864"},{"key":"ref_37","first-page":"2040","article-title":"Performance analysis of wavelet thresholding methods in denoising of audio signals of some Indian Musical Instruments","volume":"4","author":"Verma","year":"2012","journal-title":"Int. J. Eng. Sci. Technol."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Valencia, D., Orejuela, D., Salazar, J., and Valencia, J. (30\u20132, January 30). Comparison analysis between rigrsure, sqtwolog, heursure and minimaxi techniques using hard and soft thresholding methods. Proceedings of the 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), Bucaramanga, Colombia.","DOI":"10.1109\/STSIVA.2016.7743309"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"9245","DOI":"10.1016\/j.jfranklin.2017.05.042","article-title":"An on-line orthogonal wavelet denoising algorithm for high-resolution surface scans","volume":"355","author":"Schimmack","year":"2018","journal-title":"J. Frankl. Inst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"10123","DOI":"10.1016\/j.jfranklin.2019.08.023","article-title":"A structural property of the wavelet packet transform method to localise incoherency of a signal","volume":"356","author":"Schimmack","year":"2019","journal-title":"J. Frankl. Inst."},{"key":"ref_41","unstructured":"Goodfellow, I., Bengio, Y., Courville, A., and Bengio, Y. (2016). Deep Learning, MIT Press."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"e00938","DOI":"10.1016\/j.heliyon.2018.e00938","article-title":"State-of-the-art in artificial neural network applications: A survey","volume":"4","author":"Abiodun","year":"2018","journal-title":"Heliyon"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"e12967","DOI":"10.1002\/2050-7038.12967","article-title":"Optimal BRA based electric demand prediction strategy considering instance-based learning of the forecast factors","volume":"31","author":"Waseem","year":"2021","journal-title":"Int. Trans. Electr. Energy Syst."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1109\/JSTSP.2019.2908700","article-title":"Deep learning for audio signal processing","volume":"13","author":"Purwins","year":"2019","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_46","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Westhausen, N.L., and Meyer, B.T. (2020, January 25\u201329). Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression. Proceedings of the Interspeech 2020, Shanghai, China.","DOI":"10.21437\/Interspeech.2020-2631"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Mercorelli, P. (2017). A Fault Detection and Data Reconciliation Algorithm in Technical Processes with the Help of Haar Wavelets Packets. Algorithms, 10.","DOI":"10.3390\/a10010013"},{"key":"ref_49","unstructured":"Kominek, J., and Black, A.W. (2004, January 20\u201322). The CMU Arctic speech databases. Proceedings of the Fifth ISCA Workshop on Speech Synthesis, Vienna, Austria."},{"key":"ref_50","unstructured":"Rix, A.W., Beerends, J.G., Hollier, M.P., and Hekstra, A.P. (2001, January 7\u201311). Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings (Cat. No. 01CH37221), Salt Lake City, UT, USA."},{"key":"ref_51","first-page":"755","article-title":"Perceptual Evaluation of Speech Quality (PESQ) The New ITU Standard for End-to-End Speech Quality Assessment Part I\u2013Time-Delay Compensation","volume":"50","author":"Rix","year":"2002","journal-title":"J. Audio Eng. Soc."},{"key":"ref_52","first-page":"8677043","article-title":"Denoising speech based on deep learning and wavelet decomposition","volume":"2021","author":"Wang","year":"2021","journal-title":"Sci. Program."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Gnanamanickam, J., Natarajan, Y., and KR, S.P. (2021). A hybrid speech enhancement algorithm for voice assistance application. Sensors, 21.","DOI":"10.3390\/s21217025"}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/10\/6\/102\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:35:50Z","timestamp":1760139350000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/10\/6\/102"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,20]]},"references-count":53,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2022,6]]}},"alternative-id":["computation10060102"],"URL":"https:\/\/doi.org\/10.3390\/computation10060102","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,20]]}}}