{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T23:10:20Z","timestamp":1771024220517,"version":"3.50.1"},"reference-count":65,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T00:00:00Z","timestamp":1662595200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T00:00:00Z","timestamp":1662595200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100012774","name":"Innovationsfonden","doi-asserted-by":"publisher","award":["9065-0004"],"award-info":[{"award-number":["9065-0004"]}],"id":[{"id":"10.13039\/100012774","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J AUDIO SPEECH MUSIC PROC."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper, we propose a supervised single-channel speech enhancement method that combines Kullback-Leibler (KL) divergence-based non-negative matrix factorization (NMF) and a hidden Markov model (NMF-HMM). With the integration of the HMM, the temporal dynamics information of speech signals can be taken into account. This method includes a training stage and an enhancement stage. In the training stage, the sum of the Poisson distribution, leading to the KL divergence measure, is used as the observation model for each state of the HMM. This ensures that a computationally efficient multiplicative update can be used for the parameter update of this model. In the online enhancement stage, a novel minimum mean square error estimator is proposed for the NMF-HMM. This estimator can be implemented using parallel computing, reducing the time complexity. 
Moreover, compared to the traditional NMF-based speech enhancement methods, the experimental results show that our proposed algorithm improved the short-time objective intelligibility and perceptual evaluation of speech quality by 5% and 0.18, respectively.<\/jats:p>","DOI":"10.1186\/s13636-022-00256-5","type":"journal-article","created":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T21:05:59Z","timestamp":1662671159000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A speech enhancement algorithm based on a non-negative hidden Markov model and Kullback-Leibler divergence"],"prefix":"10.1186","volume":"2022","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7120-5842","authenticated-orcid":false,"given":"Yang","family":"Xiang","sequence":"first","affiliation":[]},{"given":"Liming","family":"Shi","sequence":"additional","affiliation":[]},{"given":"Jesper Lisby","family":"H\u00f8jvang","sequence":"additional","affiliation":[]},{"given":"Morten H\u00f8jfeldt","family":"Rasmussen","sequence":"additional","affiliation":[]},{"given":"Mads Gr\u00e6sb\u00f8ll","family":"Christensen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,8]]},"reference":[{"issue":"4","key":"256_CR1","doi-asserted-by":"publisher","first-page":"745","DOI":"10.1109\/TASLP.2014.2304637","volume":"22","author":"J Li","year":"2014","unstructured":"J. Li, L. Deng, Y. Gong, R. Haeb-Umbach, An overview of noise-robust automatic speech recognition. IEEE\/ACM Trans. Audio Speech Lang. Process. 22(4), 745\u2013777 (2014)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR2","doi-asserted-by":"publisher","DOI":"10.1201\/b14529","volume-title":"Speech Enhancement: Theory and Practice","author":"PC Loizou","year":"2013","unstructured":"P.C. 
Loizou, Speech Enhancement: Theory and Practice (CRC Press, Boca Raton, 2013)"},{"issue":"1","key":"256_CR3","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1109\/LSP.2013.2291240","volume":"21","author":"Y Xu","year":"2013","unstructured":"Y. Xu, J. Du, L.-R. Dai, C.-H. Lee, An experimental study on speech enhancement based on deep neural networks. IEEE Signal Process. Lett. 21(1), 65\u201368 (2013)","journal-title":"IEEE Signal Process. Lett."},{"key":"256_CR4","doi-asserted-by":"crossref","unstructured":"I. Cohen, S. Gannot, in Springer Handbook of Speech Processing. Spectral enhancement methods\u00a0(Springer, Berlin, Heidelberg,\u00a02008) p. 873\u2013902","DOI":"10.1007\/978-3-540-49127-9_44"},{"issue":"2","key":"256_CR5","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1109\/TASSP.1979.1163209","volume":"27","author":"S Boll","year":"1979","unstructured":"S. Boll, Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. 27(2), 113\u2013120 (1979)","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"256_CR6","doi-asserted-by":"crossref","unstructured":"K.B. Christensen, M.G. Christensen, J.B. Boldt, F. Gran, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Experimental study of generalized subspace filters for the cocktail party situation (IEEE,\u00a0Shanghai,\u00a02016), p. 420\u2013424","DOI":"10.1109\/ICASSP.2016.7471709"},{"issue":"4","key":"256_CR7","doi-asserted-by":"publisher","first-page":"631","DOI":"10.1109\/TASLP.2015.2505416","volume":"24","author":"JR Jensen","year":"2015","unstructured":"J.R. Jensen, J. Benesty, M.G. Christensen, Noise reduction with optimal variable span linear filters. IEEE\/ACM Trans. Audio Speech Lang. Process. 24(4), 631\u2013644 (2015)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. 
Process."},{"issue":"4","key":"256_CR8","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1109\/89.397090","volume":"3","author":"Y Ephraim","year":"1995","unstructured":"Y. Ephraim, H.L. Van Trees, A signal subspace approach for speech enhancement. IEEE Trans. Speech Audio Process. 3(4), 251\u2013266 (1995)","journal-title":"IEEE Trans. Speech Audio Process."},{"issue":"6","key":"256_CR9","doi-asserted-by":"publisher","first-page":"700","DOI":"10.1109\/TSA.2003.818031","volume":"11","author":"F Jabloun","year":"2003","unstructured":"F. Jabloun, B. Champagne, Incorporating the human hearing properties in the signal subspace approach for speech enhancement. IEEE Trans. Speech Audio Process. 11(6), 700\u2013708 (2003)","journal-title":"IEEE Trans. Speech Audio Process."},{"issue":"3","key":"256_CR10","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1109\/TASSP.1978.1163086","volume":"26","author":"J Lim","year":"1978","unstructured":"J. Lim, A. Oppenheim, All-pole modeling of degraded speech. IEEE Trans. Acoust. Speech Signal Process. 26(3), 197\u2013210 (1978)","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"issue":"6","key":"256_CR11","doi-asserted-by":"publisher","first-page":"1109","DOI":"10.1109\/TASSP.1984.1164453","volume":"32","author":"Y Ephraim","year":"1984","unstructured":"Y. Ephraim, D. Malah, Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 32(6), 1109\u20131121 (1984)","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"issue":"2","key":"256_CR12","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1109\/TASSP.1985.1164550","volume":"33","author":"Y Ephraim","year":"1985","unstructured":"Y. Ephraim, D. Malah, Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 33(2), 443\u2013445 (1985)","journal-title":"IEEE Trans. Acoust. 
Speech Signal Process."},{"key":"256_CR13","doi-asserted-by":"crossref","unstructured":"A. Hussain, M. Chetouani, S. Squartini, A. Bastari, F. Piazza, in Progress in nonlinear speech processing. An overview, Nonlinear speech enhancement (Springer, Berlin, Heidelberg,\u00a02007), p. 217\u2013248","DOI":"10.1007\/978-3-540-71505-4_12"},{"issue":"1","key":"256_CR14","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1109\/TASLP.2014.2364452","volume":"23","author":"Y Xu","year":"2014","unstructured":"Y. Xu, J. Du, L.-R. Dai, C.-H. Lee, A regression approach to speech enhancement based on deep neural networks. IEEE\/ACM Trans. Audio Speech Lang. Process. 23(1), 7\u201319 (2014)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"1","key":"256_CR15","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1109\/TASLP.2018.2872128","volume":"27","author":"MS Kavalekalam","year":"2018","unstructured":"M.S. Kavalekalam, J.K. Nielsen, J.B. Boldt, M.G. Christensen, Model-based speech enhancement for intelligibility improvement in binaural hearing aids. IEEE\/ACM Trans. Audio Speech Lang. Process. 27(1), 99\u2013113 (2018)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"2","key":"256_CR16","doi-asserted-by":"publisher","first-page":"441","DOI":"10.1109\/TASL.2006.881696","volume":"15","author":"S Srinivasan","year":"2007","unstructured":"S. Srinivasan, J. Samuelsson, W.B. Kleijn, Codebook-based bayesian speech enhancement for nonstationary environments. IEEE Trans. Audio Speech Lang. Process. 15(2), 441\u2013452 (2007)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"256_CR17","doi-asserted-by":"crossref","unstructured":"M.S. Kavalekalam, J.K. Nielsen, L. Shi, M.G. Christensen, J. Boldt, in Proc. European Signal Processing Conf. Online parametric NMF for speech enhancement (IEEE, Rome,\u00a02018), p. 
2320\u20132324","DOI":"10.23919\/EUSIPCO.2018.8553039"},{"issue":"3","key":"256_CR18","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1109\/TASLP.2016.2636445","volume":"25","author":"Q He","year":"2016","unstructured":"Q. He, F. Bao, C. Bao, Multiplicative update of auto-regressive gains for codebook-based speech enhancement. IEEE\/ACM Trans. Audio Speech Lang. Process. 25(3), 457\u2013468 (2016)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"3","key":"256_CR19","doi-asserted-by":"publisher","first-page":"882","DOI":"10.1109\/TASL.2006.885256","volume":"15","author":"DY Zhao","year":"2007","unstructured":"D.Y. Zhao, W.B. Kleijn, HMM-based gain modeling for enhancement of speech in noise. IEEE Trans. Audio Speech Lang. Process. 15(3), 882\u2013892 (2007)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"11","key":"256_CR20","doi-asserted-by":"publisher","first-page":"1973","DOI":"10.1109\/TASLP.2015.2458585","volume":"23","author":"F Deng","year":"2015","unstructured":"F. Deng, C. Bao, W.B. Kleijn, Sparse hidden Markov models for speech enhancement in non-stationary noise environments. IEEE\/ACM Trans. Audio Speech Lang. Process. 23(11), 1973\u20131987 (2015)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR21","doi-asserted-by":"crossref","unstructured":"Y. Bengio et al., Learning deep architectures for AI. Found. Trends\u00ae Mach. Learn. 2(1), 1\u2013127 (2009)","DOI":"10.1561\/2200000006"},{"issue":"7","key":"256_CR22","doi-asserted-by":"publisher","first-page":"1527","DOI":"10.1162\/neco.2006.18.7.1527","volume":"18","author":"GE Hinton","year":"2006","unstructured":"G.E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 
18(7), 1527\u20131554 (2006)","journal-title":"Neural Comput."},{"issue":"10","key":"256_CR23","doi-asserted-by":"publisher","first-page":"1702","DOI":"10.1109\/TASLP.2018.2842159","volume":"26","author":"D Wang","year":"2018","unstructured":"D. Wang, J. Chen, Supervised speech separation based on deep learning: an overview. IEEE\/ACM Trans. Audio Speech Lang. Process. 26(10), 1702\u20131726 (2018)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"12","key":"256_CR24","doi-asserted-by":"publisher","first-page":"1849","DOI":"10.1109\/TASLP.2014.2352935","volume":"22","author":"Y Wang","year":"2014","unstructured":"Y. Wang, A. Narayanan, D. Wang, On training targets for supervised speech separation. IEEE\/ACM Trans. Audio Speech Lang. Process. 22(12), 1849\u20131858 (2014)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR25","doi-asserted-by":"crossref","unstructured":"A. Narayanan, D. Wang, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Ideal ratio mask estimation using deep neural networks for robust speech recognition (IEEE,\u00a0Vancouver,\u00a02013), p. 7092\u20137096","DOI":"10.1109\/ICASSP.2013.6639038"},{"key":"256_CR26","doi-asserted-by":"crossref","unstructured":"S.R. Park, J. Lee, A fully convolutional neural network for speech enhancement. arXiv preprint arXiv:1609.07132. (2016)","DOI":"10.21437\/Interspeech.2017-1465"},{"issue":"6","key":"256_CR27","doi-asserted-by":"publisher","first-page":"1223","DOI":"10.1162\/0899766053630350","volume":"17","author":"H Jacobsson","year":"2005","unstructured":"H. Jacobsson, Rule extraction from recurrent neural networks: A taxonomy and review. Neural Comput. 17(6), 1223\u20131263 (2005)","journal-title":"Neural Comput."},{"issue":"12","key":"256_CR28","doi-asserted-by":"publisher","first-page":"2136","DOI":"10.1109\/TASLP.2015.2468583","volume":"23","author":"P-S Huang","year":"2015","unstructured":"P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. 
Smaragdis, Joint optimization of masks and deep recurrent neural networks for monaural source separation. IEEE\/ACM Trans. Audio Speech Lang. Process. 23(12), 2136\u20132147 (2015)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR29","unstructured":"I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al.,\u00a0in Proc. Advances in Neural Inform. Process. Syst. Generative adversarial nets (Communications of the ACM, US,\u00a02014), p. 2672\u20132680"},{"key":"256_CR30","doi-asserted-by":"crossref","unstructured":"S. Pascual, A. Bonafonte, J. Serra, Segan: Speech enhancement generative adversarial network. arXiv preprint\u00a0arXiv:1703.09452. (2017)","DOI":"10.21437\/Interspeech.2017-1428"},{"issue":"1","key":"256_CR31","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1109\/TASLP.2016.2628641","volume":"25","author":"M Kolb\u00e6k","year":"2016","unstructured":"M. Kolb\u00e6k, Z.-H. Tan, J. Jensen, Speech intelligibility potential of general and specialized deep neural network based speech enhancement systems. IEEE\/ACM Trans. Audio Speech Lang. Process. 25(1), 153\u2013167 (2016)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR32","doi-asserted-by":"publisher","first-page":"1826","DOI":"10.1109\/TASLP.2020.2997118","volume":"28","author":"Y Xiang","year":"2020","unstructured":"Y. Xiang, C. Bao, A parallel-data-free speech enhancement method using multi-objective learning cycle-consistent generative adversarial network. IEEE\/ACM Trans. Audio Speech Lang. Process. 28, 1826\u20131838 (2020)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"6755","key":"256_CR33","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1038\/44565","volume":"401","author":"DD Lee","year":"1999","unstructured":"D.D. Lee, H.S. Seung, Learning the parts of objects by non-negative matrix factorization. Nature. 
401(6755), 788\u2013791 (1999)","journal-title":"Nature."},{"key":"256_CR34","unstructured":"D.D. Lee, H.S. Seung, in Proc. Advances in Neural Inform. Process. Syst. Algorithms for non-negative matrix factorization (Communications of the ACM, US,\u00a02001), p. 556\u2013562"},{"issue":"5","key":"256_CR35","doi-asserted-by":"publisher","first-page":"960","DOI":"10.1109\/TASLP.2019.2907015","volume":"27","author":"K Shimada","year":"2019","unstructured":"K. Shimada, Y. Bando, M. Mimura, K. Itoyama, K. Yoshii, T. Kawahara, Unsupervised speech enhancement based on multichannel nmf-informed beamforming for noise-robust automatic speech recognition. IEEE\/ACM Trans. Audio Speech Lang. Process. 27(5), 960\u2013971 (2019)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR36","doi-asserted-by":"crossref","unstructured":"E.M. Grais, H. Erdogan, in Int. Conf. Digital Signal Process. Single channel speech music separation using nonnegative matrix factorization and spectral masks (IEEE, Corfu,\u00a02011), p. 1\u20136","DOI":"10.21437\/Interspeech.2011-498"},{"key":"256_CR37","doi-asserted-by":"crossref","unstructured":"K.W. Wilson, B. Raj, P. Smaragdis, in Proc Interspeech. Regularized non-negative matrix factorization with temporal dependencies for speech denoising (ICSA, Brisbane,\u00a02008)","DOI":"10.21437\/Interspeech.2008-49"},{"key":"256_CR38","doi-asserted-by":"crossref","unstructured":"S. Nie, S. Liang, H. Li, X. Zhang, Z. Yang, W.J. Liu, L.K. Dong, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Exploiting spectro-temporal structures using NMF for DNN-based supervised speech separation (IEEE,\u00a0Shanghai,\u00a02016), p. 469\u2013473","DOI":"10.1109\/ICASSP.2016.7471719"},{"issue":"2","key":"256_CR39","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1109\/LSP.2014.2354456","volume":"22","author":"TG Kang","year":"2014","unstructured":"T.G. Kang, K. Kwon, J.W. Shin, N.S. 
Kim, NMF-based target source separation using deep neural network. IEEE Signal Process. Lett. 22(2), 229\u2013233 (2014)","journal-title":"IEEE Signal Process. Lett."},{"issue":"11","key":"256_CR40","doi-asserted-by":"publisher","first-page":"2043","DOI":"10.1109\/TASLP.2018.2851151","volume":"26","author":"S Nie","year":"2018","unstructured":"S. Nie, S. Liang, W. Liu, X. Zhang, J. Tao, Deep learning based speech separation via nmf-style reconstructions. IEEE\/ACM Trans. Audio Speech Lang. Process. 26(11), 2043\u20132055 (2018)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"256_CR41","doi-asserted-by":"crossref","unstructured":"A.W. Rix, J.G. Beerends, M.P. Hollier, A.P. Hekstra, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, vol. 2 (IEEE, Salt Lake City,\u00a02001), p. 749\u2013752","DOI":"10.1109\/ICASSP.2001.941023"},{"issue":"7","key":"256_CR42","doi-asserted-by":"publisher","first-page":"2125","DOI":"10.1109\/TASL.2011.2114881","volume":"19","author":"CH Taal","year":"2011","unstructured":"C.H. Taal, R.C. Hendriks, R. Heusdens, J. Jensen, An algorithm for intelligibility prediction of time-frequency weighted noisy speech. IEEE Trans. Audio Speech Lang. Process. 19(7), 2125\u20132136 (2011)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"256_CR43","doi-asserted-by":"crossref","unstructured":"T.T. Vu, B. Bigot, E.S. Chng, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Combining non-negative matrix factorization and deep neural networks for speech enhancement and automatic speech recognition (IEEE, Shanghai,\u00a02016), p. 499\u2013503","DOI":"10.1109\/ICASSP.2016.7471725"},{"issue":"10","key":"256_CR44","doi-asserted-by":"publisher","first-page":"2140","DOI":"10.1109\/TASL.2013.2270369","volume":"21","author":"N Mohammadiha","year":"2013","unstructured":"N. Mohammadiha, P. 
Smaragdis, A. Leijon, Supervised and unsupervised speech enhancement using nonnegative matrix factorization. IEEE Trans. Audio Speech Lang. Process. 21(10), 2140\u20132151 (2013)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"256_CR45","doi-asserted-by":"crossref","unstructured":"G.J. Mysore, P. Smaragdis, B. Raj, in International conference on latent variable analysis and signal separation. Non-negative hidden Markov modeling of audio with application to source separation (Springer,\u00a0Malo,\u00a02010), p. 140\u2013148","DOI":"10.1007\/978-3-642-15995-4_18"},{"key":"256_CR46","doi-asserted-by":"crossref","unstructured":"Z. Wang, X. Li, X. Wang, Q. Fu, Y. Yan, in Proc. Interspeech. A DNN-HMM approach to non-negative matrix factorization based speech enhancement (ICSA, Pittsburgh,\u00a02016), p. 3763\u20133767","DOI":"10.21437\/Interspeech.2016-147"},{"key":"256_CR47","doi-asserted-by":"crossref","unstructured":"Y. Xiang, L. Shi, J.L. H\u00f8jvang, M.H. Rasmussen, M.G. Christensen, in Proc. Interspeech. An NMF-HMM speech enhancement method based on Kullback-Leibler divergence (ICSA, Shanghai, 2020), p. 2667\u20132671","DOI":"10.21437\/Interspeech.2020-1047"},{"key":"256_CR48","doi-asserted-by":"crossref","unstructured":"Y. Xiang, L. Shi, J.L. H\u00f8jvang, M.H. Rasmussen, M.G. Christensen, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. A novel NMF-HMM speech enhancement algorithm based on poisson mixture model (IEEE,\u00a0Toronto,\u00a02021), p. 721\u2013725","DOI":"10.1109\/ICASSP39728.2021.9414620"},{"key":"256_CR49","doi-asserted-by":"crossref","unstructured":"C. F\u00e9votte, J. Le\u00a0Roux, J.R. Hershey, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Non-negative dynamical system with application to speech and audio (IEEE,\u00a0Vancouver,\u00a02013), p. 
3158\u20133162","DOI":"10.1109\/ICASSP.2013.6638240"},{"issue":"3","key":"256_CR50","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1162\/neco.2008.04-08-771","volume":"21","author":"C F\u00e9votte","year":"2009","unstructured":"C. F\u00e9votte, N. Bertin, J.-L. Durrieu, Nonnegative matrix factorization with the itakura-saito divergence: with application to music analysis. Neural Comput. 21(3), 793\u2013830 (2009)","journal-title":"Neural Comput."},{"key":"256_CR51","doi-asserted-by":"crossref","unstructured":"C. F\u00e9votte, J. Idier, Algorithms for nonnegative matrix factorization with the \u03b2-divergence. Neural Comput. 23(9), 2421\u20132456 (2011)","DOI":"10.1162\/NECO_a_00168"},{"key":"256_CR52","doi-asserted-by":"crossref","unstructured":"D. FitzGerald, M. Cranitch, E. Coyle, On the use of the beta divergence for musical source separation (IET digital library, Dublin,\u00a02009)","DOI":"10.1049\/cp.2009.1711"},{"key":"256_CR53","doi-asserted-by":"crossref","unstructured":"A.T. Cemgil, Bayesian inference for nonnegative matrix factorisation models. Computational intelligence and neuroscience. 2009,\u00a01\u201317 (2009)","DOI":"10.1155\/2009\/785152"},{"key":"256_CR54","doi-asserted-by":"crossref","unstructured":"D. Baby, J.F. Gemmeke, T. Virtanen, et al., in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. Exemplar-based speech enhancement for deep neural network based automatic speech recognition (IEEE,\u00a0South Brisbane,\u00a02015), p. 4485\u20134489","DOI":"10.1109\/ICASSP.2015.7178819"},{"issue":"1","key":"256_CR55","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TASL.2006.876726","volume":"15","author":"P Smaragdis","year":"2006","unstructured":"P. Smaragdis, Convolutive speech bases and their application to supervised speech separation. IEEE Trans. Audio Speech Lang. Process. 15(1), 1\u201312 (2006)","journal-title":"IEEE Trans. Audio Speech Lang. 
Process."},{"issue":"1","key":"256_CR56","first-page":"1","volume":"3","author":"LE Baum","year":"1972","unstructured":"L.E. Baum, An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes. Inequalities. 3(1), 1\u20138 (1972)","journal-title":"Inequalities."},{"key":"256_CR57","unstructured":"I.-T. Recommendation, Perceptual evaluation of speech quality (PESQ): an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs. Rec. ITU-T P (IEEE, US,\u00a02001), p. 862"},{"key":"256_CR58","doi-asserted-by":"crossref","unstructured":"J.S. Garofolo, L.F. Lamel, W.M. Fisher, J.G. Fiscus, D.S. Pallett, DARPA TIMIT acoustic-phonetic continous speech corpus CD-ROM. NIST speech disc 1-1.1. NASA STI\/Recon technical report n. 93, (1993)","DOI":"10.6028\/NIST.IR.4930"},{"issue":"8","key":"256_CR59","doi-asserted-by":"publisher","first-page":"2067","DOI":"10.1109\/TASL.2010.2041110","volume":"18","author":"G Hu","year":"2010","unstructured":"G. Hu, D. Wang, A tandem algorithm for pitch estimation and voiced speech segregation. IEEE Trans. Audio Speech Lang. Process. 18(8), 2067\u20132079 (2010)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"issue":"3","key":"256_CR60","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1016\/0167-6393(93)90095-3","volume":"12","author":"A Varga","year":"1993","unstructured":"A. Varga, H.J. Steeneken, Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Commun 12(3), 247\u2013251 (1993)","journal-title":"Speech Commun"},{"issue":"11","key":"256_CR61","doi-asserted-by":"publisher","first-page":"2403","DOI":"10.1016\/S0165-1684(01)00128-1","volume":"81","author":"I Cohen","year":"2001","unstructured":"I. Cohen, B. Berdugo, Speech enhancement for non-stationary noise environments. Signal Process. 
81(11), 2403\u20132418 (2001)","journal-title":"Signal Process."},{"issue":"5","key":"256_CR62","doi-asserted-by":"publisher","first-page":"466","DOI":"10.1109\/TSA.2003.811544","volume":"11","author":"I Cohen","year":"2003","unstructured":"I. Cohen, Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging. IEEE Trans. Speech Audio Process. 11(5), 466\u2013475 (2003)","journal-title":"IEEE Trans. Speech Audio Process."},{"issue":"1\u20133","key":"256_CR63","doi-asserted-by":"publisher","first-page":"88","DOI":"10.1016\/j.neucom.2008.01.033","volume":"72","author":"PD O\u2019grady","year":"2008","unstructured":"P.D. O\u2019grady, B.A. Pearlmutter, Discovering speech phones using convolutive non-negative matrix factorisation with a sparseness constraint. Neurocomputing 72(1\u20133), 88\u2013101 (2008)","journal-title":"Neurocomputing"},{"key":"256_CR64","doi-asserted-by":"crossref","unstructured":"S. Braun, I. Tashev, in International Conference on Speech and Computer. Data augmentation and loss normalization for deep noise suppression (Springer,\u00a0Petersburg,\u00a02020), p. 79\u201386","DOI":"10.1007\/978-3-030-60276-5_8"},{"issue":"4","key":"256_CR65","doi-asserted-by":"publisher","first-page":"1383","DOI":"10.1109\/TASL.2011.2180896","volume":"20","author":"T Gerkmann","year":"2011","unstructured":"T. Gerkmann, R.C. Hendriks, Unbiased MMSE-based noise power estimation with low complexity and low tracking delay. IEEE Trans. Audio Speech Lang. Process. 20(4), 1383\u20131393 (2011)","journal-title":"IEEE Trans. Audio Speech Lang. 
Process."}],"container-title":["EURASIP Journal on Audio, Speech, and Music Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13636-022-00256-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13636-022-00256-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13636-022-00256-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T14:11:29Z","timestamp":1727964689000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmp-eurasipjournals.springeropen.com\/articles\/10.1186\/s13636-022-00256-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,8]]},"references-count":65,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["256"],"URL":"https:\/\/doi.org\/10.1186\/s13636-022-00256-5","relation":{},"ISSN":["1687-4722"],"issn-type":[{"value":"1687-4722","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,8]]},"assertion":[{"value":"24 January 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 August 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 September 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"22"}}