{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T16:12:22Z","timestamp":1781021542588,"version":"3.54.1"},"reference-count":145,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2018,4,24]],"date-time":"2018-04-24T00:00:00Z","timestamp":1524528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Huawei Technologies Co. Ltd"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2018,9,30]]},"abstract":"<jats:p>Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition but still remains an important challenge. Data-driven supervised approaches, especially the ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks. In the meanwhile, we discuss the pros and cons of these approaches and provide their experimental results on benchmark databases. 
We expect that this overview can facilitate the development of the robustness of speech recognition systems in acoustic noisy environments.<\/jats:p>","DOI":"10.1145\/3178115","type":"journal-article","created":{"date-parts":[[2018,4,25]],"date-time":"2018-04-25T12:22:17Z","timestamp":1524658937000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":264,"title":["Deep Learning for Environmentally Robust Speech Recognition"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8487-0561","authenticated-orcid":false,"given":"Zixing","family":"Zhang","sequence":"first","affiliation":[{"name":"Imperial College London, London, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"J\u00fcrgen","family":"Geiger","sequence":"additional","affiliation":[{"name":"Huawei Technologies Duesseldorf GmbH, Munich, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jouni","family":"Pohjalainen","sequence":"additional","affiliation":[{"name":"University of Passau, Passau, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Amr El-Desoky","family":"Mousa","sequence":"additional","affiliation":[{"name":"University of Passau, Passau, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wenyu","family":"Jin","sequence":"additional","affiliation":[{"name":"Huawei Technologies Duesseldorf GmbH, Munich, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bj\u00f6rn","family":"Schuller","sequence":"additional","affiliation":[{"name":"Imperial College London, London, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2018,4,24]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","volume-title":"Acoustical and Environmental Robustness in Automatic Speech Recognition","author":"Acero 
Alex","DOI":"10.1007\/978-1-4615-3122-7"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201916)","author":"Amodei Dario","year":"2016"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2006.889720"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASRU.2015.7404837"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2012.10.004"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1979.1163209"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/11677482_3"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201915)","author":"Chen Zhuo"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-4012"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1987.1165054"},{"key":"e_1_2_1_11_1","volume-title":"Bharath","author":"Creswell Antonia","year":"2017"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2134090"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2010.2064307"},{"key":"e_1_2_1_14_1","volume-title":"Robust Speech Recognition of Uncertain or Missing Data","author":"Deng Li"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1984.1164453"},{"key":"e_1_2_1_16_1","first-page":"2","article-title":"Speech enhancement using a minimum mean-square error log-spectral amplitude estimator","volume":"23","author":"Ephraim Yariv","year":"1985","journal-title":"IEEE Trans. Acoust. Speech Sign. 
Process."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the CHiME-4 Workshop.","author":"Erdogan Hakan","year":"2016"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178061"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-552"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6853900"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178797"},{"key":"e_1_2_1_22_1","first-page":"2","article-title":"Maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains","volume":"2","author":"Gauvain J.-L.","year":"1994","journal-title":"IEEE Trans. Speech Aud. Process."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2014-229"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the REVERB Workshop, Held in Conjunction with ICASSP 2014 and HSCMA 2014. 1--8.","author":"Geiger J\u00fcrgen","year":"2014"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2318514"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2014-151"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178925"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-6393(94)00059-J"},{"key":"e_1_2_1_29_1","volume-title":"Deep Learning","author":"Goodfellow Ian"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems (NIPS\u201914)","author":"Goodfellow Ian","year":"2014"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201916)","author":"Grais E. M. G."},{"key":"e_1_2_1_32_1","volume-title":"Generating sequences with recurrent neural networks. 
arXiv:1308.0850 (Aug","author":"Graves Alex","year":"2013"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/TASLP.2015.2416653","article-title":"Learning spectral mapping for speech dereverberation and denoising","volume":"23","author":"Han Kun","year":"2015","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.80901"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the International Conference on Spoken Language Processing (ICSLP\u201998)","author":"John H."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASRU.2015.7404829"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7471664"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 4th International Workshop on Speech Processing in Everyday Environments (CHiME\u201916)","author":"Heymann Jahn","year":"2016"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1127647"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2005-263"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201915)","author":"Hoshen Yedid"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.911054"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6853860"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2015.2468583"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2013-267"},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association 
(INTERSPEECH\u201914)","author":"Karanasou Penny"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2012.2189389"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13634-016-0306-6"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472634"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.4.541"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1038\/44565"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472782"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7953157"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1006\/csla.1995.0010"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-173"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2304637"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2974024"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6854663"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1201\/b14529"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2013-130"},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201912)","author":"Maas Andrew L."},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems (NIPS\u201916)","author":"Mao Xiaojiao","year":"2016"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/89.668818"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2003.818212"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952160"},{"key":"e_1_2_1_68_1","volume-title":"Proceedings of the 4th International Workshop on Speech Processing in Everyday Environments 
(CHiME\u201916)","author":"Menne Tobias","year":"2016"},{"key":"e_1_2_1_69_1","volume-title":"Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology. 459--462","author":"Mestre Xavier","year":"2003"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-1620"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-388"},{"key":"e_1_2_1_72_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201915)","author":"Mirsamadi Seyedmahdad"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_2_1_74_1","volume-title":"Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC\u201900)","author":"Moreno Asunci\u00f3n","year":"2000"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639038"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6854051"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2372314"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML\u201917)","author":"Ochiai Tsubasa"},{"key":"e_1_2_1_79_1","volume-title":"Wavenet: A generative model for raw audio. arXiv:1609.03499 (Sep.","author":"van den Oord Aaron","year":"2016"},{"key":"e_1_2_1_80_1","unstructured":"ITU-T Recommendation P.862. 2001. 
Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs."},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2010.12.003"},{"key":"e_1_2_1_82_1","volume-title":"A fully convolutional neural network for speech enhancement. arXiv:1609.07132 (Sep","author":"Park Se Rim","year":"2016"},{"key":"e_1_2_1_83_1","volume-title":"SEGAN: Speech enhancement generative adversarial network. arXiv:1703.09452 (Mar.","author":"Pascual Santiago","year":"2017"},{"key":"e_1_2_1_84_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201900)","author":"Pearce David","year":"2000"},{"key":"e_1_2_1_85_1","unstructured":"David Pearce and J. Picone. 2002. Aurora Working Group: DSR Front End LVCSR Evaluation AU\/384\/02. Institute for Signal & Information Processing Mississippi State University Tech. Rep (2002)."},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2014-572"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-1672"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2602884"},{"key":"e_1_2_1_89_1","volume-title":"Proceedings of the CHiME-4 Workshop.","author":"Qian Yanmin","year":"2016"},{"key":"e_1_2_1_90_1","volume-title":"Thomas Pinkney Barnwell, and Mark A. Clements","author":"Quackenbush Schuyler R.","year":"1988"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7953084"},{"key":"e_1_2_1_92_1","volume-title":"A Wavenet for speech denoising. 
arXiv:1706.07162 (June","author":"Rethage Dario","year":"2017"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178838"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2017.2672401"},{"key":"e_1_2_1_96_1","volume-title":"The IBM 2016 english conversational telephone speech recognition system. In Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201916)","author":"Saon George"},{"key":"e_1_2_1_97_1","volume-title":"Advances in Speech Recognition","author":"Schalkwyk Johan"},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1145\/2991468"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2010.5495567"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639100"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2000.859160"},{"key":"e_1_2_1_102_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2006.09.003"},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASRU.2013.6707744"},{"key":"e_1_2_1_104_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2014.2325781"},{"key":"e_1_2_1_105_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472669"},{"key":"e_1_2_1_106_1","doi-asserted-by":"publisher","DOI":"10.1109\/53.665"},{"key":"e_1_2_1_107_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6637622"},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2005.858005"},{"key":"e_1_2_1_109_1","volume-title":"Jon Barker, and Ricard Marxer.","author":"Vincent Emmanuel","year":"2016"},{"key":"e_1_2_1_110_1","doi-asserted-by":"crossref","volume-title":"Techniques for Noise Robustness in Automatic Speech Recognition","author":"Virtanen Tuomas","DOI":"10.1002\/9781118392683"},{"key":"e_1_2_1_111_1","volume-title":"On Ideal Binary 
Mask As the Computational Goal of Auditory Scene Analysis","author":"Wang DeLiang"},{"key":"e_1_2_1_112_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2352935"},{"key":"e_1_2_1_113_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2013.2250961"},{"key":"e_1_2_1_114_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178800"},{"key":"e_1_2_1_115_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2528171"},{"key":"e_1_2_1_116_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.898454"},{"key":"e_1_2_1_117_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-22482-4_11"},{"key":"e_1_2_1_118_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6854294"},{"key":"e_1_2_1_119_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2012.6287817"},{"key":"e_1_2_1_120_1","volume-title":"Proceedings of the 2nd CHiME Workshop on Machine Listening in Multisource Environments. 86--90","author":"Weninger Felix","year":"2013"},{"key":"e_1_2_1_121_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2014.01.001"},{"key":"e_1_2_1_122_1","volume-title":"Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP\u201914)","author":"Weninger Felix"},{"key":"e_1_2_1_123_1","volume-title":"Proceedings of the REVERB Workshop, Held in Conjunction with ICASSP 2014 and HSCMA 2014. 
1--8.","author":"Weninger Felix","year":"2014"},{"key":"e_1_2_1_124_1","volume-title":"Proceedings of the IEEE International Conference on Audio, Speech, and Signal Processing (ICASSP\u201917)","author":"Donald"},{"key":"e_1_2_1_125_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2017.2696307"},{"key":"e_1_2_1_126_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11509-7_9"},{"key":"e_1_2_1_127_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2009-375"},{"key":"e_1_2_1_128_1","first-page":"5","article-title":"Combining long short-term memory and dynamic bayesian networks for incremental emotion-sensitive artificial listening","volume":"4","author":"W\u00f6llmer Martin","year":"2010","journal-title":"IEEE J. Select. Top. Sign. Process."},{"key":"e_1_2_1_129_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6638983"},{"key":"e_1_2_1_130_1","volume-title":"Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144 (Oct","author":"Wu Yonghui","year":"2016"},{"key":"e_1_2_1_131_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2013-754"},{"key":"e_1_2_1_132_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472778"},{"key":"e_1_2_1_133_1","volume-title":"Proceedings of the CHiME Workshop. 
26--31","author":"Xiao Xiong","year":"2016"},{"key":"e_1_2_1_135_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2014-571"},{"key":"e_1_2_1_136_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2013.2291240"},{"key":"e_1_2_1_137_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2013.2291240"},{"key":"e_1_2_1_138_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2364452"},{"key":"e_1_2_1_139_1","doi-asserted-by":"publisher","DOI":"10.1145\/2168752.2168754"},{"key":"e_1_2_1_140_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205029"},{"key":"e_1_2_1_141_1","volume-title":"Proceedings of the Conference of the International Speech Communication Association (INTERSPEECH\u201915)","author":"Yu Chengzhu"},{"key":"e_1_2_1_142_1","volume-title":"Wide residual networks. arXiv:1605.07146 (May","author":"Zagoruyko Sergey","year":"2016"},{"key":"e_1_2_1_143_1","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201914)","author":"Matthew"},{"key":"e_1_2_1_144_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2017.2699358"},{"key":"e_1_2_1_145_1","first-page":"3","article-title":"Channel mapping using bidirectional long short-term memory for dereverberation in hand-free voice controlled devices","volume":"60","author":"Zhang Zixing","year":"2014","journal-title":"IEEE Trans. Cons. 
Electron."},{"key":"e_1_2_1_146_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-998"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3178115","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3178115","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:02:55Z","timestamp":1750215775000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3178115"}},"subtitle":["An Overview of Recent Developments"],"short-title":[],"issued":{"date-parts":[[2018,4,24]]},"references-count":145,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2018,9,30]]}},"alternative-id":["10.1145\/3178115"],"URL":"https:\/\/doi.org\/10.1145\/3178115","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,4,24]]},"assertion":[{"value":"2017-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-04-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}