{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T11:02:44Z","timestamp":1740135764487,"version":"3.37.3"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2016,11,22]],"date-time":"2016-11-22T00:00:00Z","timestamp":1479772800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"name":"Iran Telecommunication Research Center (ITRC)"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Circuits Syst Signal Process"],"published-print":{"date-parts":[[2017,8]]},"DOI":"10.1007\/s00034-016-0434-0","type":"journal-article","created":{"date-parts":[[2016,11,22]],"date-time":"2016-11-22T07:42:28Z","timestamp":1479800548000},"page":"3222-3242","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Spectro-temporal Power Spectrum Features for Noise Robust ASR"],"prefix":"10.1007","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3849-3061","authenticated-orcid":false,"given":"Hamed","family":"Riazati Seresht","sequence":"first","affiliation":[]},{"given":"Seyed Mohammad","family":"Ahadi","sequence":"additional","affiliation":[]},{"given":"Sanaz","family":"Seyedin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2016,11,22]]},"reference":[{"key":"434_CR1","doi-asserted-by":"crossref","unstructured":"J. Bouvrie, T. Ezzat, T. Poggio, Localized spectro-temporal cepstral analysis of speech. in Proceedings on ICASSP (Las Vegas, NV, USA, 2008)","DOI":"10.1109\/ICASSP.2008.4518714"},{"key":"434_CR2","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1016\/S0167-6393(03)00016-5","volume":"41","author":"J Chen","year":"2003","unstructured":"J. Chen, K.K. Paliwal, S. Nakamura, Cepstrum derived from differential power spectrum for robust speech recognition. Speech Commun. 41, 469\u2013484 (2003)","journal-title":"Speech Commun."},{"key":"434_CR3","doi-asserted-by":"crossref","unstructured":"S.-Y. Chang, B.T. Meyer, N. Morgan, Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction. in Proceedings on ICASSP (Vancouver, Canada, 2013)","DOI":"10.1109\/ICASSP.2013.6639032"},{"key":"434_CR4","first-page":"1","volume":"7","author":"J Demsar","year":"2006","unstructured":"J. Demsar, Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1\u201330 (2006)","journal-title":"J. Mach. Learn. Res."},{"key":"434_CR5","doi-asserted-by":"crossref","first-page":"1220","DOI":"10.1152\/jn.2001.85.3.1220","volume":"85","author":"DA Depireux","year":"2001","unstructured":"D.A. Depireux, J.Z. Simon, D.J. Klein, S.A. Shamma, Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. J. Neurophysiol. 85, 1220\u20131234 (2001)","journal-title":"J. Neurophysiol."},{"key":"434_CR6","unstructured":"ETSI standard document, Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm, ETSI ES 202 050 v.1.1.5. Nov 2003"},{"key":"434_CR7","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.csl.2006.03.002","volume":"21","author":"G Farahani","year":"2007","unstructured":"G. Farahani, S.M. Ahadi, M.M. Homayounpour, Features based on filtering and spectral peaks in autocorrelation domain for robust speech recognition. Comput. Speech Lang. 21, 187\u2013205 (2007)","journal-title":"Comput. Speech Lang."},{"key":"434_CR8","doi-asserted-by":"crossref","first-page":"3769","DOI":"10.1121\/1.3504658","volume":"128","author":"S Ganapathy","year":"2010","unstructured":"S. Ganapathy, S. Thomas, H. Hermansky, Temporal envelope compensation for robust phoneme recognition using modulation spectrum. J. Acoust. Soc. Am. 128, 3769\u20133780 (2010)","journal-title":"J. Acoust. Soc. Am."},{"key":"434_CR9","doi-asserted-by":"crossref","unstructured":"H.A. Gupta, A. Raju, A. Alwan, Non-linear dimension reduction of Gabor features for noise-robust ASR. in Proceedings on ICASSP (Florence, Italy, 2014)","DOI":"10.1109\/ICASSP.2014.6853891"},{"key":"434_CR10","unstructured":"M. Happel, S. Muller, J. Anemueller, F. Ohl, Predictability of STRFs in auditory cortex neurons depends on stimulus class. in Proceedings on Interspeech (Brisbane, Australia, 2008)"},{"key":"434_CR11","doi-asserted-by":"crossref","first-page":"1738","DOI":"10.1121\/1.399423","volume":"87","author":"H Hermansky","year":"1990","unstructured":"H. Hermansky, Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87, 1738\u20131752 (1990)","journal-title":"J. Acoust. Soc. Am."},{"issue":"4","key":"434_CR12","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1109\/89.326616","volume":"2","author":"H Hermansky","year":"1994","unstructured":"H. Hermansky, N. Morgan, Rasta processing of speech. IEEE Trans. Speech Audio Process. 2(4), 578\u2013589 (1994)","journal-title":"IEEE Trans. Speech Audio Process."},{"key":"434_CR13","unstructured":"H.-G. Hirsch, D. Pearce, The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. in Proceedings on ISCA ITRW ASR (Paris, France, 2000)"},{"key":"434_CR14","unstructured":"HTK, The hidden Markov model toolkit (2002). [Online]. Version: HTK 3.2.1 (windows). Available: http:\/\/htk.eng.cam.ac.uk"},{"key":"434_CR15","unstructured":"S. Ikbal, H. Bourlard, M. Magimai, HMM\/ANN based spectral peak location estimation for noise robust speech recognition. in Proceedings on ICASSP (Philadelphia, PA, USA, 2005)"},{"key":"434_CR16","doi-asserted-by":"crossref","unstructured":"S. Ikbal, M.M. Doss, H. Misra, H. Bourlard, Spectro-temporal activity pattern (STAP) features for robust ASR. in Proceedings on ICSLP (Jeju Island, South Korea, 2004)","DOI":"10.21437\/Interspeech.2004-641"},{"key":"434_CR17","doi-asserted-by":"crossref","unstructured":"C. Kim, R.M. Stern, Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring. in Proceedings on ICASSP (Dallas, Texas, USA, 2010)","DOI":"10.1109\/ICASSP.2010.5495570"},{"key":"434_CR18","doi-asserted-by":"crossref","unstructured":"M. Kleinschmidt, D. Gelbart, Improving word accuracy with Gabor feature extraction. in Proceedings on Interspeech (Denver, CO, USA, 2002)","DOI":"10.21437\/ICSLP.2002-5"},{"issue":"5","key":"434_CR19","doi-asserted-by":"crossref","first-page":"726","DOI":"10.1016\/j.specom.2010.08.007","volume":"53","author":"M Marki","year":"2011","unstructured":"M. Marki, Y. Stylianou, Discrimination of speech from nonspeech in broadcast news based on modulation frequency features. Speech Commun. 53(5), 726\u2013735 (2011)","journal-title":"Speech Commun."},{"key":"434_CR20","doi-asserted-by":"crossref","unstructured":"N. Mesgarani, S. Thomas, H. Hermansky, A multistream multiresolution framework for phoneme recognition. in Proceedings on Interspeech (Makuhari, Japan, 2010)","DOI":"10.21437\/Interspeech.2010-120"},{"issue":"5","key":"434_CR21","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1016\/j.specom.2010.07.002","volume":"53","author":"BT Meyer","year":"2011","unstructured":"B.T. Meyer, B. Kollmeier, Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition. Speech Commun. 53(5), 753\u2013767 (2011)","journal-title":"Speech Commun."},{"key":"434_CR22","doi-asserted-by":"crossref","unstructured":"B.T. Meyer, S.R. Ravuri, M.R. Scheadler, N. Morgan, Comparing different flavors of spectro-temporal features for ASR. in Proceedings on Interspeech (Florence, Italy, 2011)","DOI":"10.21437\/Interspeech.2011-103"},{"key":"434_CR23","doi-asserted-by":"crossref","unstructured":"B. Meyer, C. Spille, B. Kollmeier, N. Morgan, Hooking up spectro-temporal filters with auditory-inspiring representations for robust automatic speech recognition. in Proceedings on Interspeech (Portland, Oregon, USA, 2012)","DOI":"10.21437\/Interspeech.2012-386"},{"key":"434_CR24","doi-asserted-by":"crossref","unstructured":"S.K. Nemala, K. Patil, M. Elhilali, Multistream bandpass modulation features for robust speech recognition. in Proceedings on Interspeech (Florence, Italy, 2011)","DOI":"10.21437\/Interspeech.2011-105"},{"key":"434_CR25","doi-asserted-by":"crossref","DOI":"10.2174\/97816080517241110101","volume-title":"Recent advances in robust speech recognition technology","author":"J Ramirez","year":"2011","unstructured":"J. Ramirez, J.M. Gorriz, Recent advances in robust speech recognition technology (Bentham Science Publishers, Sharjah, 2011)"},{"key":"434_CR26","doi-asserted-by":"crossref","unstructured":"S.V. Ravuri, N. Morgan, Easy does it: robust spectro-temporal many-stream ASR without fine tuning streams. in Proceedings on ICASSP (Kyoto, Japan, 2012)","DOI":"10.1109\/ICASSP.2012.6288872"},{"key":"434_CR27","doi-asserted-by":"crossref","first-page":"4134","DOI":"10.1121\/1.3699200","volume":"131","author":"MR Schaedler","year":"2012","unstructured":"M.R. Schaedler, B.T. Meyer, B. Kollmeier, Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition. J. Acoust. Soc. Am. 131, 4134\u20134151 (2012)","journal-title":"J. Acoust. Soc. Am."},{"key":"434_CR28","doi-asserted-by":"crossref","first-page":"2252","DOI":"10.1587\/transinf.E93.D.2252","volume":"E93\u2013D","author":"S Seyedin","year":"2010","unstructured":"S. Seyedin, S.M. Ahadi, A new subband-weighted MVDR-based front-end for robust speech recognition. IEICE Trans. Inf. Syst. E93\u2013D, 2252\u20132261 (2010)","journal-title":"IEICE Trans. Inf. Syst."},{"key":"434_CR29","doi-asserted-by":"publisher","unstructured":"S. Seyedin, S.M. Ahadi, S. Gazor, New features using robust MVDR spectrum of filtered autocorrelation sequence for robust speech recognition. Scientific World J. 2013, 634160 (2013). doi: 10.1155\/2013\/634160","DOI":"10.1155\/2013\/634160"},{"key":"434_CR30","doi-asserted-by":"crossref","unstructured":"S. Tiberwala, H. Hermansky, Multi-band and adaptation approaches to robust speech recognition. in Proceedings on Eurospeech (Rhodes, Greece, 1997)","DOI":"10.21437\/Eurospeech.1997-411"},{"key":"434_CR31","unstructured":"A. Varga, H. Steeneken, M. Tomlinson, J.D., The NOISEX-92 study on the effect of additive noise on automatic speech recognition (Speech Research Unit, Defense Research Agency, Malvern, 1992)"},{"key":"434_CR32","doi-asserted-by":"crossref","unstructured":"M. Westphal, The use of cepstral means in conversational speech recognition. in Proceedings on Eurospeech (Rhodes, Greece, 1997)","DOI":"10.21437\/Eurospeech.1997-120"},{"key":"434_CR33","doi-asserted-by":"crossref","first-page":"1662","DOI":"10.1109\/TASL.2008.2002082","volume":"16","author":"X Xiao","year":"2008","unstructured":"X. Xiao, E.S. Chng, H. Li, Normalization of the speech modulation spectra for robust speech recognition. IEEE Trans. Audio Speech Lang. Process. 16, 1662\u20131674 (2008)","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"434_CR34","doi-asserted-by":"crossref","unstructured":"S. Zhao, N. Morgan, Multi-stream spectro-temporal features for robust speech recognition. in Proceedings on Interspeech (Brisbane, Australia, 2008)","DOI":"10.21437\/Interspeech.2008-209"},{"key":"434_CR35","doi-asserted-by":"crossref","unstructured":"S.Y. Zhao, S. Ravuri, N. Morgan, Multi-stream to many-stream: using spectro-temporal features for ASR. in Proceedings ICASSP (Dallas, Texas, USA, 2010)","DOI":"10.21437\/Interspeech.2009-747"}],"container-title":["Circuits, Systems, and Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00034-016-0434-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-016-0434-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-016-0434-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,21]],"date-time":"2023-08-21T00:39:04Z","timestamp":1692578344000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00034-016-0434-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,11,22]]},"references-count":35,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2017,8]]}},"alternative-id":["434"],"URL":"https:\/\/doi.org\/10.1007\/s00034-016-0434-0","relation":{},"ISSN":["0278-081X","1531-5878"],"issn-type":[{"type":"print","value":"0278-081X"},{"type":"electronic","value":"1531-5878"}],"subject":[],"published":{"date-parts":[[2016,11,22]]}}}