{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,2]],"date-time":"2022-04-02T14:36:01Z","timestamp":1648910161546},"reference-count":23,"publisher":"World Scientific Pub Co Pte Lt","issue":"07","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:p> The mismatch between the training and the testing environments greatly degrades the performance of speaker recognition. Although many robust techniques have been proposed, speaker recognition in mismatch condition is still a challenge. To solve this problem, we propose a sparse-based auditory model as the front-end of speaker recognition by simulating auditory processing of speech signal. To this end, we introduce narrow-band filter-bank instead of the widely used wide-band filter-bank to simulate the basilar membrane filter-bank, use sparse representation as the approximation of basilar membrane coding strategy, and incorporate the frequency selectivity enhance mechanism between tectorial membrane and basilar membrane by practical engineering approximation. Compared with the standard Mel-frequency cepstral coefficient approach, our preliminary experimental results indicate that the sparse-based auditory model consistently improve the robustness of speaker recognition in mismatched condition. <\/jats:p>","DOI":"10.1142\/s0218001412500152","type":"journal-article","created":{"date-parts":[[2012,11,12]],"date-time":"2012-11-12T01:29:09Z","timestamp":1352683749000},"page":"1250015","source":"Crossref","is-referenced-by-count":1,"title":["SPARSE-BASED AUDITORY MODEL FOR ROBUST SPEAKER RECOGNITION"],"prefix":"10.1142","volume":"26","author":[{"given":"DATAO","family":"YOU","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, 150001, P. R. China"}]},{"given":"JIQING","family":"HAN","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, 150001, P. R. China"}]},{"given":"TIERAN","family":"ZHENG","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, 150001, P. R. China"}]},{"given":"GUIBIN","family":"ZHENG","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, 150001, P. R. China"}]}],"member":"219","published-online":{"date-parts":[[2013,2,17]]},"reference":[{"key":"rf2","doi-asserted-by":"publisher","DOI":"10.1016\/j.heares.2010.05.001"},{"key":"rf3","doi-asserted-by":"publisher","DOI":"10.1121\/1.1914702"},{"key":"rf5","doi-asserted-by":"publisher","DOI":"10.1121\/1.3273893"},{"key":"rf6","doi-asserted-by":"publisher","DOI":"10.1214\/009053606000001523"},{"key":"rf8","doi-asserted-by":"publisher","DOI":"10.1109\/89.279278"},{"key":"rf9","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0703665104"},{"key":"rf10","doi-asserted-by":"publisher","DOI":"10.1109\/89.260357"},{"key":"rf11","doi-asserted-by":"publisher","DOI":"10.1016\/0378-5955(95)00229-4"},{"key":"rf13","doi-asserted-by":"publisher","DOI":"10.1529\/biophysj.107.124727"},{"key":"rf14","doi-asserted-by":"publisher","DOI":"10.1121\/1.399423"},{"key":"rf15","doi-asserted-by":"publisher","DOI":"10.1109\/89.326616"},{"key":"rf16","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.0060016"},{"key":"rf17","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(97)00061-7"},{"key":"rf18","volume":"7","author":"Kim D.-S.","year":"1999","journal-title":"IEEE Trans. Speech Audio Process."},{"key":"rf20","doi-asserted-by":"publisher","DOI":"10.1006\/csla.1995.0010"},{"key":"rf22","volume-title":"A Wavelet Tour of Signal Processing, the Sparse Way","author":"Mallat S.","year":"2009"},{"key":"rf23","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.899278"},{"key":"rf25","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2010.2042490"},{"key":"rf28","doi-asserted-by":"publisher","DOI":"10.1038\/nature04485"},{"key":"rf29","volume":"28","author":"Steven B. D.","year":"1980","journal-title":"IEEE Trans. Acoustics, Speech Signal Process. ASSP"},{"key":"rf30","doi-asserted-by":"publisher","DOI":"10.1037\/h0046162"},{"key":"rf31","doi-asserted-by":"publisher","DOI":"10.1016\/j.brainres.2007.11.059"},{"key":"rf32","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2009.08.009"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001412500152","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,7]],"date-time":"2019-08-07T14:37:02Z","timestamp":1565188622000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218001412500152"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":23,"journal-issue":{"issue":"07","published-online":{"date-parts":[[2013,2,17]]},"published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1142\/S0218001412500152"],"URL":"https:\/\/doi.org\/10.1142\/s0218001412500152","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"value":"0218-0014","type":"print"},{"value":"1793-6381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,11]]}}}