{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T11:02:25Z","timestamp":1740135745052,"version":"3.37.3"},"reference-count":59,"publisher":"Springer Science and Business Media LLC","issue":"7","license":[{"start":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T00:00:00Z","timestamp":1712966400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T00:00:00Z","timestamp":1712966400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006447","name":"University of Zurich","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006447","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Circuits Syst Signal Process"],"published-print":{"date-parts":[[2024,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Speaker verification is a biometric-based method for individual authentication. However, there are still several challenging problems in achieving high performance in short utterance text-independent conditions, maybe for weak speaker-specific features. Recently, deep learning algorithms have been used extensively in speech processing. This manuscript uses a deep belief network (DBN) as a deep generative method for feature extraction in speaker verification systems. This study aims to show the impact of using the proposed method in various challenging issues, including short utterances, text independence, language variation, and large-scale speaker verification. The proposed DBN uses MFCC as input and tries to extract more efficient features. This new representation of speaker information is evaluated in two popular speaker verification systems: GMM-UBM and i-vector-PLDA methods. The results show that, for the i-vector-PLDA system, the proposed feature decreases the EER considerably from 15.24 to 10.97%. In another experiment, DBN is used to reduce feature dimension and achieves significant results in decreasing computational time and increasing system response speed. In a case study, all the evaluations are performed for 1270 speakers of the NIST SRE2008 dataset. We show deep belief networks can be used in state-of-the-art acoustic modeling methods and more challenging datasets.<\/jats:p>","DOI":"10.1007\/s00034-024-02671-9","type":"journal-article","created":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T11:01:59Z","timestamp":1713006119000},"page":"4547-4564","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Analysis of Deep Generative Model Impact on Feature Extraction and Dimension Reduction for Short Utterance Text-Independent Speaker Verification"],"prefix":"10.1007","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3447-4330","authenticated-orcid":false,"given":"Aref","family":"Farhadipour","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2372-7969","authenticated-orcid":false,"given":"Hadi","family":"Veisi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,4,13]]},"reference":[{"key":"2671_CR1","unstructured":"M.P. Alvin, A. Martin, NIST speaker recognition evaluation chronicles. In: The Speaker and Language Recognition Workshop (ODYSSEY, 2004)"},{"key":"2671_CR2","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1186\/s40537-023-00727-2","volume":"10","author":"L Alzubaidi","year":"2023","unstructured":"L Alzubaidi J Bai A Al-Sabaawi J Santamar\u00eda A Albahri BSN Al-dabbagh MA Fadhel M Manoufali J Zhang AH Al-Timemy 2023 A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications J. Big Data 10 46 127","journal-title":"J. Big Data"},{"key":"2671_CR3","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1016\/j.neunet.2021.03.004","volume":"140","author":"Z Bai","year":"2021","unstructured":"Z Bai XL Zhang 2021 Speaker recognition based on deep learning: an overview Neural Netw. 140 65 99","journal-title":"Neural Netw."},{"key":"2671_CR4","unstructured":"A. Banerjee, A. Dubey, A. Menon, S. Nanda, G.C. Nandi, Speaker recognition using deep belief networks. arXiv:1805.08865 (2018)"},{"key":"2671_CR5","doi-asserted-by":"publisher","first-page":"1045","DOI":"10.1007\/s11036-017-0876-z","volume":"22","author":"I Bisio","year":"2017","unstructured":"I Bisio F Lavagetto C Garibotto A Sciarrone 2017 Speaker recognition exploiting D2D communications paradigm: performance evaluation of multiple observations approaches Mob. Netw. Appl. 22 1045 1057","journal-title":"Mob. Netw. Appl."},{"key":"2671_CR6","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1109\/TASSP.1979.1163209","volume":"27","author":"S Boll","year":"1979","unstructured":"S Boll 1979 Suppression of acoustic noise in speech using spectral subtraction IEEE Trans. Acoust. Speech Signal Process. 27 113 120","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"2671_CR7","doi-asserted-by":"crossref","unstructured":"T. Chen, E. Khoury, Speaker embedding conversion for backward and cross-channel compatibility. In: International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7072\u20137076 (2022)","DOI":"10.1109\/ICASSP43922.2022.9747402"},{"key":"2671_CR8","doi-asserted-by":"crossref","unstructured":"A. Chowdhury, A. Cozzo, A. Ross, Domain adaptation for speaker recognition in singing and spoken voice. In: International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7192\u20137196 (2022)","DOI":"10.1109\/ICASSP43922.2022.9746111"},{"key":"2671_CR9","doi-asserted-by":"publisher","first-page":"846","DOI":"10.1109\/TASLP.2014.2308473","volume":"22","author":"S Cumani","year":"2014","unstructured":"S Cumani O Plchot P Laface 2014 On the use of i\u2013vector posterior distributions in probabilistic linear discriminant analysis IEEE Trans. Audio Speech Lang. Process. 22 846 857","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"2671_CR10","doi-asserted-by":"publisher","first-page":"599","DOI":"10.1080\/02564602.2017.1357507","volume":"35","author":"RK Das","year":"2018","unstructured":"RK Das SM Prasanna 2018 Speaker verification from short utterance perspective: a review IETE Tech. Rev. 35 599 617","journal-title":"IETE Tech. Rev."},{"key":"2671_CR11","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1109\/TASL.2010.2064307","volume":"19","author":"N Dehak","year":"2010","unstructured":"N Dehak PJ Kenny R Dehak P Dumouchel P Ouellet 2010 Front-end factor analysis for speaker verification IEEE Trans. Audio Speech Lang. Process. 19 788 798","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"2671_CR12","doi-asserted-by":"crossref","unstructured":"B. Desplanques, J. Thienpondt, K. Demuynck, Ecapa-tdnn: emphasized channel attention, propagation and aggregation in tdnn based speaker verification. arXiv:2005.07143 (2020)","DOI":"10.21437\/Interspeech.2020-2650"},{"key":"2671_CR13","doi-asserted-by":"publisher","first-page":"1985","DOI":"10.1007\/s12652-021-02960-0","volume":"13","author":"M Dua","year":"2022","unstructured":"M Dua C Jain S Kumar 2022 LSTM and CNN based ensemble approach for spoof detection task in automatic speaker verification systems J. Ambient Intell. Hum. Comput. 13 1985 2000","journal-title":"J. Ambient Intell. Hum. Comput."},{"key":"2671_CR14","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1007\/s10772-020-09791-y","volume":"25","author":"SA El-Moneim","year":"2022","unstructured":"SA El-Moneim M Nassar MI Dessouky NA Ismail AS El-Fishawy FEA El-Samie 2022 Cancellable template generation for speaker recognition based on spectrogram patch selection and deep convolutional neural networks Int. J. Speech Tech. 25 689 696","journal-title":"Int. J. Speech Tech."},{"key":"2671_CR15","unstructured":"A. Farhadipour, ivector and GMMUBM based speaker verification MATLAB code. https:\/\/github.com\/areffarhadi\/iVector_GMMUBM_Speaker_Verification (2024)"},{"key":"2671_CR16","doi-asserted-by":"publisher","first-page":"643","DOI":"10.4218\/etrij.2017-0260","volume":"40","author":"A Farhadipour","year":"2018","unstructured":"A Farhadipour H Veisi M Asgari MA Keyvanrad 2018 Dysarthric speaker identification with different degrees of dysarthria severity using deep belief networks Etri J. 40 643 652","journal-title":"Etri J."},{"key":"2671_CR17","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1109\/89.279278","volume":"2","author":"JL Gauvain","year":"1994","unstructured":"JL Gauvain CH Lee 1994 Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains IEEE Trans. Speech Audio Process. 2 291 298","journal-title":"IEEE Trans. Speech Audio Process."},{"key":"2671_CR18","doi-asserted-by":"crossref","unstructured":"O. Ghahabi, J. Hernando, Deep belief networks for i-vector based speaker recognition, In: The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1700\u20131704 (2014)","DOI":"10.1109\/ICASSP.2014.6853888"},{"key":"2671_CR19","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1016\/j.specom.2018.10.004","volume":"105","author":"J Guo","year":"2018","unstructured":"J Guo N Xu K Qian Y Shi K Xu Y Wu A Alwan 2018 Deep neural network based i-vector mapping for speaker verification using short utterances Speech Commun. 105 92 102","journal-title":"Speech Commun."},{"key":"2671_CR20","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1109\/MSP.2012.2205597","volume":"29","author":"G Hinton","year":"2012","unstructured":"G Hinton L Deng D Yu GE Dahl AR Mohamed N Jaitly A Senior V Vanhoucke P Nguyen TN Sainath 2012 Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups IEEE Signal Process. Mag. 29 82 97","journal-title":"IEEE Signal Process. Mag."},{"key":"2671_CR21","doi-asserted-by":"publisher","first-page":"1771","DOI":"10.1162\/089976602760128018","volume":"14","author":"GE Hinton","year":"2002","unstructured":"GE Hinton 2002 Training products of experts by minimizing contrastive divergence Neural Comput. 14 1771 1800","journal-title":"Neural Comput."},{"key":"2671_CR22","doi-asserted-by":"publisher","first-page":"5947","DOI":"10.4249\/scholarpedia.5947","volume":"4","author":"GE Hinton","year":"2009","unstructured":"GE Hinton 2009 Deep belief networks Scholarpedia 4 5947","journal-title":"Scholarpedia"},{"key":"2671_CR23","doi-asserted-by":"publisher","first-page":"1527","DOI":"10.1162\/neco.2006.18.7.1527","volume":"18","author":"GE Hinton","year":"2006","unstructured":"GE Hinton S Osindero YW Teh 2006 A fast learning algorithm for deep belief nets Neural Comput. 18 1527 1554","journal-title":"Neural Comput."},{"key":"2671_CR24","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1126\/science.1127647","volume":"313","author":"GE Hinton","year":"2006","unstructured":"GE Hinton RR Salakhutdinov 2006 Reducing the dimensionality of data with neural networks Science 313 504 507","journal-title":"Science"},{"key":"2671_CR25","doi-asserted-by":"crossref","unstructured":"J.W. Jung, H. Tak, H.J. Shim, H.S. Heo, B.J. Lee, S.W. Chung, H.G. Kang, H.J. Yu, N. Evans, T. Kinnunen, SASV challenge 2022: a spoofing aware speaker verification challenge evaluation plan. arXiv:2201.10283 (2022)","DOI":"10.21437\/Interspeech.2022-11270"},{"key":"2671_CR26","doi-asserted-by":"crossref","unstructured":"S.S. Kajarekar, N. Scheffer, M. Graciarena, E. Shriberg, A. Stolcke, L. Ferrer, T. Bocklet, The SRI NIST 2008 speaker recognition evaluation system. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4205\u20134208 (2009)","DOI":"10.1109\/ICASSP.2009.4960556"},{"key":"2671_CR27","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1016\/j.specom.2014.01.004","volume":"59","author":"A Kanagasundaram","year":"2014","unstructured":"A Kanagasundaram D Dean S Sridharan J Gonzalez-Dominguez J Gonzalez-Rodriguez D Ramos 2014 Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques Speech Commun. 59 69 82","journal-title":"Speech Commun."},{"key":"2671_CR28","doi-asserted-by":"crossref","unstructured":"A. Kanagasundaram, S. Sridharan, S. Ganapathy, P. Singh, C. Fookes, A study of x-vector based speaker recognition on short utterances. In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2943\u20132947 (2019)","DOI":"10.21437\/Interspeech.2019-1891"},{"key":"2671_CR29","first-page":"1","volume":"36","author":"V Karthikeyan","year":"2022","unstructured":"V Karthikeyan 2022 Modified layer deep convolution neural network for text-independent speaker recognition J. Exp. Theo. Artif. Intell. 36 1 13","journal-title":"J. Exp. Theo. Artif. Intell."},{"key":"2671_CR30","doi-asserted-by":"publisher","first-page":"1435","DOI":"10.1109\/TASL.2006.881693","volume":"15","author":"P Kenny","year":"2007","unstructured":"P Kenny G Boulianne P Ouellet P Dumouchel 2007 Joint factor analysis versus eigenchannels in speaker recognition IEEE Trans. Audio Speech Lang. Process. 15 1435 1447","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"2671_CR31","doi-asserted-by":"crossref","unstructured":"P. Kenny, T. Stafylakis, P. Ouellet, M.J. Alam, P. Dumouchel, PLDA for speaker verification with utterances of arbitrary duration. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7649\u20137653 (2013)","DOI":"10.1109\/ICASSP.2013.6639151"},{"key":"2671_CR32","doi-asserted-by":"crossref","unstructured":"M.A. Keyvanrad, M.M. Homayounpour, A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet). arXiv:1408.3264 (2014)","DOI":"10.1109\/IJCNN.2015.7280688"},{"key":"2671_CR33","doi-asserted-by":"publisher","first-page":"633","DOI":"10.1109\/TASLP.2018.2789399","volume":"26","author":"WB Kheder","year":"2018","unstructured":"WB Kheder D Matrouf M Ajili JF Bonastre 2018 A unified joint model to deal with nuisance variabilities in the i-vector space IEEE Trans. Audio Speech Lang. Process. 26 633 645","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"2671_CR34","doi-asserted-by":"crossref","unstructured":"L. Li, D. Wang, W. Du, D. Wang, CP map: a novel evaluation toolkit for speaker verification. arXiv:2203.02942 (2022)","DOI":"10.21437\/Odyssey.2022-43"},{"key":"2671_CR35","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1016\/j.csl.2013.07.003","volume":"28","author":"MW Mak","year":"2014","unstructured":"MW Mak HB Yu 2014 A study of voice activity detection techniques for NIST speaker recognition evaluations Comput. Speech Lang. 28 295 313","journal-title":"Comput. Speech Lang."},{"key":"2671_CR36","doi-asserted-by":"publisher","first-page":"755","DOI":"10.1109\/TASL.2011.2164533","volume":"20","author":"M McLaren","year":"2011","unstructured":"M McLaren D Leeuwen Van 2011 Source-normalized LDA for robust speaker recognition using i-vectors from multiple speech sources IEEE Trans. Audio Speech Lang. Process. 20 755 766","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"2671_CR37","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1109\/79.543975","volume":"13","author":"TK Moon","year":"1996","unstructured":"TK Moon 1996 The expectation-maximization algorithm IEEE Signal Process. Mag. 13 47 60","journal-title":"IEEE Signal Process. Mag."},{"key":"2671_CR38","doi-asserted-by":"publisher","first-page":"116469","DOI":"10.1016\/j.eswa.2021.116469","volume":"193","author":"AB Nassif","year":"2022","unstructured":"AB Nassif I Shahin A Elnagar D Velayudhan A Alhudhaif K Polat 2022 Emotional speaker identification using a novel capsule nets model Expert Syst. Appl. 193 116469","journal-title":"Expert Syst. Appl."},{"key":"2671_CR39","doi-asserted-by":"crossref","unstructured":"D. Nongrum, F. Pyrtuh, A comparative study on effect of temporal phase for speaker verification. In: Proceedings of International Conference on Frontiers in Computing and Systems (COMSYS), pp. 571\u2013578 (2021)","DOI":"10.1007\/978-981-19-0105-8_56"},{"key":"2671_CR40","doi-asserted-by":"publisher","first-page":"123028","DOI":"10.1109\/ACCESS.2022.3223365","volume":"10","author":"PG Patil","year":"2022","unstructured":"PG Patil TH Jaware SP Patil RD Badgujar F Albu I Mahariq B Al-Sheikh C Nayak 2022 Marathi speech intelligibility enhancement using I-AMS based neuro-fuzzy classifier approach for hearing aid users IEEE Access 10 123028 123042","journal-title":"IEEE Access"},{"key":"2671_CR41","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1049\/iet-bmt.2017.0065","volume":"7","author":"A Poddar","year":"2018","unstructured":"A Poddar M Sahidullah G Saha 2018 Speaker verification with short utterances: a review of challenges, trends and opportunities IET Biom. 7 91 101","journal-title":"IET Biom."},{"key":"2671_CR42","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1016\/j.dsp.2019.01.023","volume":"88","author":"A Poddar","year":"2019","unstructured":"A Poddar M Sahidullah G Saha 2019 Quality measures for speaker verification with short utterances Digit. Signal Process. 88 66 79","journal-title":"Digit. Signal Process."},{"key":"2671_CR43","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1006\/dspr.1999.0361","volume":"10","author":"DA Reynolds","year":"2000","unstructured":"DA Reynolds TF Quatieri RB Dunn 2000 Speaker verification using adapted Gaussian mixture models Digit. Signal Process. 10 19 41","journal-title":"Digit. Signal Process."},{"key":"2671_CR44","doi-asserted-by":"publisher","first-page":"1671","DOI":"10.1109\/LSP.2015.2420092","volume":"22","author":"F Richardson","year":"2015","unstructured":"F Richardson D Reynolds N Dehak 2015 Deep neural network approaches to speaker and language recognition IEEE Signal Process. Lett. 22 1671 1675","journal-title":"IEEE Signal Process. Lett."},{"key":"2671_CR45","first-page":"1","volume":"1","author":"SO Sadjadi","year":"2013","unstructured":"SO Sadjadi M Slaney L Heck 2013 MSR identity toolbox v1.0: a MATLAB toolbox for speaker-recognition research Speech Lang. Process. Techn. Comm. Newsl. 1 1 32","journal-title":"Speech Lang. Process. Techn. Comm. Newsl."},{"key":"2671_CR46","first-page":"300982","volume":"34","author":"S Saleem","year":"2020","unstructured":"S Saleem F Subhan N Naseer A Bais A Imtiaz 2020 Forensic speaker recognition: a new method based on extracting accent and language information from short utterances Forens. Sci. Int. Digit. Investig. 34 300982","journal-title":"Forens. Sci. Int. Digit. Investig."},{"key":"2671_CR47","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1007\/s10772-019-09618-5","volume":"22","author":"L Sun","year":"2019","unstructured":"L Sun T Gu K Xie J Chen 2019 Text-independent speaker identification based on deep Gaussian correlation supervector Int. J. Speech Tech. 22 449 457","journal-title":"Int. J. Speech Tech."},{"key":"2671_CR48","unstructured":"D. Sztah\u00f3, G. Szasz\u00e1k, A. Beke, Deep learning methods in speaker recognition: a review. arXiv:1911.06615 (2019)"},{"key":"2671_CR49","doi-asserted-by":"crossref","unstructured":"H. Tak, M. Todisco, X. Wang, J.W. Jung, J. Yamagishi, N. Evans, Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation. arXiv:2202.12233 (2022)","DOI":"10.21437\/Odyssey.2022-16"},{"key":"2671_CR50","doi-asserted-by":"crossref","unstructured":"M. Takamizawa, S. Tsuge, Y. Horiuchi, S. Kuroiwa, Same speaker identification with deep learning and application to text-dependent speaker verification. In: Human Centred Intelligent Systems Conference, pp. 149\u2013158 (2022)","DOI":"10.1007\/978-981-19-3455-1_11"},{"key":"2671_CR51","doi-asserted-by":"crossref","unstructured":"Y. Tang, G. Ding, J. Huang, X. He, B. Zhou, Deep speaker embedding learning with multi-level pooling for text-independent speaker verification. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), pp. 6116\u20136120 (2019)","DOI":"10.1109\/ICASSP.2019.8682712"},{"key":"2671_CR52","doi-asserted-by":"crossref","unstructured":"F. Tong, M. Zhao, J. Zhou, H. Lu, Z. Li, L. Li, Q. Hong, ASV-subtools: open source toolkit for automatic speaker verification. In: International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6184\u20136188 (2021)","DOI":"10.1109\/ICASSP39728.2021.9414676"},{"key":"2671_CR53","doi-asserted-by":"crossref","unstructured":"E. Variani, X. Lei, E. McDermott, I. L. Moreno, J. Gonzalez-Dominguez, Deep neural networks for small footprint text-dependent speaker verification. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4052\u20134056 (2014)","DOI":"10.1109\/ICASSP.2014.6854363"},{"key":"2671_CR54","doi-asserted-by":"crossref","unstructured":"S. Wang, J. Rohdin, L. Burget, O. Plchot, Y. Qian, K. Yu, J. Cernock\u00fd, On the usage of phonetic information for text-independent speaker embedding extraction. In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 1148\u20131152 (2019)","DOI":"10.21437\/Interspeech.2019-3036"},{"key":"2671_CR55","doi-asserted-by":"crossref","unstructured":"X. Wang, L. Li, D. Wang, VAE-based domain adaptation for speaker verification. In: The Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp. 535\u2013539 (2019)","DOI":"10.1109\/APSIPAASC47483.2019.9023015"},{"key":"2671_CR56","doi-asserted-by":"crossref","unstructured":"Z. Wu, S. Wang, Y. Qian, K. Yu, Data augmentation using variational autoencoder for embedding based speaker verification. In: Proceedings of the Annual Conference of the International Speech Communication Association, (INTERSPEECH), pp. 1163\u20131167 (2019)","DOI":"10.21437\/Interspeech.2019-2248"},{"key":"2671_CR57","unstructured":"S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, The HTK book. Cambridge University Engineering Department (2002)"},{"key":"2671_CR58","unstructured":"Y.Q. Yu, W.J. Li, Densely connected time delay neural network for speaker verification. In: Proceedings of the Annual Conference of the International Speech Communication Association, (INTERSPEECH), pp. 921\u2013925 (2020)"},{"key":"2671_CR59","doi-asserted-by":"publisher","first-page":"4068","DOI":"10.1007\/s00034-022-01974-z","volume":"41","author":"Y Zhao","year":"2022","unstructured":"Y Zhao R Togneri V Sreeram 2022 Multi-task learning-based spoofing-robust automatic speaker verification system Circuits Syst. Signal Process. 41 4068 4089","journal-title":"Circuits Syst. Signal Process."}],"container-title":["Circuits, Systems, and Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-024-02671-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00034-024-02671-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00034-024-02671-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,16]],"date-time":"2024-07-16T11:14:44Z","timestamp":1721128484000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00034-024-02671-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,13]]},"references-count":59,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,7]]}},"alternative-id":["2671"],"URL":"https:\/\/doi.org\/10.1007\/s00034-024-02671-9","relation":{},"ISSN":["0278-081X","1531-5878"],"issn-type":[{"type":"print","value":"0278-081X"},{"type":"electronic","value":"1531-5878"}],"subject":[],"published":{"date-parts":[[2024,4,13]]},"assertion":[{"value":"22 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 March 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 April 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"This paper reflects the authors\u2019 own research and analysis truthfully and completely and is not currently being considered for publication elsewhere.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval"}}]}}