{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T16:38:09Z","timestamp":1781109489445,"version":"3.54.1"},"reference-count":59,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T00:00:00Z","timestamp":1754006400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T00:00:00Z","timestamp":1754006400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100009532","name":"Ministerstvo Vnitra Cesk\u00e9 Republiky","doi-asserted-by":"publisher","award":["VB02000060"],"award-info":[{"award-number":["VB02000060"]}],"id":[{"id":"10.13039\/100009532","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100017520","name":"Fakulta Informacn\u00edch Technologi\u00ed, Vysok\u00e9 Ucen\u00ed Technick\u00e9 v Brne","doi-asserted-by":"publisher","award":["FIT-S-23-8151"],"award-info":[{"award-number":["FIT-S-23-8151"]}],"id":[{"id":"10.13039\/501100017520","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The proliferation of deepfake speech poses a significant threat to cybersecurity, from manipulating political speeches and impersonating public figures to spoofing voice biometric systems. The increasing sophistication of adversaries increases the necessity of deploying adaptive detection methods. Moreover, real-world incidents such as fraudulent financial transactions highlight the severity of the problem. Although numerous detectors have been developed, their evaluation remains difficult due to different methodologies and benchmark datasets, making direct comparisons impossible. This study presents a general and detailed framework for evaluating and comparing deepfake speech detectors. We further demonstrate the use of this framework to evaluate 40 state-of-the-art deepfake speech detectors under various conditions and data samples. We objectively compare these methods and identify the key attributes influencing performance the most. We also stress the issue of generalisation, as current detectors struggle to detect previously unseen deepfake speech samples or samples that have been modified. Finally, to strengthen the defence against synthetic audio content, we provide recommendations for improving the robustness of future detectors.<\/jats:p>","DOI":"10.1186\/s42400-024-00346-1","type":"journal-article","created":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T03:03:04Z","timestamp":1754017384000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Evaluation framework for deepfake speech detection: a comparative study of state-of-the-art deepfake speech detectors"],"prefix":"10.1186","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4717-1910","authenticated-orcid":false,"given":"Anton","family":"Firc","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kamil","family":"Malinka","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Petr","family":"Han\u00e1\u010dek","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,8,1]]},"reference":[{"key":"346_CR1","unstructured":"Ahmed M.E, Kwak I.-Y, Huh J.H, Kim I, Oh T, Kim H: Void: A fast and light voice liveness detection system. In: 29th USENIX Security Symposium (USENIX Security 20), pp. 2685\u20132702. USENIX Association, (2020). https:\/\/www.usenix.org\/conference\/usenixsecurity20\/presentation\/ahmed-muhammad"},{"key":"346_CR2","doi-asserted-by":"publisher","DOI":"10.3390\/a15050155","author":"Z Almutairi","year":"2022","unstructured":"Almutairi Z, Elgibreen H (2022) A review of modern audio deepfake detection methods: challenges and future directions. Algorithms. https:\/\/doi.org\/10.3390\/a15050155","journal-title":"Algorithms"},{"key":"346_CR3","doi-asserted-by":"publisher","unstructured":"Ardila R, Branson M, Davis K, Henretty M, Kohler M, Meyer J, Morais R, Saunders L, Tyers FM, Weber G (2019) Common voice: a massively-multilingual speech corpus. arXiv. https:\/\/doi.org\/10.48550\/ARXIV.1912.06670","DOI":"10.48550\/ARXIV.1912.06670"},{"key":"346_CR4","doi-asserted-by":"publisher","unstructured":"Borodin K, Kudryavtsev V, Korzh D, Efimenko A, Mkrtchian G, Gorodnichev M, Rogov OY (2024) Aasist3: Kan-enhanced aasist speech deepfake detection using ssl features and additional regularization for the asvspoof 2024 challenge. In: The automatic speaker verification spoofing countermeasures workshop (ASVspoof 2024). ISCA, Kos, Greece, pp 48\u201355. https:\/\/doi.org\/10.21437\/asvspoof.2024-8","DOI":"10.21437\/asvspoof.2024-8"},{"key":"346_CR5","unstructured":"Brewster T (2022) Fraudsters cloned company director\u2019s voice in \\$35 million bank heist, police find. Forbes Magazine. https:\/\/www.forbes.com\/sites\/thomasbrewster\/2021\/10\/14\/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions\/?sh=776258a75591"},{"key":"346_CR6","unstructured":"Casanova E, Weber J, Shulby CD, Junior AC, G\u00f6lge E, Ponti MA (2022) YourTTS: towards zero-shot multi-speaker TTS and zero-shot voice conversion for everyone. In: Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S (eds) Proceedings of the 39th international conference on machine learning. Proceedings of machine learning research, vol 162. PMLR, pp 2709\u20132720. https:\/\/proceedings.mlr.press\/v162\/casanova22a.html"},{"key":"346_CR7","doi-asserted-by":"publisher","unstructured":"Chen X, Zhang Y, Zhu G, Duan Z (2021) UR channel-robust synthetic speech detection system for ASVspoof 2021. In: Proc. 2021 edition of the automatic speaker verification and spoofing countermeasures challenge, pp 75\u201382. https:\/\/doi.org\/10.21437\/ASVSPOOF.2021-12","DOI":"10.21437\/ASVSPOOF.2021-12"},{"key":"346_CR8","doi-asserted-by":"publisher","unstructured":"Dao A-T, Rouvier M, Matrouf D (2024) ASVspoof 5 challenge: advanced ResNet architectures for robust voice spoofing detection. In: The automatic speaker verification spoofing countermeasures workshop (ASVspoof 2024). ISCA, Kos, Greece, pp 163\u2013169. https:\/\/doi.org\/10.21437\/asvspoof.2024-24","DOI":"10.21437\/asvspoof.2024-24"},{"key":"346_CR9","unstructured":"Dessa (2020) Detecting audio deep fakes with AI. Dessa News. https:\/\/medium.com\/dessa-news\/detecting-audio-deepfakes-f2edfd8e2b35"},{"issue":"4","key":"346_CR10","doi-asserted-by":"publisher","first-page":"15090","DOI":"10.1016\/j.heliyon.2023.e15090","volume":"9","author":"A Firc","year":"2023","unstructured":"Firc A, Malinka K, Han\u00e1\u010dek P (2023) Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors. Heliyon 9(4):15090. https:\/\/doi.org\/10.1016\/j.heliyon.2023.e15090","journal-title":"Heliyon"},{"key":"346_CR11","doi-asserted-by":"publisher","unstructured":"Firc A, Malinka K (2022) The dawn of a text-dependent society: deepfakes as a threat to speech verification systems. In: Proceedings of the 37th ACM\/SIGAPP symposium on applied computing. SAC \u201922. Association for Computing Machinery, New York, NY, USA, pp 1646\u20131655. https:\/\/doi.org\/10.1145\/3477314.3507013","DOI":"10.1145\/3477314.3507013"},{"key":"346_CR12","doi-asserted-by":"publisher","unstructured":"Firc A, Malinka K, Han\u00e1\u010dek P (2024) Deepfake speech detection: a spectrogram analysis. In: Proceedings of the 39th ACM\/SIGAPP symposium on applied computing. SAC \u201924. Association for computing machinery, New York, NY, USA, pp 1312\u20131320. https:\/\/doi.org\/10.1145\/3605098.3635911","DOI":"10.1145\/3605098.3635911"},{"key":"346_CR13","doi-asserted-by":"crossref","unstructured":"Firc A, Malinka K, Han\u00e1\u010dek P: Diffuse or confuse: A diffusion deepfake speech dataset. In: 2024 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1\u20137 (2024). https:\/\/doi.org\/10.1109\/BIOSIG61931.2024.10786752","DOI":"10.1109\/BIOSIG61931.2024.10786752"},{"key":"346_CR14","unstructured":"Frank J, Sch\u00f6nherr L (2021) WaveFake: a data set to facilitate audio deepfake detection"},{"key":"346_CR15","doi-asserted-by":"crossref","unstructured":"Ge W, Patino J, Todisco M, Evans N (2021) Raw differentiable architecture search for speech deepfake and spoofing detection. In: Proc. 2021 edition of the automatic speaker verification and spoofing countermeasures challenge, pp 22\u201328","DOI":"10.21437\/ASVSPOOF.2021-4"},{"key":"346_CR16","doi-asserted-by":"publisher","unstructured":"Ge W, Todisco M, Evans N (2022) Explainable deepfake and spoofing detection: an attack analysis using shapley additive exPlanations. In: Proc. the speaker and language recognition workshop (Odyssey 2022), pp 70\u201376. https:\/\/doi.org\/10.21437\/Odyssey.2022-10","DOI":"10.21437\/Odyssey.2022-10"},{"key":"346_CR17","doi-asserted-by":"publisher","unstructured":"Hai X, Liu X, Tan Y, Zhou Q (2023) Sifdetectcracker: an adversarial attack against fake voice detection based on speaker-irrelative features. In: Proceedings of the 31st ACM international conference on multimedia. MM \u201923. Association for computing machinery, New York, NY, USA, pp 8552\u20138560. https:\/\/doi.org\/10.1145\/3581783.3613841","DOI":"10.1145\/3581783.3613841"},{"key":"346_CR18","unstructured":"He Huang M, Zhang P, Tiovalen JR, Balji M (2022) Audio deepfake detection. GitHub"},{"key":"346_CR19","doi-asserted-by":"publisher","unstructured":"Hsu C-C, Hwang H-T, Wu Y-C, Tsao Y, Wang H-M (2016) Voice conversion from non-parallel corpora using variational auto-encoder. In: 2016 Asia-Pacific signal and information processing association annual summit and conference (APSIPA), pp 1\u20136. https:\/\/doi.org\/10.1109\/APSIPA.2016.7820786","DOI":"10.1109\/APSIPA.2016.7820786"},{"key":"346_CR20","doi-asserted-by":"publisher","unstructured":"ISCA (2024) The automatic speaker verification spoofing countermeasures workshop 2024. https:\/\/doi.org\/10.21437\/asvspoof.2024","DOI":"10.21437\/asvspoof.2024"},{"key":"346_CR21","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.114591","volume":"171","author":"R Jahangir","year":"2021","unstructured":"Jahangir R, Teh YW, Nweke HF, Mujtaba G, Al-Garadi MA, Ali I (2021) Speaker identification through artificial intelligence techniques: a comprehensive review and research challenges. Expert Syst Appl 171:114591. https:\/\/doi.org\/10.1016\/j.eswa.2021.114591","journal-title":"Expert Syst Appl"},{"key":"346_CR22","unstructured":"Jia Y, Zhang Y, Weiss R, Wang Q, Shen J, Ren F, Chen Z, Nguyen P, Pang R, Lopez Moreno I, Wu Y (2018) Transfer learning from speaker verification to multispeaker text-to-speech synthesis. In: Advances in Neural Information Processing Systems, vol 31. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2018\/file\/6832a7b24bc06775d02b7406880b93fc-Paper.pdf"},{"key":"346_CR23","doi-asserted-by":"publisher","unstructured":"Kawa P, Plata M, Syga P (2022). Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection. Proceedings of Interspeech 2022, 4023-4027. ISCA. https:\/\/doi.org\/10.21437\/interspeech.2022-10078","DOI":"10.21437\/interspeech.2022-10078"},{"key":"346_CR24","doi-asserted-by":"publisher","unstructured":"Kawa P, Plata M, Syga P (2023) Defense against adversarial attacks on audio DeepFake detection. In: Proc. INTERSPEECH 2023, pp 5276\u20135280. https:\/\/doi.org\/10.21437\/Interspeech.2023-409","DOI":"10.21437\/Interspeech.2023-409"},{"key":"346_CR25","unstructured":"Khalid H, Tariq S, Kim M, Woo SS (2021) FakeAVCeleb: a novel audio-video multimodal deepfake dataset. In: Thirty-fifth conference on neural information processing systems datasets and benchmarks track (Round 2). https:\/\/openreview.net\/forum?id=TAXFsg6ZaOl"},{"key":"346_CR26","unstructured":"Kim J, Kong J, Son J (2021) Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech. In: Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol 139, pp 5530\u20135540. PMLR. https:\/\/proceedings.mlr.press\/v139\/kim21f.html"},{"key":"346_CR27","unstructured":"Kong J, Kim J, Bae J (2020) HiFi-GAN: Generative adversarial networks for efficient and high fidelity speech synthesis. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H (eds) Advances in neural information processing systems, vol 33. Curran Associates, Inc., pp 17022\u201317033. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/c5d736809766d46260d816d8dbc9eb44-Paper.pdf"},{"key":"346_CR28","unstructured":"Kumar K, Kumar R, de Boissiere T, Gestin L, Teoh WZ, Sotelo J, de Br\u00e9bisson A, Bengio Y, Courville AC (2019) MelGAN: Generative adversarial networks for conditional waveform synthesis. In: Advances in Neural Information Processing Systems, vol 32. Curran Associates, Inc. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2019\/file\/6804c9bca0a615bdb9374d00a9fcba59-Paper.pdf"},{"key":"346_CR29","doi-asserted-by":"publisher","unstructured":"Li M, Ahmadiadli Y, Zhang X-P (2022) A comparative study on physical and perceptual features for deepfake audio detection. In: Proceedings of the 1st international workshop on deepfake detection for audio multimedia. DDAM \u201922, Association for Computing Machinery, New York, NY, USA, pp 35\u201341. https:\/\/doi.org\/10.1145\/3552466.3556523","DOI":"10.1145\/3552466.3556523"},{"key":"346_CR30","doi-asserted-by":"publisher","unstructured":"Liu L-J, Ling Z-H, Jiang Y, Zhou M, Dai L-R (2018) WaveNet vocoder with limited training data for voice conversion. In: Proc. interspeech 2018, pp 1983\u20131987. https:\/\/doi.org\/10.21437\/Interspeech.2018-1190","DOI":"10.21437\/Interspeech.2018-1190"},{"key":"346_CR31","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1007\/978-3-031-70879-4_4","volume-title":"Computer security\u2014ESORICS 2024","author":"K Malinka","year":"2024","unstructured":"Malinka K, Firc A, Ka\u0161ka P, Lap\u0161ansk\u00fd T, \u0160andor O, Homoliak I (2024) Resilience of voice assistants to synthetic speech. In: Garcia-Alfaro J, Kozik R, Chora\u015b M, Katsikas S (eds) Computer security\u2014ESORICS 2024. Springer, Cham, pp 66\u201384"},{"key":"346_CR32","doi-asserted-by":"publisher","DOI":"10.1186\/s13640-024-00641-4","author":"K Malinka","year":"2024","unstructured":"Malinka K, Firc A, \u0160alko M, Prudk\u00fd D, Rada\u010dovsk\u00e1 K, Han\u00e1\u010dek P (2024) Comprehensive multiparametric analysis of human deepfake speech recognition. EURASIP J Image Video Process. https:\/\/doi.org\/10.1186\/s13640-024-00641-4","journal-title":"EURASIP J Image Video Process"},{"issue":"4","key":"346_CR33","doi-asserted-by":"publisher","first-page":"3974","DOI":"10.1007\/s10489-022-03766-z","volume":"53","author":"M Masood","year":"2023","unstructured":"Masood M, Nawaz M, Malik KM, Javed A, Irtaza A, Malik H (2023) Deepfakes generation and detection: state-of-the-art, open challenges, countermeasures, and way forward. Appl Intell 53(4):3974\u20134026. https:\/\/doi.org\/10.1007\/s10489-022-03766-z","journal-title":"Appl Intell"},{"key":"346_CR34","doi-asserted-by":"publisher","unstructured":"Matrouf D, Bonastre J-F, Fredouille C (2006) Effect of speech transformation on impostor acceptance. In: 2006 IEEE international conference on acoustics speech and signal processing proceedings, vol 1. https:\/\/doi.org\/10.1109\/ICASSP.2006.1660175","DOI":"10.1109\/ICASSP.2006.1660175"},{"key":"346_CR35","doi-asserted-by":"publisher","unstructured":"Mcuba M, Singh A, Ikuesan RA, Venter H (2023) The effect of deep learning methods on deepfake audio detection for digital investigation. Proc Comput Sci 219:211\u2013219. https:\/\/doi.org\/10.1016\/j.procs.2023.01.283. (CENTERIS - International Conference on ENTERprise Information Systems \/ ProjMAN - International Conference on Project MANagement \/ HCist - International Conference on Health and Social Care Information Systems and Technologies 2022)","DOI":"10.1016\/j.procs.2023.01.283"},{"key":"346_CR36","doi-asserted-by":"publisher","first-page":"144497","DOI":"10.1109\/ACCESS.2023.3344653","volume":"11","author":"R Mubarak","year":"2023","unstructured":"Mubarak R, Alsboui T, Alshaikh O, Inuwa-Dutse I, Khan S, Parkinson S (2023) A survey on the detection and impacts of deepfakes in visual, audio, and textual formats. IEEE Access 11:144497\u2013144529 https:\/\/doi.org\/10.1109\/ACCESS.2023.3344653","journal-title":"IEEE Access"},{"key":"346_CR37","doi-asserted-by":"publisher","unstructured":"M\u00fcller N, Czempin P, Diekmann F, Froghyar A, B\u00f6ttinger K (2022) Does audio deepfake detection generalize? In: Proc. interspeech 2022, pp 2783\u20132787. https:\/\/doi.org\/10.21437\/Interspeech.2022-108","DOI":"10.21437\/Interspeech.2022-108"},{"key":"346_CR38","unstructured":"O\u2019Donnell L (2019) CEO \u2019Deep fake\u2019 swindles company out of \\$243K. https:\/\/threatpost.com\/deep-fake-of-ceos-voice-swindles-company-out-of-243k\/147982\/"},{"key":"346_CR39","unstructured":"Popov V, Vovk I, Gogoryan V, Sadekova T, Kudinov M (2021) Grad-TTS: a diffusion probabilistic model for text-to-speech. In: Meila M, Zhang T (eds) Proceedings of the 38th international conference on machine learning. Proceedings of machine learning research, vol 139. PMLR, pp 8599\u20138608. https:\/\/proceedings.mlr.press\/v139\/popov21a.html"},{"key":"346_CR40","doi-asserted-by":"publisher","unstructured":"Pratap V, Xu Q, Sriram A, Synnaeve G, Collobert R (2020). MLS: A Large-Scale Multilingual Dataset for Speech Research. Interspeech 2020, 2757-2761. https:\/\/doi.org\/10.21437\/interspeech.2020-2826","DOI":"10.21437\/interspeech.2020-2826"},{"key":"346_CR41","doi-asserted-by":"crossref","unstructured":"Prenger R, Valle R, Catanzaro B (2018) WaveGlow: a flow-based generative network for speech synthesis","DOI":"10.1109\/ICASSP.2019.8683143"},{"key":"346_CR42","doi-asserted-by":"publisher","unstructured":"Prudk\u00fd D, Firc A, Malinka K (2023) Assessing the human ability to recognize synthetic speech in ordinary conversation. In: 2023 international conference of the biometrics special interest group (BIOSIG), pp 1\u20135. https:\/\/doi.org\/10.1109\/BIOSIG58226.2023.10346006","DOI":"10.1109\/BIOSIG58226.2023.10346006"},{"key":"346_CR43","doi-asserted-by":"publisher","unstructured":"Reimao R, Tzerpos V (2019) FoR: a dataset for synthetic speech detection. In: 2019 international conference on speech technology and human-computer dialogue (SpeD), pp 1\u201310. https:\/\/doi.org\/10.1109\/SPED.2019.8906599","DOI":"10.1109\/SPED.2019.8906599"},{"key":"346_CR44","doi-asserted-by":"publisher","unstructured":"Reimao R, Tzerpos V (2021) Synthetic speech detection using neural networks. In: 2021 international conference on speech technology and human-computer dialogue (SpeD), pp 97\u2013102. https:\/\/doi.org\/10.1109\/SpeD53181.2021.9587406","DOI":"10.1109\/SpeD53181.2021.9587406"},{"key":"346_CR45","doi-asserted-by":"publisher","unstructured":"Rohdin J, Zhang L, Old\u0159ich P, Stan\u011bk V, Mihola D, Peng J, Stafylakis T, Beveraki D, Silnova A, Brukner J, Burget L (2024) But systems and analyses for the asvspoof 5 challenge. In: The automatic speaker verification spoofing countermeasures workshop (ASVspoof 2024). ISCA, Kos, Greece, pp 24\u201331. https:\/\/doi.org\/10.21437\/asvspoof.2024-4","DOI":"10.21437\/asvspoof.2024-4"},{"key":"346_CR46","doi-asserted-by":"publisher","unstructured":"Sahidullah M, Kinnunen T, Hanil\u00e7i C (2015) A comparison of features for synthetic speech detection. In: Proc. interspeech 2015, pp 2087\u20132091. https:\/\/doi.org\/10.21437\/Interspeech.2015-472","DOI":"10.21437\/Interspeech.2015-472"},{"key":"346_CR47","doi-asserted-by":"publisher","unstructured":"Sch\u00e4fer K, Choi J-E, Neu M (2024) Robust audio deepfake detection: exploring front-\/back-end combinations and data augmentation strategies for the asvspoof5 challenge. In: The automatic speaker verification spoofing countermeasures workshop (ASVspoof 2024). ISCA, Kos, Greece, pp 56\u201363. https:\/\/doi.org\/10.21437\/asvspoof.2024-9","DOI":"10.21437\/asvspoof.2024-9"},{"key":"346_CR48","doi-asserted-by":"publisher","unstructured":"Tak H, Jung J-w, Patino J, Kamble M, Todisco M, Evans N (2021) End-to-end spectro-temporal graph attention networks for speaker verification anti-spoofing and speech deepfake detection. In: Proc. 2021 edition of the automatic speaker verification and spoofing countermeasures challenge, pp 1\u20138. https:\/\/doi.org\/10.21437\/ASVSPOOF.2021-1","DOI":"10.21437\/ASVSPOOF.2021-1"},{"key":"346_CR49","doi-asserted-by":"publisher","unstructured":"Tran T, Bui TD, Simatis P (2024) Parallelchain lab\u2019s anti-spoofing systems for ASVspoof 5. In: The automatic speaker verification spoofing countermeasures workshop (ASVspoof 2024). ISCA, Kos, Greece, pp 9\u201315. https:\/\/doi.org\/10.21437\/asvspoof.2024-2","DOI":"10.21437\/asvspoof.2024-2"},{"key":"346_CR51","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2020.101114","volume":"64","author":"X Wang","year":"2020","unstructured":"Wang X, Yamagishi J, Todisco M, Delgado H, Nautsch A, Evans N, Sahidullah M, Vestman V, Kinnunen T, Lee KA, Juvela L, Alku P, Peng Y-H, Hwang H-T, Tsao Y, Wang H-M, Maguer SL, Becker M, Henderson F, Clark R, Zhang Y, Wang Q, Jia Y, Onuma K, Mushika K, Kaneda T, Jiang Y, Liu L-J, Wu Y-C, Huang W-C, Toda T, Tanaka K, Kameoka H, Steiner I, Matrouf D, Bonastre J-F, Govender A, Ronanki S, Zhang J-X, Ling Z-H (2020) Asvspoof 2019: a large-scale public database of synthesized, converted and replayed speech. Comput Speech Lang 64:101114. https:\/\/doi.org\/10.1016\/j.csl.2020.101114","journal-title":"Comput Speech Lang"},{"key":"346_CR52","doi-asserted-by":"publisher","unstructured":"Wang X, Delgado H, Tak H, Jung J, Shim H, Todisco M, Kukanov I, Liu X, Sahidullah M, Kinnunen T, Evans N, Lee K, Yamagishi J (2024). ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale. The Automatic Speaker Verification Spoofing Countermeasures Workshop (ASVspoof 2024), 1-8. https:\/\/doi.org\/10.21437\/ASVspoof.2024-1","DOI":"10.21437\/ASVspoof.2024-1"},{"key":"346_CR53","doi-asserted-by":"publisher","unstructured":"Wang H, Dinkel H, Wang S, Qian Y, Yu K (2020) Dual-adversarial domain adaptation for generalized replay attack detection. In: Proc. interspeech 2020, pp 1086\u20131090. https:\/\/doi.org\/10.21437\/Interspeech.2020-1255","DOI":"10.21437\/Interspeech.2020-1255"},{"key":"346_CR54","doi-asserted-by":"publisher","unstructured":"Yamagishi J, Wang X, Todisco M, Sahidullah M, Patino J, Nautsch A, Liu X, Lee KA, Kinnunen T, Evans N, Delgado H (2021) ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection. In: Proc. 2021 edition of the automatic speaker verification and spoofing countermeasures challenge, pp 47\u201354. https:\/\/doi.org\/10.21437\/ASVSPOOF.2021-8","DOI":"10.21437\/ASVSPOOF.2021-8"},{"key":"346_CR55","doi-asserted-by":"publisher","unstructured":"Zen H, Agiomyrgiannakis Y, Egberts N, Henderson F, Szczepaniak P (2016) Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices. In: Proc. interspeech 2016, pp 2273\u20132277. https:\/\/doi.org\/10.21437\/Interspeech.2016-522","DOI":"10.21437\/Interspeech.2016-522"},{"key":"346_CR56","doi-asserted-by":"publisher","unstructured":"Zen H, Senior A, Schuster M (2013) Statistical parametric speech synthesis using deep neural networks. In: 2013 IEEE international conference on acoustics, speech and signal processing, pp 7962\u20137966. https:\/\/doi.org\/10.1109\/ICASSP.2013.6639215","DOI":"10.1109\/ICASSP.2013.6639215"},{"issue":"5","key":"346_CR57","doi-asserted-by":"publisher","first-page":"6259","DOI":"10.1007\/s11042-021-11733-y","volume":"81","author":"T Zhang","year":"2022","unstructured":"Zhang T (2022) Deepfake generation and detection, a survey. Multimed Tools Appl 81(5):6259\u20136276. https:\/\/doi.org\/10.1007\/s11042-021-11733-y","journal-title":"Multimed Tools Appl"},{"key":"346_CR58","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1109\/LSP.2021.3076358","volume":"28","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Jiang F, Duan Z (2021) One-class learning towards synthetic voice spoofing detection. IEEE Signal Process Lett 28:937\u2013941. https:\/\/doi.org\/10.1109\/LSP.2021.3076358","journal-title":"IEEE Signal Process Lett"},{"key":"346_CR59","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1007\/978-3-030-95398-0_9","volume-title":"Digital forensics and watermarking","author":"Z Zhang","year":"2022","unstructured":"Zhang Z, Gu Y, Yi X, Zhao X (2022) FMFCC-a: a challenging Mandarin dataset for synthetic speech detection. In: Zhao X, Piva A, Comesa\u00f1a-Alfaro P (eds) Digital forensics and watermarking. Springer, Cham, pp 117\u2013131"},{"key":"346_CR60","doi-asserted-by":"publisher","unstructured":"Zhang Z, Gu Y, Yi X, Zhao X (2020) SynSpeechDDB: a new synthetic speech detection database. IEEE Dataport. https:\/\/doi.org\/10.21227\/ta8z-mx73","DOI":"10.21227\/ta8z-mx73"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00346-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-024-00346-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-024-00346-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T03:03:12Z","timestamp":1754017392000},"score":1,"resource":{"primary":{"URL":"https:\/\/cybersecurity.springeropen.com\/articles\/10.1186\/s42400-024-00346-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,1]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["346"],"URL":"https:\/\/doi.org\/10.1186\/s42400-024-00346-1","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,1]]},"assertion":[{"value":"17 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 December 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interest"}}],"article-number":"50"}}