{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T14:55:46Z","timestamp":1778338546140,"version":"3.51.4"},"reference-count":77,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,2,8]],"date-time":"2025-02-08T00:00:00Z","timestamp":1738972800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JCP"],"abstract":"<jats:p>Advances in deep learning have led to dramatic improvements in generative synthetic speech, eliminating robotic speech patterns to create speech that is indistinguishable from a human voice. Although these advances are extremely useful in various applications, they also facilitate powerful attacks against both humans and machines. Recently, a new type of speech attack called partial fake (PF) speech has emerged. This paper studies how well humans and machines, including speaker recognition systems and existing fake-speech detection tools, can distinguish between human voice and computer-generated speech. Our study shows that both humans and machines can be easily deceived by PF speech, and the current defences against PF speech are insufficient. These findings emphasise the urgency of increasing awareness for humans and creating new automated defences against PF speech for machines.<\/jats:p>","DOI":"10.3390\/jcp5010006","type":"journal-article","created":{"date-parts":[[2025,2,12]],"date-time":"2025-02-12T06:06:06Z","timestamp":1739340366000},"page":"6","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Partial Fake Speech Attacks in the Real World Using Deepfake Audio"],"prefix":"10.3390","volume":"5","author":[{"given":"Abdulazeez","family":"Alali","sequence":"first","affiliation":[{"name":"School of Computer Science and Informatics, Cardiff University, Cardiff CF24 4AG, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2701-7809","authenticated-orcid":false,"given":"George","family":"Theodorakopoulos","sequence":"additional","affiliation":[{"name":"School of Computer Science and Informatics, Cardiff University, Cardiff CF24 4AG, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,8]]},"reference":[{"key":"ref_1","unstructured":"Boone, D.R. (2015). Is Your Voice Telling on You?: How to Find and Use Your Natural Voice, Plural Publishing."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kumar, P., Jakhanwal, N., Bhowmick, A., and Chandra, M. (2011, January 12\u201314). Gender classification using pitch and formants. Proceedings of the 2011 International Conference on Communication, Computing & Security, Rourkela Odisha, India.","DOI":"10.1145\/1947940.1948007"},{"key":"ref_3","unstructured":"Casanova, E., Weber, J., Shulby, C.D., Junior, A.C., G\u00f6lge, E., and Ponti, M.A. (2022, January 17\u201323). Yourtts: Towards zero-shot multi-speaker tts and zero-shot voice conversion for everyone. Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA."},{"key":"ref_4","unstructured":"Wang, C., Chen, S., Wu, Y., Zhang, Z., Zhou, L., Liu, S., Chen, Z., Liu, Y., Wang, H., and Li, J. (2023). Neural codec language models are zero-shot text to speech synthesizers. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"159","DOI":"10.15625\/1813-9663\/18136","article-title":"Adapt-Tts: High-Quality Zero-Shot Multi-Speaker Text-to-Speech Adaptive-Based for Vietnamese","volume":"39","author":"Ngoc","year":"2023","journal-title":"J. Comput. Sci. Cybern."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Saeki, T., Maiti, S., Li, X., Watanabe, S., Takamichi, S., and Saruwatari, H. (2023). Learning to speak from text: Zero-shot multilingual text-to-speech with unsupervised text pretraining. arXiv.","DOI":"10.24963\/ijcai.2023\/575"},{"key":"ref_7","unstructured":"Peng, K., Ping, W., Song, Z., and Zhao, K. (2020, January 13\u201318). Non-autoregressive neural text-to-speech. Proceedings of the 37th International Conference on Machine Learning, Virtual."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kawamura, M., Shirahata, Y., Yamamoto, R., and Tachibana, K. (2023, January 4\u201310). Lightweight and high-fidelity end-to-end text-to-speech with multi-band generation and inverse short-time fourier transform. Proceedings of the ICASSP 2023\u20142023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece.","DOI":"10.1109\/ICASSP49357.2023.10095296"},{"key":"ref_9","unstructured":"Betker, J. (2023). Better speech synthesis through scaling. arXiv."},{"key":"ref_10","unstructured":"Jiang, Z., Liu, J., Ren, Y., He, J., Zhang, C., Ye, Z., Wei, P., Wang, C., Yin, X., and Ma, Z. (2023). Mega-tts 2: Zero-shot text-to-speech with arbitrary length speech prompts. arXiv."},{"key":"ref_11","unstructured":"Jing, X., Chang, Y., Yang, Z., Xie, J., Triantafyllopoulos, A., and Schuller, B.W. (2023, January 20\u201322). U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech. Proceedings of the 15th ITG Conference on Speech Communication, Aachen, Germany."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ning, Z., Xie, Q., Zhu, P., Wang, Z., Xue, L., Yao, J., Xie, L., and Bi, M. (2023, January 4\u201310). Expressive-vc: Highly expressive voice conversion with attention fusion of bottleneck and perturbation features. Proceedings of the ICASSP 2023\u20142023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece.","DOI":"10.1109\/ICASSP49357.2023.10096057"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"67835","DOI":"10.1109\/ACCESS.2023.3292003","article-title":"English emotional voice conversion using StarGAN model","volume":"11","author":"Meftah","year":"2023","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"103022","DOI":"10.1016\/j.specom.2023.103022","article-title":"Choosing only the best voice imitators: Top-K many-to-many voice conversion with StarGAN","volume":"156","author":"Colomer","year":"2024","journal-title":"Speech Commun."},{"key":"ref_15","unstructured":"(2024, July 23). ElevenLabs. Available online: https:\/\/elevenlabs.io\/."},{"key":"ref_16","unstructured":"(2024, July 23). Descript. Available online: https:\/\/www.descript.com\/."},{"key":"ref_17","unstructured":"(2024, July 26). HSBC Voice ID. Available online: https:\/\/ciiom.hsbc.com\/ways-to-bank\/phone-banking\/voice-id\/."},{"key":"ref_18","unstructured":"(2024, July 26). WeChat VoicePrint. Available online: https:\/\/blog.wechat.com\/2015\/05\/21\/voiceprint-the-new-wechat-password\/."},{"key":"ref_19","unstructured":"(2024, August 04). Alipay Sound Wave Payment. Available online: https:\/\/opendocs.alipay.com\/open\/140\/104601."},{"key":"ref_20","unstructured":"(2024, August 07). What Is Alexa Voice ID. Available online: https:\/\/www.amazon.co.uk\/gp\/help\/customer\/display.html?nodeId=GYCXKY2AB2QWZT2X."},{"key":"ref_21","unstructured":"Klepper, D., and Swenson, A. (2024, August 16). AI-Generated Disinformation Poses Threat of Misleading Voters in 2024 Election. PBS NewsHour. Available online: https:\/\/www.pbs.org\/newshour\/politics\/ai-generated-disinformation-poses-threat-of-misleading-voters-in-2024-election\/."},{"key":"ref_22","unstructured":"Rudy, E. (2024, August 27). Don\u2019t Watch Sinister Taylor Swift Video or Risk Bank-Emptying Attack That Just Takes Seconds. The Sun. Available online: https:\/\/www.thesun.co.uk\/tech\/25342162\/taylor-swift-fans-ai-attack-dangerous-video\/."},{"key":"ref_23","unstructured":"Stupp, C. (2024, August 27). Fraudsters Used AI to Mimic CEO\u2019s Voice in Unusual Cybercrime Case. WSJ. Available online: https:\/\/www.wsj.com\/articles\/fraudsters-use-ai-to-mimic-ceos-voice-in-unusual-cybercrime-case-11567157402\/."},{"key":"ref_24","unstructured":"Brewster, T. (2024, September 02). Fraudsters Cloned Company Director\u2019s Voice in $35 Million Heist, Police Find. Forbes. Available online: https:\/\/www.forbes.com\/sites\/thomasbrewster\/2021\/10\/14\/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions\/."},{"key":"ref_25","unstructured":"Karimi, F. (2024, July 20). Mom, These Bad Men Have Me\u2019: She Believes Scammers Cloned Her Daughter\u2019s Voice in a Fake Kidnapping. CNN. Available online: https:\/\/edition.cnn.com\/2023\/04\/29\/us\/ai-scam-calls-kidnapping-cec\/index.html\/."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wenger, E., Bronckers, M., Cianfarani, C., Cryan, J., Sha, A., Zheng, H., and Zhao, B.Y. (2021, January 15\u201319). \u201cHello, It\u2019s Me\u201d: Deep Learning-based Speech Synthesis Attacks in the Real World. Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event.","DOI":"10.1145\/3460120.3484742"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"103617","DOI":"10.1016\/j.cose.2023.103617","article-title":"Hello me, meet the real me: Voice synthesis attacks on voice assistants","volume":"137","author":"Bilika","year":"2024","journal-title":"Comput. Secur."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Gao, Y., Lian, J., Raj, B., and Singh, R. (2021, January 19\u201322). Detection and evaluation of human and machine generated speech in spoofing attacks on automatic speaker verification systems. Proceedings of the 2021 IEEE Spoken Language Technology Workshop (SLT), Shenzhen, China.","DOI":"10.1109\/SLT48900.2021.9383558"},{"key":"ref_29","unstructured":"Simmons, D. (2024, September 08). BBC Fools HSBC Voice Recognition Security System. Available online: https:\/\/www.bbc.co.uk\/news\/technology-39965545\/."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"101869","DOI":"10.1016\/j.inffus.2023.101869","article-title":"A review of deep learning techniques for speech processing","volume":"99","author":"Mehrish","year":"2023","journal-title":"Inf. Fusion"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"32725","DOI":"10.1007\/s11042-021-11235-x","article-title":"A survey on presentation attack detection for automatic speaker verification systems: State-of-the-art, taxonomy, issues and future direction","volume":"80","author":"Tan","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Tayebi Arasteh, S., Weise, T., Schuster, M., Noeth, E., Maier, A., and Yang, S.H. (2023). The effect of speech pathology on automatic speaker verification: A large-scale study. Sci. Rep., 13.","DOI":"10.1038\/s41598-023-47711-7"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Mandasari, M.I., McLaren, M., and van Leeuwen, D.A. (2012, January 25\u201330). The effect of noise on modern automatic speaker recognition systems. Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan.","DOI":"10.1109\/ICASSP.2012.6288857"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"107661","DOI":"10.1016\/j.engappai.2023.107661","article-title":"Speech and speaker recognition using raw waveform modeling for adult and children\u2019s speech: A comprehensive review","volume":"131","author":"Radha","year":"2024","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhang, W., Yeh, C.C., Beckman, W., Raitio, T., Rasipuram, R., Golipour, L., and Winarsky, D. (2023, January 26\u201328). Audiobook synthesis with long-form neural text-to-speech. Proceedings of the 12th Speech Synthesis Workshop (SSW) 2023, Grenoble, France.","DOI":"10.21437\/SSW.2023-22"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Kim, M., Jeong, M., Choi, B.J., Kim, S., Lee, J.Y., and Kim, N.S. (2024). Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction. arXiv.","DOI":"10.1109\/ASRU57964.2023.10389791"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Vecino, B.T., Gabrys, A., Matwicki, D., Pomirski, A., Iddon, T., Cotescu, M., and Lorenzo-Trueba, J. (2023, January 26\u201328). Lightweight End-to-end Text-to-speech Synthesis for low resource on-device applications. Proceedings of the 12th ISCA Speech Synthesis Workshop (SSW2023), Grenoble, France.","DOI":"10.21437\/SSW.2023-35"},{"key":"ref_38","unstructured":"Donahue, J., Dieleman, S., Bi\u0144kowski, M., Elsen, E., and Simonyan, K. (2020). End-to-end adversarial text-to-speech. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1278","DOI":"10.1109\/TASL.2010.2089679","article-title":"A Hybrid Text-to-Speech System That Combines Concatenative and Statistical Synthesis Units","volume":"19","author":"Tiomkin","year":"2011","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1290","DOI":"10.1109\/TASLP.2021.3066047","article-title":"Transfer learning from speech synthesis to voice conversion with non-parallel training data","volume":"29","author":"Zhang","year":"2021","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1109\/LSP.2023.3277786","article-title":"SC-CNN: Effective speaker conditioning method for zero-shot multi-speaker text-to-speech systems","volume":"30","author":"Yoon","year":"2023","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Lian, J., Zhang, C., and Yu, D. (2022, January 23\u201327). Robust disentangled variational speech representation learning for zero-shot voice conversion. Proceedings of the ICASSP 2022\u20142022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9747272"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.1109\/LSP.2023.3308474","article-title":"Lm-vc: Zero-shot voice conversion via speech generation based on language models","volume":"30","author":"Wang","year":"2023","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Wu, Y., Tan, X., Li, B., He, L., Zhao, S., Song, R., Qin, T., and Liu, T.Y. (2022). Adaspeech 4: Adaptive text to speech in zero-shot scenarios. arXiv.","DOI":"10.21437\/Interspeech.2022-901"},{"key":"ref_45","unstructured":"(2024, August 19). Google Cloud Text-to-Speech API Now Supports Custom Voices. Available online: https:\/\/cloud.google.com\/blog\/products\/ai-machine-learning\/create-custom-voices-with-google-cloud-text-to-speech\/."},{"key":"ref_46","unstructured":"(2024, August 19). Real-Time State-of-the-art Speech Synthesis for Tensorflow 2. Available online: https:\/\/github.com\/TensorSpeech\/TensorFlowTTS\/."},{"key":"ref_47","unstructured":"(2024, September 10). An Open Source Text-to-Speech System Built by Inverting Whisper. Available online: https:\/\/github.com\/collabora\/WhisperSpeech\/."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Van Niekerk, B., Carbonneau, M.A., Za\u00efdi, J., Baas, M., Seut\u00e9, H., and Kamper, H. (2022, January 23\u201327). A comparison of discrete and soft speech units for improved voice conversion. Proceedings of the ICASSP 2022\u20142022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9746484"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Li, J., Tu, W., and Xiao, L. (2023, January 4\u201310). Freevc: Towards high-quality text-free one-shot voice conversion. Proceedings of the ICASSP 2023\u20142023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece.","DOI":"10.1109\/ICASSP49357.2023.10095191"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"1717","DOI":"10.1109\/TASLP.2021.3076867","article-title":"Any-to-many voice conversion with location-relative sequence-to-sequence modeling","volume":"29","author":"Liu","year":"2021","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_51","unstructured":"Yi, J., Tao, J., Fu, R., Yan, X., Wang, C., Wang, T., Zhang, C.Y., Zhang, X., Zhao, Y., and Ren, Y. (2023). Add 2023: The second audio deepfake detection challenge. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Yi, J., Fu, R., Tao, J., Nie, S., Ma, H., Wang, C., Wang, T., Tian, Z., Bai, Y., and Fan, C. (2022, January 23\u201327). Add 2022: The first audio deep synthesis detection challenge. Proceedings of the ICASSP 2022\u20132022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9746939"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Mart\u00edn-Do\u00f1as, J.M., and \u00c1lvarez, A. (2022, January 23\u201327). The vicomtech audio deepfake detection system based on wav2vec2 for the 2022 add challenge. Proceedings of the ICASSP 2022\u20142022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore.","DOI":"10.1109\/ICASSP43922.2022.9747768"},{"key":"ref_54","unstructured":"Li, J., Li, L., Luo, M., Wang, X., Qiao, S., and Zhou, Y. (2023, January 19). Multi-grained Backend Fusion for Manipulation Region Location of Partially Fake Audio. Proceedings of the DADA@ IJCAI, Macau, China."},{"key":"ref_55","unstructured":"Cai, Z., Wang, W., Wang, Y., and Li, M. (2023). The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023. arXiv."},{"key":"ref_56","unstructured":"Liu, J., Su, Z., Huang, H., Wan, C., Wang, Q., Hong, J., Tang, B., and Zhu, F. (2023). TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection. arXiv."},{"key":"ref_57","unstructured":"Ryan, P. (2024, August 14). Deepfake\u2019 Audio Evidence Used in UK Court to Discredit Dubai Dad. The National. Available online: https:\/\/www.thenationalnews.com\/uae\/courts\/deepfake-audio-evidence-used-in-uk-court-to-discredit-dubai-dad-1.975764\/."},{"key":"ref_58","unstructured":"Zhang, Z., Zhou, L., Wang, C., Chen, S., Wu, Y., Liu, S., Chen, Z., Liu, Y., Wang, H., and Li, J. (2023). Speak foreign languages with your own voice: Cross-lingual neural codec language modeling. arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Casanova, E., Shulby, C., G\u00f6lge, E., M\u00fcller, N.M., De Oliveira, F.S., Junior, A.C., Soares, A.d.S., Aluisio, S.M., and Ponti, M.A. (2021). SC-GlowTTS: An efficient zero-shot multi-speaker text-to-speech model. arXiv.","DOI":"10.21437\/Interspeech.2021-1774"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Li, Y.A., Han, C., and Mesgarani, N. (2023, January 9\u201312). Styletts-vc: One-shot voice conversion by knowledge transfer from style-based tts models. Proceedings of the 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar.","DOI":"10.1109\/SLT54892.2023.10022498"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Kang, W., Hasegawa-Johnson, M., and Roy, D. (2022). End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions. arXiv.","DOI":"10.21437\/Interspeech.2023-2298"},{"key":"ref_62","unstructured":"Popov, V., Vovk, I., Gogoryan, V., Sadekova, T., Kudinov, M., and Wei, J. (2021). Diffusion-based voice conversion with fast maximum likelihood sampling scheme. arXiv."},{"key":"ref_63","unstructured":"(2024, July 14). Resemblyzer. Available online: https:\/\/github.com\/resemble-ai\/Resemblyzer\/."},{"key":"ref_64","unstructured":"Jia, Y., Zhang, Y., Weiss, R., Wang, Q., Shen, J., Ren, F., Nguyen, P., Pang, R., Lopez Moreno, I., and Wu, Y. (2018). Transfer learning from speaker verification to multispeaker text-to-speech synthesis. Adv. Neural Inf. Process. Syst., 31, Available online: https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2018\/file\/6832a7b24bc06775d02b7406880b93fc-Paper.pdf."},{"key":"ref_65","unstructured":"(2024, July 18). Amazon Connect Voice ID. Available online: https:\/\/aws.amazon.com\/connect\/voice-id\/."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"AlAli, A., and Theodorakopoulos, G. (2023, January 9\u201310). An RFP dataset for Real, Fake, and Partially fake audio detection. Proceedings of the International Conference on Cyber Security, Privacy in Communication Networks, Cardiff, UK.","DOI":"10.1007\/978-981-97-3973-8_1"},{"key":"ref_67","unstructured":"Demirsahin, I., Kjartansson, O., Gutkin, A., and Rivera, C. (2020, January 1). Open-source Multi-speaker Corpora of the English Accents in the British Isles. Proceedings of the 12th Language Resources and Evaluation Conference (LREC), Marseille, France."},{"key":"ref_68","unstructured":"Yamagishi, J., Veaux, C., and MacDonald, K. (2019). CSTR VCTK Corpus: English Multi-Speaker Corpus for CSTR Voice Cloning Toolkit (Version 0.92), The Centre for Speech Technology Research (CSTR), University of Edinburgh."},{"key":"ref_69","unstructured":"Abu-El-Haija, S., Kothari, N., Lee, J., Natsev, P., Toderici, G., Varadarajan, B., and Vijayanarasimhan, S. (2016). Youtube-8m: A large-scale video classification benchmark. arXiv."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"103364","DOI":"10.1016\/j.isci.2021.103364","article-title":"Fooled twice: People cannot detect deepfakes but think they can","volume":"24","author":"Soraperra","year":"2021","journal-title":"Iscience"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Kaate, I., Salminen, J., Jung, S.G., Almerekhi, H., and Jansen, B.J. (2023, January 20\u201322). How do users perceive deepfake personas? Investigating the deepfake user perception and its implications for human-computer interaction. Proceedings of the 15th Biannual Conference of the Italian SIGCHI Chapter, Torino, Italy.","DOI":"10.1145\/3605390.3605397"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Ahmed, M.F.B., Miah, M.S.U., Bhowmik, A., and Sulaiman, J.B. (2021, January 4\u20135). Awareness to Deepfake: A resistance mechanism to Deepfake. Proceedings of the 2021 International Congress of Advanced Technology and Engineering (ICOTEN), Taiz, Yemen.","DOI":"10.1109\/ICOTEN52080.2021.9493549"},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1109\/TASLP.2022.3233236","article-title":"The partialspoof database and countermeasures for the detection of short fake speech segments embedded in an utterance","volume":"31","author":"Zhang","year":"2022","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Tak, H., Todisco, M., Wang, X., Jung, J.W., Yamagishi, J., and Evans, N. (2022). Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation. arXiv.","DOI":"10.21437\/Odyssey.2022-16"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Liu, R., Zhang, J., Gao, G., and Li, H. (2023). Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion. arXiv.","DOI":"10.21437\/Interspeech.2023-2335"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Yu, Z., Zhai, S., and Zhang, N. (2023, January 26\u201330). Antifake: Using adversarial audio to prevent unauthorized speech synthesis. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Copenhagen, Denmark.","DOI":"10.1145\/3576915.3623209"},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"289","DOI":"10.3390\/forensicsci4030021","article-title":"Video and audio deepfake datasets and open issues in deepfake technology: Being ahead of the curve","volume":"4","author":"Akhtar","year":"2024","journal-title":"Forensic Sci."}],"container-title":["Journal of Cybersecurity and Privacy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2624-800X\/5\/1\/6\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:29:44Z","timestamp":1760027384000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2624-800X\/5\/1\/6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,8]]},"references-count":77,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["jcp5010006"],"URL":"https:\/\/doi.org\/10.3390\/jcp5010006","relation":{},"ISSN":["2624-800X"],"issn-type":[{"value":"2624-800X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,8]]}}}