{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,14]],"date-time":"2026-07-14T03:46:13Z","timestamp":1784000773507,"version":"3.55.0"},"publisher-location":"Cham","reference-count":47,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031783401","type":"print"},{"value":"9783031783418","type":"electronic"}],"license":[{"start":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T00:00:00Z","timestamp":1733097600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T00:00:00Z","timestamp":1733097600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025]]},"DOI":"10.1007\/978-3-031-78341-8_12","type":"book-chapter","created":{"date-parts":[[2024,12,1]],"date-time":"2024-12-01T15:15:05Z","timestamp":1733066105000},"page":"180-193","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["PolyGlotFake: A Novel Multilingual and\u00a0Multimodal DeepFake Dataset"],"prefix":"10.1007","author":[{"given":"Yang","family":"Hou","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Haitao","family":"Fu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chunkai","family":"Chen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zida","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Haoyu","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jianjun","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,12,2]]},"reference":[{"key":"12_CR1","unstructured":"Deepswap - ai-powered DeepFake technology (2023). https:\/\/www.deepswap.net\/. Accessed 24 Dec 2023"},{"key":"12_CR2","doi-asserted-by":"crossref","unstructured":"Afchar, D., Nozick, V., Yamagishi, J., Echizen, I.: MesoNet: a compact facial video forgery detection network. In: 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pp.\u00a01\u20137. IEEE (2018)","DOI":"10.1109\/WIFS.2018.8630761"},{"key":"12_CR3","unstructured":"AI, C.: Github repository for coqui AI text-to-speech (2023). https:\/\/github.com\/coqui-ai\/tts. Accessed 29 Dec 2023"},{"key":"12_CR4","unstructured":"AI, R.: Rask AI official website (2023). https:\/\/zh.rask.ai\/. Accessed 29 Dec 2023"},{"key":"12_CR5","unstructured":"AI, S.: Github repository for suno ai\u2019s bark project (2023). https:\/\/github.com\/suno-ai\/bark. Accessed 29 Dec 2023"},{"key":"12_CR6","doi-asserted-by":"crossref","unstructured":"Cao, J., Ma, C., Yao, T., Chen, S., Ding, S., Yang, X.: End-to-end reconstruction-classification learning for face forgery detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 4113\u20134122 (2022)","DOI":"10.1109\/CVPR52688.2022.00408"},{"key":"12_CR7","doi-asserted-by":"crossref","unstructured":"Cheng, K., et al.: Videoretalking: audio-based lip synchronization for talking head video editing in the wild. In: SIGGRAPH Asia 2022 Conference Papers, pp.\u00a01\u20139 (2022)","DOI":"10.1145\/3550469.3555399"},{"key":"12_CR8","doi-asserted-by":"crossref","unstructured":"Chung, J.S., Nagrani, A., Zisserman, A.: Voxceleb2: deep speaker recognition. arXiv preprint arXiv:1806.05622 (2018)","DOI":"10.21437\/Interspeech.2018-1929"},{"key":"12_CR9","series-title":"LNCS","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1007\/978-3-031-06433-3_19","volume-title":"ICIAP 2022","author":"DA Coccomini","year":"2022","unstructured":"Coccomini, D.A., Messina, N., Gennaro, C., Falchi, F.: Combining efficientNet and vision transformers for video DeepFake detection. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds.) ICIAP 2022. LNCS, vol. 13233, pp. 219\u2013229. Springer, Cham (2022). https:\/\/doi.org\/10.1007\/978-3-031-06433-3_19"},{"key":"12_CR10","doi-asserted-by":"crossref","unstructured":"Dang, H., Liu, F., Stehouwer, J., Liu, X., Jain, A.K.: On the detection of digital face manipulation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 5781\u20135790 (2020)","DOI":"10.1109\/CVPR42600.2020.00582"},{"key":"12_CR11","unstructured":"Dolhansky, B., et al.: The DeepFake detection challenge (DFDC) dataset. arXiv preprint arXiv:2006.07397 (2020)"},{"key":"12_CR12","doi-asserted-by":"crossref","unstructured":"Dong, S., Wang, J., Ji, R., Liang, J., Fan, H., Ge, Z.: Implicit identity leakage: the stumbling block to improving DeepFake detection generalization. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 3994\u20134004 (2023)","DOI":"10.1109\/CVPR52729.2023.00389"},{"key":"12_CR13","unstructured":"Heygen: Heygen official website (2023). https:\/\/www.heygen.com\/. Accessed 29 Dec 2023"},{"key":"12_CR14","doi-asserted-by":"crossref","unstructured":"Hou, Y., Guo, Q., Huang, Y., Xie, X., Ma, L., Zhao, J.: Evading DeepFake detectors via adversarial statistical consistency. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 12271\u201312280 (2023)","DOI":"10.1109\/CVPR52729.2023.01181"},{"key":"12_CR15","unstructured":"Jemine, C.: Real-time-voice-cloning. University of Li\u00e9ge, Li\u00e9ge, Belgium p.\u00a03 (2019)"},{"key":"12_CR16","doi-asserted-by":"crossref","unstructured":"Jia, S., Ma, C., Yao, T., Yin, B., Ding, S., Yang, X.: Exploring frequency adversarial attacks for face forgery detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 4103\u20134112 (2022)","DOI":"10.1109\/CVPR52688.2022.00407"},{"key":"12_CR17","doi-asserted-by":"crossref","unstructured":"Jiang, L., Li, R., Wu, W., Qian, C., Loy, C.C.: Deeperforensics-1.0: a large-scale dataset for real-world face forgery detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 2889\u20132898 (2020)","DOI":"10.1109\/CVPR42600.2020.00296"},{"issue":"7","key":"12_CR18","doi-asserted-by":"publisher","first-page":"1678","DOI":"10.1007\/s11263-022-01606-8","volume":"130","author":"F Juefei-Xu","year":"2022","unstructured":"Juefei-Xu, F., Wang, R., Huang, Y., Guo, Q., Ma, L., Liu, Y.: Countering malicious DeepFake: survey, battleground, and horizon. Int. J. Comput. Vision 130(7), 1678\u20131734 (2022)","journal-title":"Int. J. Comput. Vision"},{"key":"12_CR19","doi-asserted-by":"crossref","unstructured":"Khalid, H., Kim, M., Tariq, S., Woo, S.S.: Evaluation of an audio-video multimodal deepfake dataset using unimodal and multimodal detectors. In: Proceedings of the 1st Workshop on Synthetic Multimedia-audiovisual DeepFake Generation and Detection, pp. 7\u201315 (2021)","DOI":"10.1145\/3476099.3484315"},{"key":"12_CR20","unstructured":"Khalid, H., Tariq, S., Kim, M., Woo, S.S.: Fakeavceleb: a novel audio-video multimodal deepfake dataset. arXiv preprint arXiv:2108.05080 (2021)"},{"key":"12_CR21","unstructured":"Korshunov, P., Marcel, S.: DeepFakes: a new threat to face recognition? assessment and detection. arXiv preprint arXiv:1812.08685 (2018)"},{"key":"12_CR22","doi-asserted-by":"crossref","unstructured":"Kwon, P., You, J., Nam, G., Park, S., Chae, G.: Kodf: a large-scale Korean DeepFake detection dataset. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision, pp. 10744\u201310753 (2021)","DOI":"10.1109\/ICCV48922.2021.01057"},{"key":"12_CR23","doi-asserted-by":"crossref","unstructured":"Li, J., Tu, W., Xiao, L.: Freevc: towards high-quality text-free one-shot voice conversion. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.\u00a01\u20135. IEEE (2023)","DOI":"10.1109\/ICASSP49357.2023.10095191"},{"key":"12_CR24","unstructured":"Li, Y., Lyu, S.: Exposing DeepFake videos by detecting face warping artifacts. arXiv preprint arXiv:1811.00656 (2018)"},{"key":"12_CR25","doi-asserted-by":"crossref","unstructured":"Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S.: Celeb-DF: a large-scale challenging dataset for DeepFake forensics. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 3207\u20133216 (2020)","DOI":"10.1109\/CVPR42600.2020.00327"},{"key":"12_CR26","doi-asserted-by":"publisher","first-page":"102103","DOI":"10.1016\/j.inffus.2023.102103","volume":"103","author":"H Liz-L\u00f3pez","year":"2024","unstructured":"Liz-L\u00f3pez, H., Keita, M., Taleb-Ahmed, A., Hadid, A., Huertas-Tato, J., Camacho, D.: Generation and detection of manipulated multimodal audiovisual content: advances, trends and open challenges. Inf. Fusion 103, 102103 (2024)","journal-title":"Inf. Fusion"},{"key":"12_CR27","unstructured":"Lu, S.: faceswap-GAN: A GAN-based faceswap project on github (2023). https:\/\/github.com\/shaoanlu\/faceswap-GAN. Accessed 24 Dec 2023"},{"key":"12_CR28","doi-asserted-by":"crossref","unstructured":"Luo, Y., Zhang, Y., Yan, J., Liu, W.: Generalizing face forgery detection with high-frequency features. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 16317\u201316326 (2021)","DOI":"10.1109\/CVPR46437.2021.01605"},{"key":"12_CR29","unstructured":"Microsoft: Microsoft azure text-to-speech services (2023). https:\/\/azure.microsoft.com\/en-us\/products\/ai-services\/text-to-speech. Accessed 29 Dec 2023"},{"key":"12_CR30","doi-asserted-by":"crossref","unstructured":"Mittag, G., Naderi, B., Chehadi, A., M\u00f6ller, S.: Nisqa: a deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets. arXiv preprint arXiv:2104.09494 (2021)","DOI":"10.21437\/Interspeech.2021-299"},{"issue":"12","key":"12_CR31","doi-asserted-by":"publisher","first-page":"4695","DOI":"10.1109\/TIP.2012.2214050","volume":"21","author":"A Mittal","year":"2012","unstructured":"Mittal, A., Moorthy, A.K., Bovik, A.C.: No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 21(12), 4695\u20134708 (2012)","journal-title":"IEEE Trans. Image Process."},{"key":"12_CR32","doi-asserted-by":"crossref","unstructured":"Narayan, K., Agarwal, H., Thakral, K., Mittal, S., Vatsa, M., Singh, R.: DF-platter: Multi-face heterogeneous DeepFake dataset. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 9739\u20139748 (2023)","DOI":"10.1109\/CVPR52729.2023.00939"},{"key":"12_CR33","doi-asserted-by":"crossref","unstructured":"Neekhara, P., Dolhansky, B., Bitton, J., Ferrer, C.C.: Adversarial threats to DeepFake detection: a practical perspective. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 923\u2013932 (2021)","DOI":"10.1109\/CVPRW53098.2021.00103"},{"key":"12_CR34","doi-asserted-by":"crossref","unstructured":"Nguyen, H.H., Yamagishi, J., Echizen, I.: Use of a capsule network to detect fake images and videos. arXiv preprint arXiv:1910.12467 (2019)","DOI":"10.1109\/ICASSP.2019.8682602"},{"key":"12_CR35","doi-asserted-by":"crossref","unstructured":"Ni, Y., Meng, D., Yu, C., Quan, C., Ren, D., Zhao, Y.: Core: consistent representation learning for face forgery detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 12\u201321 (2022)","DOI":"10.1109\/CVPRW56347.2022.00011"},{"key":"12_CR36","unstructured":"OpenAI: Github repository for openai whisper project (2023). https:\/\/github.com\/openai\/whisper. Accessed 29 Dec 2023"},{"key":"12_CR37","doi-asserted-by":"crossref","unstructured":"Polyak, A., Wolf, L., Taigman, Y.: Tts skins: speaker conversion via asr. arXiv preprint arXiv:1904.08983 (2019)","DOI":"10.21437\/Interspeech.2020-1416"},{"key":"12_CR38","doi-asserted-by":"crossref","unstructured":"Prajwal, K., Mukhopadhyay, R., Namboodiri, V.P., Jawahar, C.: A lip sync expert is all you need for speech to lip generation in the wild. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 484\u2013492 (2020)","DOI":"10.1145\/3394171.3413532"},{"key":"12_CR39","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1007\/978-3-030-58610-2_6","volume-title":"Computer Vision \u2013 ECCV 2020","author":"Y Qian","year":"2020","unstructured":"Qian, Y., Yin, G., Sheng, L., Chen, Z., Shao, J.: Thinking in frequency: face forgery detection by mining frequency-aware clues. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12357, pp. 86\u2013103. Springer, Cham (2020). https:\/\/doi.org\/10.1007\/978-3-030-58610-2_6"},{"key":"12_CR40","unstructured":"R\u00f6ssler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., Nie\u00dfner, M.: Faceforensics: a large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179 (2018)"},{"key":"12_CR41","unstructured":"Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105\u20136114. PMLR (2019)"},{"key":"12_CR42","unstructured":"Wang, C., et\u00a0al.: Neural codec language models are zero-shot text to speech synthesizers. arXiv preprint arXiv:2301.02111 (2023)"},{"key":"12_CR43","doi-asserted-by":"crossref","unstructured":"Wang, J., et al.: M2tr: multi-modal multi-scale transformers for DeepFake detection. In: Proceedings of the 2022 International Conference on Multimedia Retrieval, pp. 615\u2013623 (2022)","DOI":"10.1145\/3512527.3531415"},{"key":"12_CR44","doi-asserted-by":"crossref","unstructured":"Wang, Y., et\u00a0al.: Tacotron: towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135 (2017)","DOI":"10.21437\/Interspeech.2017-1452"},{"key":"12_CR45","doi-asserted-by":"crossref","unstructured":"Xu, Y., Liang, J., Jia, G., Yang, Z., Zhang, Y., He, R.: Tall: thumbnail layout for DeepFake video detection. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision, pp. 22658\u201322668 (2023)","DOI":"10.1109\/ICCV51070.2023.02071"},{"key":"12_CR46","doi-asserted-by":"crossref","unstructured":"Yang, X., Li, Y., Lyu, S.: Exposing deep fakes using inconsistent head poses. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8261\u20138265. IEEE (2019)","DOI":"10.1109\/ICASSP.2019.8683164"},{"key":"12_CR47","doi-asserted-by":"crossref","unstructured":"Zhou, T., Wang, W., Liang, Z., Shen, J.: Face forensics in the wild. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 5778\u20135788 (2021)","DOI":"10.1109\/CVPR46437.2021.00572"}],"container-title":["Lecture Notes in Computer Science","Pattern Recognition"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-78341-8_12","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,1]],"date-time":"2024-12-01T16:05:49Z","timestamp":1733069149000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-78341-8_12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,2]]},"ISBN":["9783031783401","9783031783418"],"references-count":47,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-78341-8_12","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,2]]},"assertion":[{"value":"2 December 2024","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"Access to the dataset\u00a0is restricted to academic institutions and is intended solely\u00a0for research use. It complies with YouTube\u2019s fair use policy through\u00a0its transformative, non-commercial use, by including only brief excerpts (approximately 20\u00a0s) from each YouTube video, and ensuring that these excerpts do not adversely affect the copyright owners\u2019 ability to earn revenue from their original content. Should\u00a0any copyright owner feel their rights have been infringed, we\u00a0are committed to promptly removing the contested material from\u00a0our dataset.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics Statement"}},{"value":"ICPR","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Pattern Recognition","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Kolkata","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"India","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2024","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"1 December 2024","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"5 December 2024","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"27","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"icpr2024","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/icpr2024.org\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}