{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T01:12:21Z","timestamp":1775697141635,"version":"3.50.1"},"reference-count":109,"publisher":"Association for Computing Machinery (ACM)","issue":"8","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2025,8,31]]},"abstract":"<jats:p>As Deepfake content proliferates online, advancing face manipulation forensics has become crucial. To combat this emerging threat, previous methods mainly focus on studying how to distinguish authentic and manipulated face images. Although impressive, image-level classification lacks explainability and is limited to specific application scenarios, spurring recent research on pixel-level prediction for face manipulation forensics. However, existing forgery localization methods suffer from exploring frequency-based forgery traces in the localization network. In this paper, we observe that multi-frequency spectrum information is effective for identifying tampered regions. To this end, a novel Multi-spectral Class Center Network (MSCCNet) is proposed for face manipulation localization. Specifically, we design a Multi-spectral Class Center (MSCC) module to learn more generalizable and multi-frequency features. Based on the features of different frequency bands, the MSCC module collects multi-spectral class centers and computes pixel-to-class relations. Applying multi-spectral class-level representations suppresses the semantic information of the visual concepts which is insensitive to manipulated regions of forgery images. Furthermore, we propose a Multi-level Features Aggregation (MFA) module to employ more low-level forgery artifacts and structural textures. Meanwhile, we conduct a comprehensive localization benchmark based on pixel-level FF++ and Dolos datasets. 
Experimental results quantitatively and qualitatively demonstrate the effectiveness and superiority of the proposed MSCCNet. We expect this work to inspire more studies on pixel-level face manipulation localization. The codes are available.<\/jats:p>","DOI":"10.1145\/3747296","type":"journal-article","created":{"date-parts":[[2025,7,11]],"date-time":"2025-07-11T13:13:10Z","timestamp":1752239590000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Multi-spectral Class Center Network for Face Manipulation Localization"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7634-9992","authenticated-orcid":false,"given":"Changtao","family":"Miao","sequence":"first","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3028-0755","authenticated-orcid":false,"given":"Qi","family":"Chu","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9095-4462","authenticated-orcid":false,"given":"Zhentao","family":"Tan","sequence":"additional","affiliation":[{"name":"Alibaba Cloud, Alibaba Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0283-0238","authenticated-orcid":false,"given":"Zhenchao","family":"Jin","sequence":"additional","affiliation":[{"name":"Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0026-6813","authenticated-orcid":false,"given":"Tao","family":"Gong","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9305-1189","authenticated-orcid":false,"given":"Wanyi","family":"Zhuang","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1024-5063","authenticated-orcid":false,"given":"Yue","family":"Wu","sequence":"additional","affiliation":[{"name":"Alibaba Cloud, Alibaba Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3977-8800","authenticated-orcid":false,"given":"Bin","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9182-4069","authenticated-orcid":false,"given":"Honggang","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4417-9316","authenticated-orcid":false,"given":"Nenghai","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China"}]}],"member":"320","published-online":{"date-parts":[[2025,8,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/T-C.1974.223784"},{"key":"e_1_3_1_3_2","first-page":"4980","volume-title":"Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV)","author":"Bappy Jawadul H.","year":"2017","unstructured":"Jawadul H. Bappy, Amit K. Roy-Chowdhury, Jason Bunk, Lakshmanan Nataraj, and B. S. Manjunath. 2017. Exploiting spatial structure for localizing manipulated image regions. 
In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), 4980\u20134989."},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/2909827.2930786"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2018.2825953"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3612928"},{"issue":"3","key":"e_1_3_1_7_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3626101","article-title":"Hierarchical learning and dummy triplet loss for efficient deepfake detection","volume":"20","author":"Beuve Nicolas","year":"2023","unstructured":"Nicolas Beuve, Wassim Hamidouche, and Olivier D\u00e9forges. 2023. Hierarchical learning and dummy triplet loss for efficient deepfake detection. ACM Transactions on Multimedia Computing, Communications and Applications 20, 3 (2023), 1\u201318.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01815"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i2.16193"},{"key":"e_1_3_1_10_2","first-page":"14185","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Chen Xinru","year":"2021","unstructured":"Xinru Chen, Chengbo Dong, Jiaqi Ji, Juan Cao, and Xirong Li. 2021. Image manipulation detection by multi-view multi-scale supervision. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, 14185\u201314193."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3625231"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00582"},{"key":"e_1_3_1_13_2","unstructured":"DeepFakes. 2019. Retrieved from https:\/\/github.com\/deepfakes\/"},{"key":"e_1_3_1_14_2","unstructured":"J. Dong W. Wang and T. Tan. 2010. Casia image tampering detection evaluation database. 
Retrieved from http:\/\/forensics.idealtest.org"},{"key":"e_1_3_1_15_2","first-page":"422","volume-title":"Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)","author":"Dong J.","year":"2013","unstructured":"J. Dong, W. Wang, and T. Tan. 2013. CASIA image tampering detection evaluation database. In Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP). IEEE, 422\u2013426."},{"key":"e_1_3_1_16_2","unstructured":"Ricard Durall Margret Keuper Franz-Josef Pfreundt and Janis Keuper. 2019. Unmasking deepfakes with simple features. arXiv:1911.00686. Retrieved from https:\/\/arxiv.org\/abs\/1911.00686"},{"key":"e_1_3_1_17_2","unstructured":"FaceSwap. 2019. Retrieved from https:\/\/github.com\/MarekKowalski\/FaceSwap"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2012.2190402"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3489143"},{"key":"e_1_3_1_20_2","unstructured":"Qiqi Gu Shen Chen Taiping Yao Yang Chen Shouhong Ding and Ran Yi. 2021. Exploiting Fine-grained face forgery clues via progressive enhancement learning. arXiv:2112.13977. Retrieved from https:\/\/arxiv.org\/abs\/2112.13977"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACVW.2019.00018"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3652027"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00308"},{"key":"e_1_3_1_24_2","article-title":"Inductive representation learning on large graphs","volume":"30","author":"Hamilton Will","year":"2017","unstructured":"Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, Vol. 
30.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3591106.3592282"},{"key":"e_1_3_1_26_2","first-page":"15035","volume-title":"Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Hao Jing","year":"2021","unstructured":"Jing Hao, Zhixin Zhang, Shicai Yang, Di Xie, and Shiliang Pu. 2021. TransForensics: Image forgery localization with dense Self-Attention. In Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), 15035\u201315044."},{"issue":"11","key":"e_1_3_1_27_2","first-page":"1","article-title":"Multimodal neurosymbolic approach for explainable deepfake detection","volume":"20","author":"Haq Ijaz Ul","year":"2023","unstructured":"Ijaz Ul Haq, Khalid Mahmood Malik, and Khan Muhammad. 2023. Multimodal neurosymbolic approach for explainable deepfake detection. ACM Transactions on Multimedia Computing, Communications and Applications 20, 11\u00a0(2023), 1\u201316.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_28_2","first-page":"770","volume-title":"Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"He Kaiming","year":"2015","unstructured":"Kaiming He, X. Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep residual learning for image recognition. 
In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770\u2013778."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2006.262447"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3141262"},{"issue":"11","key":"e_1_3_1_31_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3592615","article-title":"Data augmentation-based novel deep learning method for deepfaked images detection","volume":"20","author":"Iqbal Farkhund","year":"2023","unstructured":"Farkhund Iqbal, Ahmed Abbasi, Abdul Rehman Javed, Ahmad Almadhor, Zunera Jalil, Sajid Anwar, and Imad Rida. 2023. Data augmentation-based novel deep learning method for deepfaked images detection. ACM Transactions on Multimedia Computing, Communications and Applications 20, 11\u00a0(2023), 1\u201315.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TBIOM.2021.3086109"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3643030"},{"key":"e_1_3_1_34_2","unstructured":"Tero Karras Timo Aila Samuli Laine and Jaakko Lehtinen. 2017. Progressive growing of GANs for improved quality stability and variation. arXiv:1710.10196. Retrieved from https:\/\/arxiv.org\/abs\/1710.10196"},{"key":"e_1_3_1_35_2","unstructured":"Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907. 
Retrieved from https:\/\/arxiv.org\/abs\/1609.02907"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3169921"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-022-01617-5"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3678883"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00839"},{"key":"e_1_3_1_40_2","unstructured":"Lingzhi Li Jianmin Bao Hao Yang Dong Chen and Fang Wen. 2019. Faceshifter: Towards high fidelity and occlusion aware face swapping. arXiv:1912.13457. Retrieved from https:\/\/arxiv.org\/abs\/1912.13457"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00505"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3414034"},{"key":"e_1_3_1_43_2","unstructured":"Yuezun Li and Siwei Lyu. 2018. Exposing deepfake videos by detecting face warping artifacts. arXiv:1811.00656. Retrieved from https:\/\/arxiv.org\/abs\/1811.00656"},{"key":"e_1_3_1_44_2","unstructured":"Yuezun Li Xin Yang Pu Sun Honggang Qi and Siwei Lyu. 2019. Celeb-df: A new dataset for deepfake forensics. arXiv:1909.12962. Retrieved from https:\/\/arxiv.org\/abs\/1909.12962"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00327"},{"issue":"2","key":"e_1_3_1_46_2","first-page":"1","article-title":"Cascaded adaptive graph representation learning for image copy-move forgery detection","volume":"21","author":"Li Yuanman","year":"2024","unstructured":"Yuanman Li, Lanhao Ye, Haokun Cao, Wei Wang, and Zhongyun Hua. 2024. Cascaded adaptive graph representation learning for image copy-move forgery detection. 
ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201324.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"issue":"11","key":"e_1_3_1_47_2","first-page":"1","article-title":"Detecting deepfake videos using spatiotemporal trident network","volume":"20","author":"Lin Kaihan","year":"2023","unstructured":"Kaihan Lin, Weihong Han, Shudong Li, Zhaoquan Gu, Huimin Zhao, and Yangyang Mei. 2023. Detecting deepfake videos using spatiotemporal trident network. ACM Transactions on Multimedia Computing, Communications and Applications 20, 11\u00a0(2023), 1\u201320.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00083"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3189545"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3558004"},{"key":"e_1_3_1_51_2","first-page":"11461","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE","author":"Lugmayr Andreas","year":"2022","unstructured":"Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, and Luc Van Gool. 2022. RePaint: Inpainting using denoising diffusion probabilistic models. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 11461\u201311471."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01605"},{"key":"e_1_3_1_53_2","unstructured":"Xiaochen Ma Xuekang Zhu Lei Su Bo Du Zhuohang Jiang Bingkui Tong Zeyu Lei Xinyu Yang Chi-Man Pun Jiancheng Lv et al. 2024. IMDL-BenCo: A comprehensive benchmark and codebase for image manipulation detection and localization. arXiv:2406.10580. 
Retrieved from https:\/\/arxiv.org\/abs\/2406.10580"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58571-6_39"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/TBIOM.2021.3119403"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3233774"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3198275"},{"issue":"11","key":"e_1_3_1_58_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3625547","article-title":"ProActive deepfake detection using GAN-based visible watermarking","volume":"20","author":"Nadimpalli Aakash Varma","year":"2023","unstructured":"Aakash Varma Nadimpalli and Ajita Rattani. 2023. ProActive deepfake detection using GAN-based visible watermarking. ACM Transactions on Multimedia Computing, Communications and Applications 20, 11\u00a0(2023), 1\u201327.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/BTAS46853.2019.9185974"},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACVW50321.2020.9096940"},{"key":"e_1_3_1_61_2","first-page":"1","volume-title":"Proceedings of the 2012 IEEE International Conference on Computational Photography (ICCP)","author":"Pan Xunyu","year":"2012","unstructured":"Xunyu Pan, Xing Zhang, and Siwei Lyu. 2012. Exposing image splicing with inconsistent local noise variances. In Proceedings of the 2012 IEEE International Conference on Computational Photography (ICCP). 
IEEE, 1\u201310."},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2016.2623589"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58610-2_6"},{"key":"e_1_3_1_64_2","first-page":"783","volume-title":"Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Qin Zequn","year":"2021","unstructured":"Zequn Qin, Pengyi Zhang, Fei Wu, and Xi Li. 2021. Fcanet: Frequency channel attention networks. In Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV). IEEE, 783\u2013792."},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/WIFS.2017.8267647"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3506853"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00009"},{"issue":"4","key":"e_1_3_1_69_2","first-page":"1","article-title":"Face reconstruction-based generalized deepfake detection model with residual outlook attention","volume":"21","author":"Shi Zenan","year":"2024","unstructured":"Zenan Shi, Wenyu Liu, and Haipeng Chen. 2024. Face reconstruction-based generalized deepfake detection model with residual outlook attention. ACM Transactions on Multimedia Computing, Communications and Applications 21, 4\u00a0(2024), 1\u201319.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_70_2","doi-asserted-by":"crossref","first-page":"18699","DOI":"10.1109\/CVPR52688.2022.01816","volume-title":"Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Shiohara Kaede","year":"2022","unstructured":"Kaede Shiohara and T. Yamasaki. 2022. Detecting deepfakes with self-blended images. 
In Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18699\u201318708."},{"key":"e_1_3_1_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510462"},{"key":"e_1_3_1_72_2","first-page":"467","volume-title":"Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922)","author":"Song Luchuan","year":"2022","unstructured":"Luchuan Song, Zheng Fang, Xiaodan Li, Xiaoyi Dong, Zhenchao Jin, Yuefeng Chen, and Siwei Lyu. 2022. Adaptive face forgery detection in cross domain. In Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922). Springer, 467\u2013484."},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3547806"},{"issue":"2","key":"e_1_3_1_74_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3686158","article-title":"Cross-attention based two-branch networks for document image forgery localization in the metaverse","volume":"21","author":"Song Yalin","year":"2024","unstructured":"Yalin Song, Wenbin Jiang, Xiuli Chai, Zhihua Gan, Mengyuan Zhou, and Lei Chen. 2024. Cross-attention based two-branch networks for document image forgery localization in the metaverse. ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201324.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_75_2","unstructured":"Kritaphat Songsri-In and Stefanos Zafeiriou. 2019. Complement face forensic detection and localization with facial landmarks. arXiv:1910.05455. 
Retrieved from https:\/\/arxiv.org\/abs\/1910.05455"},{"key":"e_1_3_1_76_2","first-page":"3172","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV)","author":"Suvorov Roman","year":"2021","unstructured":"Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, and Victor S. Lempitsky. 2021. Resolution-robust large mask inpainting with fourier convolutions. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), 3172\u20133182."},{"key":"e_1_3_1_77_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2022.3214768"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3323035"},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1145\/2929464.2929475"},{"key":"e_1_3_1_80_2","unstructured":"Petar Veli\u010dkovi\u0107 Guillem Cucurull Arantxa Casanova Adriana Romero Pietro Lio and Yoshua Bengio. 2017. Graph attention networks. arXiv:1710.10903. Retrieved from https:\/\/arxiv.org\/abs\/1710.10903"},{"key":"e_1_3_1_81_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01468"},{"key":"e_1_3_1_82_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00240"},{"key":"e_1_3_1_83_2","doi-asserted-by":"crossref","unstructured":"Junke Wang Zuxuan Wu Jingjing Chen and Yu-Gang Jiang. 2021. M2TR: Multi-modal multi-scale transformers for deepfake detection. arXiv:2104.09770. 
Retrieved from https:\/\/arxiv.org\/abs\/2104.09770","DOI":"10.1145\/3512527.3531415"},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/3588574"},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_3_1_86_2","doi-asserted-by":"publisher","DOI":"10.1145\/3408299"},{"key":"e_1_3_1_87_2","first-page":"161","volume-title":"Proceedings of the IEEE International Conference on Image Processing (ICIP)","author":"Wen B.","year":"2016","unstructured":"B. Wen, Y. Zhu, R. Subramanian, T. T. Ng, and S. Winkler. 2016. COVERAGE-A novel database for copy-move forgery detection. In Proceedings of the IEEE International Conference on Image Processing (ICIP). IEEE, 161\u2013165."},{"key":"e_1_3_1_88_2","unstructured":"Haiwei Wu Jiantao Zhou Shile Zhang and Jinyu Tian. 2022. Exploring spatial-temporal features for deepfake detection and localization. arXiv:2210.15872. Retrieved from https:\/\/arxiv.org\/abs\/2210.15872"},{"key":"e_1_3_1_89_2","doi-asserted-by":"crossref","first-page":"9535","DOI":"10.1109\/CVPR.2019.00977","volume-title":"Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Wu Yue","year":"2019","unstructured":"Yue Wu, Wael AbdAlmageed, and P. Natarajan. 2019. ManTra-Net: Manipulation tracing network for detection and localization of image forgeries with anomalous features. In Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 9535\u20139544."},{"issue":"11","key":"e_1_3_1_90_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3605893","article-title":"Forgery detection by weighted complementarity between significant invariance and detail enhancement","volume":"20","author":"Xiao Shuai","year":"2023","unstructured":"Shuai Xiao, Zhuo Zhang, Jiachen Yang, Jiabao Wen, and Yang Li. 2023. Forgery detection by weighted complementarity between significant invariance and detail enhancement. 
ACM Transactions on Multimedia Computing, Communications and Applications 20, 11\u00a0(2023), 1\u201320.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_91_2","unstructured":"Keyulu Xu Weihua Hu Jure Leskovec and Stefanie Jegelka. 2018. How powerful are graph neural networks?. arXiv:1810.00826. Retrieved from https:\/\/arxiv.org\/abs\/1810.00826"},{"key":"e_1_3_1_92_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2018.2834155"},{"issue":"2","key":"e_1_3_1_93_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3672566","article-title":"Deepfake video detection using facial feature points and Ch-transformer","volume":"21","author":"Yang Rui","year":"2024","unstructured":"Rui Yang, Rushi Lan, Zhenrong Deng, Xiaonan Luo, and Xiyan Sun. 2024. Deepfake video detection using facial feature points and Ch-transformer. ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201322.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_94_2","first-page":"8261","volume-title":"Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP \u201919)","author":"Yang Xin","year":"2019","unstructured":"Xin Yang, Yuezun Li, and Siwei Lyu. 2019. Exposing deep fakes using inconsistent head poses. In Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP \u201919). 
IEEE, 8261\u20138265."},{"key":"e_1_3_1_95_2","doi-asserted-by":"publisher","DOI":"10.1145\/3665248"},{"key":"e_1_3_1_96_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3146781"},{"issue":"2","key":"e_1_3_1_97_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3664654","article-title":"Spatiotemporal inconsistency learning and interactive fusion for deepfake video detection","volume":"21","author":"Zhang Dengyong","year":"2024","unstructured":"Dengyong Zhang, Wenjie Zhu, Xin Liao, Feifan Qi, Gaobo Yang, and Xiangling Ding. 2024. Spatiotemporal inconsistency learning and interactive fusion for deepfake video detection. ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201324.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"issue":"2","key":"e_1_3_1_98_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3657297","article-title":"Domain-invariant and patch-discriminative feature learning for general deepfake detection","volume":"21","author":"Zhang Jian","year":"2024","unstructured":"Jian Zhang, Jiangqun Ni, Fan Nie, and Jiwu Huang. 2024. Domain-invariant and patch-discriminative feature learning for general deepfake detection. ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201319.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_99_2","first-page":"7324","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Zhang Richard","year":"2019","unstructured":"Richard Zhang. 2019. Making convolutional networks shift-invariant again. In Proceedings of the International Conference on Machine Learning. 
PMLR, 7324\u20137334."},{"key":"e_1_3_1_100_2","doi-asserted-by":"publisher","DOI":"10.1145\/3625100"},{"key":"e_1_3_1_101_2","doi-asserted-by":"publisher","DOI":"10.1145\/3548689"},{"key":"e_1_3_1_102_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00222"},{"issue":"2","key":"e_1_3_1_103_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3651311","article-title":"Audio-visual contrastive pre-train for face forgery detection","volume":"21","author":"Zhao Hanqing","year":"2024","unstructured":"Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Weiming Zhang, Ying Guo, Zhen Cheng, Pengfei Yan, and Nenghai Yu. 2024. Audio-visual contrastive pre-train for face forgery detection. ACM Transactions on Multimedia Computing, Communications and Applications 21, 2\u00a0(2024), 1\u201316.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_104_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01475"},{"key":"e_1_3_1_105_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00153"},{"key":"e_1_3_1_106_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.02042"},{"key":"e_1_3_1_107_2","article-title":"Generate, segment and replace: Towards generic manipulation segmentation","author":"Zhou Peng","year":"2018","unstructured":"Peng Zhou, Bor-Chun Chen, Xintong Han, Mahyar Najibi, and Larry S. Davis. 2018. Generate, segment and replace: Towards generic manipulation segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_1_108_2","first-page":"391","volume-title":"Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922)","author":"Zhuang Wanyi","year":"2022","unstructured":"Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, and Nenghai Yu. 2022. 
UIA-ViT: Unsupervised inconsistency-aware method based on vision transformer for face forgery detection. In Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922). Springer, 391\u2013407."},{"key":"e_1_3_1_109_2","first-page":"1","volume-title":"Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME)","author":"Zhuang Wanyi","year":"2022","unstructured":"Wanyi Zhuang, Qi Chu, Haojie Yuan, Changtao Miao, Bin Liu, and Nenghai Yu. 2022. Towards intrinsic common discriminative features learning for face forgery detection using adversarial learning. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1\u20136."},{"key":"e_1_3_1_110_2","first-page":"6258","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV)","author":"\u021a\u00e2n\u021baru Drago\u0219-Constantin","year":"2024","unstructured":"Drago\u0219-Constantin \u021a\u00e2n\u021baru, Elisabeta Onea\u021b\u0103, and Dan Onea\u021b\u0103. 2024. Weakly-supervised deepfake localization in diffusion-generated images. 
In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), 6258\u20136268."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3747296","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,12]],"date-time":"2025-08-12T20:37:17Z","timestamp":1755031037000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3747296"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,12]]},"references-count":109,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8,31]]}},"alternative-id":["10.1145\/3747296"],"URL":"https:\/\/doi.org\/10.1145\/3747296","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,12]]},"assertion":[{"value":"2024-08-26","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-06-20","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}