{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T08:00:54Z","timestamp":1761897654932,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T00:00:00Z","timestamp":1638921600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the National Science Foundation of China","award":["U20A20163, 61671069"],"award-info":[{"award-number":["U20A20163, 61671069"]}]},{"name":"the Scientific Research Project of Beijing Municipal Education Commission","award":["KZ202111232049, KM202011232021"],"award-info":[{"award-number":["KZ202111232049, KM202011232021"]}]},{"name":"the Qin Xin Talents Cultivation Program","award":["QXTCP A201902"],"award-info":[{"award-number":["QXTCP A201902"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Nowadays, faces in videos can be easily replaced with the development of deep learning, and these manipulated videos are realistic and cannot be distinguished by human eyes. Some people maliciously use the technology to attack others, especially celebrities and politicians, causing destructive social impacts. Therefore, it is imperative to design an accurate method for detecting face manipulation. However, most of the existing methods adopt single convolutional neural network as the feature extraction module, causing the extracted features to be inconsistent with the human visual mechanism. Moreover, the rich details and semantic information cannot be reflected with single feature, limiting the detection performance. Therefore, this paper tackles the above problems by proposing a novel face manipulation detection method based on a supervised multi-feature fusion attention network (SMFAN). 
Specifically, the capsule network is used for face manipulation detection, and the SMFAN is added to the original capsule network to extract details of the fake face image. Further, the focal loss is used to realize hard example mining. Finally, the experimental results on the public dataset FaceForensics++ show that the proposed method has better performance.<\/jats:p>","DOI":"10.3390\/s21248181","type":"journal-article","created":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T23:30:00Z","timestamp":1639006200000},"page":"8181","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Face Manipulation Detection Based on Supervised Multi-Feature Fusion Attention Network"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0875-1549","authenticated-orcid":false,"given":"Lin","family":"Cao","sequence":"first","affiliation":[{"name":"The Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing 100101, China"}]},{"given":"Wenjun","family":"Sheng","sequence":"additional","affiliation":[{"name":"The Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing 100101, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7131-1162","authenticated-orcid":false,"given":"Fan","family":"Zhang","sequence":"additional","affiliation":[{"name":"The Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing 100101, China"}]},{"given":"Kangning","family":"Du","sequence":"additional","affiliation":[{"name":"The Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, 
Beijing 100101, China"}]},{"given":"Chong","family":"Fu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China"}]},{"given":"Peiran","family":"Song","sequence":"additional","affiliation":[{"name":"The Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing 100101, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,8]]},"reference":[{"key":"ref_1","unstructured":"Gu, J., Wang, Z., Kuen, J., Ma, L., and Shahroudy, A. (2015). Recent advances in convolutional neural networks. arXiv."},{"key":"ref_2","unstructured":"Goodfellow, I. (2014, January 8\u201313). Generative adversarial nets. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Zhu, J.Y. (2017, January 22\u201329). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision (ICCV2017), Venice, Italy.","DOI":"10.1109\/ICCV.2017.244"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Choi, Y., Choi, M., and Kim, M. (2018, January 19\u201321). StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2018), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00916"},{"key":"ref_5","unstructured":"Bayar, B., and Stamm, M.C. (2018, January 10\u201312). Constrained convolutional neural networks: A new approach towards general purpose image manipulation detection. Proceedings of the IEEE Transactions on Information Forensics and Security (TIFS2018), Kaohsiung, Taiwan, China."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Guera, D., and Delp, E. (2018, January 27\u201330). 
Deepfake Video Detection Using Recurrent Neural Networks. Proceedings of the International Conference on Advanced Video and Signal Based Surveillance (AVSS2018), Auckland, New Zealand.","DOI":"10.1109\/AVSS.2018.8639163"},{"key":"ref_7","unstructured":"Sabir, E., Cheng, J., Jaiswal, A., AbdAlmageed, W., Masi, I., and Natarajan, P. (2019, January 16\u201320). Recurrent Convolutional Strategies for Face Manipulation Detection in Videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW2019), Long Beach, CA, USA."},{"key":"ref_8","unstructured":"Wodajo, D., and Atnafu, S. (2021). Deepfake Video Detection Using Convolutional Vision Transformer. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Nguyen, H.H., Yamagishi, J., and Echizen, I. (2019). Use of a Capsule Network to Detect Fake Images and Videos. arXiv.","DOI":"10.1109\/ICASSP.2019.8682602"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Rossler, A., Cozzolino, D., Verdoliva, L., Verdoliva, L., Riess, C., Thies, J., and Nie\u00dfner, M. (2019). FaceForensics++: Learning to Detect Manipulated Facial Images. arXiv.","DOI":"10.1109\/ICCV.2019.00009"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Karras, T., Laine, S., and Aila, T. (2018). A Style-Based Generator Architecture for Generative Adversarial Networks. arXiv.","DOI":"10.1109\/CVPR.2019.00453"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Nirkin, Y., Keller, Y., and Hassner, T. (November, January 27). FSGAN: Subject Agnostic Face Swapping and Reenactment. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV2019), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00728"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Natsume, R., Yatagawa, T., and Morishima, S. (2018, January 12\u201316). RSGAN: Face Swapping and Editing Using Face and Hair Representation in Latent Spaces. Proceedings of the SIGGRAPH\u2019 18 ACM SIGGRAPH 2018 Posters, New York, NY, USA.","DOI":"10.1145\/3230744.3230818"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Thies, J., Zollhofer, M., Stamminger, M., Theobalt, C., and Nie\u00dfner, M. (2016, January 27\u201330). Face2face: Real-Time Face Capture and Reenactment Of RGB Videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2016), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.262"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 22\u201325). Xception: Deep Learning with Depthwise Separable Convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2017), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Dang, H., Liu, F., Stehouwer, J., Liu, X., and Jain, A. (2020, January 14\u201319). On the Detection of Digital Face Manipulation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2020), Online.","DOI":"10.1109\/CVPR42600.2020.00582"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Durall, R., Keuper, M., and Keuper, J. (2020, January 14\u201319). Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2020), Online.","DOI":"10.1109\/CVPR42600.2020.00791"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Guo, Z.Q., Yang, G., Chen, J.Y., and Sun, X.M. (2020). 
Fake Face Detection via Adaptive Residuals Extraction Network. arXiv.","DOI":"10.1016\/j.cviu.2021.103170"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhou, P., Han, X., Morariu, V., and Davis, L. (2017, January 22\u201325). Two-Stream Neural Networks for Tampered Face Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW2017), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.229"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2016), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"868","DOI":"10.1109\/TIFS.2012.2190402","article-title":"Rich models for steganalysis of digital images","volume":"7","author":"Fridrich","year":"2012","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"ref_23","first-page":"1","article-title":"Detecting GAN Generated Fake Images Using Co-Occurrence Matrices","volume":"5","author":"Nataraj","year":"2019","journal-title":"Electron. Imaging"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"223","DOI":"10.26599\/BDMA.2021.9020006","article-title":"Multimodal adaptive identity-recognition algorithm fused with gait perception","volume":"4","author":"Wang","year":"2019","journal-title":"Big Data Min. Anal."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"536","DOI":"10.26599\/TST.2020.9010024","article-title":"Incremental face clustering with optimal summary learning via graph convolutional network. Tsinghua Science and Technology","volume":"26","author":"Zhao","year":"2021","journal-title":"Tsinghua Sci. 
Technol."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, Y., Song, Z., Xu, X., Rafique, W., Zhang, X., Shen, J., Khosravi, M.R., and Qi, L. (2021). Bidirectional GRU networks-based next POI category prediction for healthcare. Int. J. Intell. Syst.","DOI":"10.1002\/int.22710"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Yang, Z., Chen, H., Zhang, J., Ma, J., and Chang, Y. (2021, January 7\u201315). Attention-based Multi-level Feature Fusion for Named Entity Recognition. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI2020), Yokohama, Japan.","DOI":"10.24963\/ijcai.2020\/497"},{"key":"ref_28","unstructured":"Qin, X., Wang, Z., and Bai, Y. (2020, January 7\u201312). FFA-Net: Feature fusion attention network for single image dehazing. Proceedings of the AAAI Conference on Artificial Intelligence(AAAI2020), New York, NY, USA."},{"key":"ref_29","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2850","DOI":"10.1007\/s10489-020-02055-x","article-title":"Attention-based VGG-16 model for COVID-19 chest X-ray image classification","volume":"51","author":"Sitaula","year":"2021","journal-title":"Appl. Intell."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Deng, J., and Dong, W. (2016, January 20\u201325). Imagenet: A Large-Scale Hierarchical Image Database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2009), Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_32","unstructured":"Rossler, A., Cozzolino, D., Verdoliva, L., Verdoliva, L., Riess, C., Thies, J., and Nie\u00dfner, M. (2018). Faceforensics: A Large-Scale Video Dataset for Forgery Detection in Human Faces. 
arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"10:1499","DOI":"10.1109\/LSP.2016.2603342","article-title":"Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks","volume":"23","author":"Zhang","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Cozzolino, D., Poggi, G., and Verdoliva, L. (2017, January 20\u201322). Recasting Residual-Based Local Descriptors as Convolutional Neural Networks: An Application to Image Forgery Detection. Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, Philadelphia, PA, USA.","DOI":"10.1145\/3082031.3083247"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Bayar, B., and Stamm, M.C. (2016, January 20\u201322). A Deep Learning Approach To Universal Image Manipulation Detection Using A New Convolutional Layer. Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security, Vigo, Spain.","DOI":"10.1145\/2909827.2930786"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Afchar, D., Nozick, V., Yamagishi, J., and Echizen, I. (2018, January 11\u201313). Mesonet: A Compact Facial Video Forgery Detection Network. Proceedings of the IEEE International Workshop on Information Forensics and Security (WIFS2018), Hong Kong, China.","DOI":"10.1109\/WIFS.2018.8630761"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Raghavendra, R., Venkatesh, S., and Christoph Busch, R.B. (2017, January 22\u201325). Transferable Deep-CNN Features for Detecting Digital and Print-Scanned Morphed Face Images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW2017), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.228"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Li, L.Z., Bao, J.M., Zhang, T., Yang, H., Chen, D., Wen, F., and Guo, B.N. (2020, January 13\u201319). 
Face X-Ray for More General Face Forgery Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2020), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00505"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhu, X., Wang, H., Fei, H.Y., Lei, Z., and Li, S.Z. (2020). Face Forgery Detection by 3D Decomposition. arXiv.","DOI":"10.1109\/CVPR46437.2021.00295"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/24\/8181\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:42:57Z","timestamp":1760168577000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/24\/8181"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,8]]},"references-count":39,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["s21248181"],"URL":"https:\/\/doi.org\/10.3390\/s21248181","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,12,8]]}}}