{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T20:22:12Z","timestamp":1773001332567,"version":"3.50.1"},"reference-count":59,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2025,9,30]],"date-time":"2025-09-30T00:00:00Z","timestamp":1759190400000},"content-version":"vor","delay-in-days":272,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"funder":[{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-23-IAS4-0004-02"],"award-info":[{"award-number":["ANR-23-IAS4-0004-02"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-15-IDEX-02"],"award-info":[{"award-number":["ANR-15-IDEX-02"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["International Journal of Intelligent Systems"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:p>Deep generative models are now capable of generating synthetic images with very high visual realism, often indistinguishable from real\u2010world photographs. Such AI\u2010generated images (AIGIs) can pose serious security concerns if used maliciously. Conventional AIGI detection methods are based on supervised learning and may have limited generalization ability. In this paper, we build a novel universal detector of AIGIs without the need to perform training on these images. 
Starting with a study on the effectiveness of various pretrained image models for the AIGI detection task, we then chose to build our detector based on the features of the popular CLIP model. Unlike existing methods, we use a small number of real images and their carefully processed counterparts as AIGI proxies during training, combined with a novel margin\u2010based loss to promote generalization. Extensive experiments demonstrate the effectiveness of our method, outperforming existing supervised methods while not using any AIGI for training.<\/jats:p>","DOI":"10.1155\/int\/8530953","type":"journal-article","created":{"date-parts":[[2025,9,30]],"date-time":"2025-09-30T16:50:06Z","timestamp":1759251006000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Building a Universal Detector of AI\u2010Generated Images Without Training on Them"],"prefix":"10.1155","volume":"2025","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-6211-5182","authenticated-orcid":false,"given":"Ji","family":"Li","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0053-3352","authenticated-orcid":false,"given":"Kai","family":"Wang","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2025,9,30]]},"reference":[{"key":"e_1_2_14_1_2","first-page":"2672","article-title":"Generative Adversarial Nets","volume":"27","author":"Goodfellow I.","year":"2014","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_14_2_2","first-page":"6840","article-title":"Denoising Diffusion Probabilistic Models","volume":"33","author":"Ho J.","year":"2020","journal-title":"Advances in Neural Information Processing 
Systems"},{"key":"e_1_2_14_3_2","doi-asserted-by":"publisher","DOI":"10.1155\/2013\/496701"},{"key":"e_1_2_14_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/jstsp.2020.3002101"},{"key":"e_1_2_14_5_2","doi-asserted-by":"publisher","DOI":"10.3390\/jimaging7040069"},{"key":"e_1_2_14_6_2","doi-asserted-by":"crossref","unstructured":"RedmonJ. DivvalaS. GirshickR. andFarhadiA. You Only Look Once: Unified Real-Time Object Detection Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016 779\u2013788 https:\/\/doi.org\/10.1109\/cvpr.2016.91 2-s2.0-84986308404.","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_2_14_7_2","doi-asserted-by":"crossref","unstructured":"RonnebergerO. FischerP. andBroxT. U-Net: Convolutional Networks for Biomedical Image Segmentation Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention 2015 234\u2013241 https:\/\/doi.org\/10.1007\/978-3-319-24574-4_28 2-s2.0-84951834022.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_14_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-97-0448-4_6"},{"key":"e_1_2_14_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/tifs.2018.2834147"},{"key":"e_1_2_14_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2021.3114989"},{"key":"e_1_2_14_11_2","doi-asserted-by":"publisher","DOI":"10.2352\/issn.2470-1173.2019.5.mwsf-532"},{"key":"e_1_2_14_12_2","doi-asserted-by":"crossref","unstructured":"WangS.-Yu WangO. ZhangR. OwensA. andEfrosA. A. CNN-Generated Images Are Surprisingly Easy to Spot for Now Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2020 8695\u20138704.","DOI":"10.1109\/CVPR42600.2020.00872"},{"key":"e_1_2_14_13_2","doi-asserted-by":"crossref","unstructured":"WangZ. BaoJ. ZhouW.et al. 
DIRE for Diffusion-Generated Image Detection Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 22445\u201322455.","DOI":"10.1109\/ICCV51070.2023.02051"},{"key":"e_1_2_14_14_2","unstructured":"LiJiandWangK. Detecting Computer-Generated Images by Using Only Real Images Proceedings of the International Conference on Machine Vision 2024 1\u201313."},{"key":"e_1_2_14_15_2","article-title":"Auto-encoding Variational Bayes","author":"Kingma D. P.","year":"2013","journal-title":"arXiv preprint arXiv:1312.6114"},{"key":"e_1_2_14_16_2","unstructured":"ChenM. RadfordA. ChildR.et al. Generative Pretraining from Pixels Proceedings of the International Conference on Machine Learning 2020 1691\u20131703."},{"key":"e_1_2_14_17_2","unstructured":"DinhL. Sohl-DicksteinJ. andBengioS. Density Estimation Using Real NVP International Conference on Learning Representations 2017 1\u201312."},{"key":"e_1_2_14_18_2","article-title":"Progressive Growing of GANs for Improved Quality, Stability, and Variation","author":"Karras T.","year":"2017","journal-title":"arXiv preprint arXiv:1710.10196"},{"key":"e_1_2_14_19_2","article-title":"A Style-Based Generator Architecture for Generative Adversarial Networks","author":"Karras T.","year":"2019","journal-title":"arXiv preprint arXiv:1812.04948"},{"key":"e_1_2_14_20_2","unstructured":"SongJ. MengC. andErmonS. Denoising Diffusion Implicit Models Proceedings of the International Conference on Learning Representations 2020 1\u201312."},{"key":"e_1_2_14_21_2","doi-asserted-by":"crossref","unstructured":"RombachR. BlattmannA. LorenzD. EsserP. andOmmerB. 
High-Resolution Image Synthesis with Latent Diffusion Models Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2022 10684\u201310695.","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"e_1_2_14_22_2","article-title":"Hierarchical Text-Conditional Image Generation With CLIP Latents","author":"Ramesh A.","year":"2022","journal-title":"arXiv preprint arXiv:2204.06125"},{"key":"e_1_2_14_23_2","doi-asserted-by":"crossref","unstructured":"ZhangX. KaramanS. andChangS.-F. Detecting and Simulating Artifacts in GAN Fake Images Proceedings of the IEEE International Workshop on Information Forensics and Security 2019 1\u20136.","DOI":"10.1109\/WIFS47025.2019.9035107"},{"key":"e_1_2_14_24_2","unstructured":"FrankJ. EisenhoferT. Sch\u00f6nherrL. FischerA. KolossaD. andHolzT. Leveraging Frequency Analysis for Deep Fake Image Recognition Proceedings of the International Conference on Machine Learning 2020 3247\u20133258."},{"key":"e_1_2_14_25_2","doi-asserted-by":"crossref","unstructured":"ChandrasegaranK. TranN.-T. andCheungN.-M. A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images Detection Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2021 7200\u20137209.","DOI":"10.1109\/CVPR46437.2021.00712"},{"key":"e_1_2_14_26_2","doi-asserted-by":"publisher","DOI":"10.30765\/er.2583"},{"key":"e_1_2_14_27_2","article-title":"Exposing the Fake: Effective Diffusion-Generated Images Detection","author":"Ma R.","year":"2023","journal-title":"arXiv preprint arXiv:2307.06272"},{"key":"e_1_2_14_28_2","doi-asserted-by":"crossref","unstructured":"LuoY. DuJ. YanKe andDingS. 
LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2024 17006\u201317015 https:\/\/doi.org\/10.1109\/cvpr52733.2024.01609.","DOI":"10.1109\/CVPR52733.2024.01609"},{"key":"e_1_2_14_29_2","doi-asserted-by":"crossref","unstructured":"RickerJ. LukovnikovD. andFischerA. Aeroblade: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2024 9130\u20139140 https:\/\/doi.org\/10.1109\/cvpr52733.2024.00872.","DOI":"10.1109\/CVPR52733.2024.00872"},{"key":"e_1_2_14_30_2","unstructured":"ChenB. ZengJ. YangJ. andYangR. DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of Diffusion Generated Images Proceedings of the International Conference on Machine Learning 2024 7621\u20137639."},{"key":"e_1_2_14_31_2","doi-asserted-by":"crossref","unstructured":"DolorielC. T.andCheungN. M. Frequency Masking for Universal Deepfake Detection Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing 2024 13466\u201313470 https:\/\/doi.org\/10.1109\/icassp48485.2024.10446290.","DOI":"10.1109\/ICASSP48485.2024.10446290"},{"key":"e_1_2_14_32_2","doi-asserted-by":"crossref","unstructured":"TanC. LiuH. ZhaoY.et al. Rethinking the Up-Sampling Operations in CNN-Based Generative Network for Generalizable Deepfake Detection Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2024 28130\u201328139 https:\/\/doi.org\/10.1109\/cvpr52733.2024.02657.","DOI":"10.1109\/CVPR52733.2024.02657"},{"key":"e_1_2_14_33_2","doi-asserted-by":"crossref","unstructured":"OjhaU. LiY. andLeeY. J. 
Towards Universal Fake Image Detectors that Generalize across Generative Models Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2023 24480\u201324489 https:\/\/doi.org\/10.1109\/cvpr52729.2023.02345.","DOI":"10.1109\/CVPR52729.2023.02345"},{"key":"e_1_2_14_34_2","unstructured":"RadfordA. KimJ. W. HallacyC.et al. Learning Transferable Visual Models from Natural Language Supervision Proceedings of the International Conference on Machine Learning 2021 8748\u20138763."},{"key":"e_1_2_14_35_2","doi-asserted-by":"crossref","unstructured":"CozzolinoD. PoggiG. CorviR. Nie\u00dfnerM. andVerdolivaL. Raising the Bar of AI-Generated Image Detection with CLIP Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops 2024 4356\u20134366 https:\/\/doi.org\/10.1109\/cvprw63382.2024.00439.","DOI":"10.1109\/CVPRW63382.2024.00439"},{"key":"e_1_2_14_36_2","doi-asserted-by":"crossref","unstructured":"KoutlisC.andPapadopoulosS. Leveraging Representations from Intermediate Encoder-Blocks for Synthetic Image Detection Proceedings of the European Conference on Computer Vision 2025 394\u2013411 https:\/\/doi.org\/10.1007\/978-3-031-73220-1_23.","DOI":"10.1007\/978-3-031-73220-1_23"},{"key":"e_1_2_14_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/jproc.2020.3004555"},{"key":"e_1_2_14_38_2","article-title":"How Much Can CLIP Benefit Vision-And-Language Tasks?","author":"Shen S.","year":"2021","journal-title":"arXiv preprint arXiv:2107.06383"},{"key":"e_1_2_14_39_2","doi-asserted-by":"crossref","unstructured":"ZhangR. ZhangW. FangR.et al. 
Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification Proceedings of the European Conference on Computer Vision 2022 493\u2013510 https:\/\/doi.org\/10.1007\/978-3-031-19833-5_29.","DOI":"10.1007\/978-3-031-19833-5_29"},{"key":"e_1_2_14_40_2","article-title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale","author":"Dosovitskiy A.","year":"2020","journal-title":"arXiv preprint arXiv:2010.11929"},{"key":"e_1_2_14_41_2","doi-asserted-by":"crossref","unstructured":"LiuZ. LinY. CaoY.et al. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows Proceedings of the IEEE\/CVF International Conference on Computer Vision 2021 10012\u201310022.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_2_14_42_2","article-title":"BEiT: Bert Pre-training of Image Transformers","author":"Bao H.","year":"2021","journal-title":"arXiv preprint arXiv:2106.08254"},{"key":"e_1_2_14_43_2","doi-asserted-by":"crossref","unstructured":"SinghM. GustafsonL. AdcockA.et al. Revisiting Weakly Supervised Pre-training of Visual Perception Models Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2022 804\u2013814.","DOI":"10.1109\/CVPR52688.2022.00088"},{"key":"e_1_2_14_44_2","doi-asserted-by":"crossref","unstructured":"HeK. ChenX. XieS. LiY. Doll\u00e1rP. andGirshickR. Masked Autoencoders Are Scalable Vision Learners Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2022 16000\u201316009.","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"e_1_2_14_45_2","doi-asserted-by":"crossref","unstructured":"SinghM. DuvalQ. AlwalaK. V.et al. 
The Effectiveness of MAE Pre-pretraining for Billion-Scale Pretraining Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 5484\u20135494.","DOI":"10.1109\/ICCV51070.2023.00505"},{"key":"e_1_2_14_46_2","article-title":"Dinov2: Learning Robust Visual Features without Supervision","author":"Oquab M.","year":"2023","journal-title":"arXiv preprint arXiv:2304.07193"},{"key":"e_1_2_14_47_2","doi-asserted-by":"crossref","unstructured":"ChertiM. BeaumontR. WightmanR.et al. Reproducible Scaling Laws for Contrastive Language-Image Learning Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2023 2818\u20132829 https:\/\/doi.org\/10.1109\/cvpr52729.2023.00276.","DOI":"10.1109\/CVPR52729.2023.00276"},{"key":"e_1_2_14_48_2","doi-asserted-by":"crossref","unstructured":"LinZ. GengS. ZhangR.et al. Frozen CLIP Models Are Efficient Video Learners Proceedings of the European Conference on Computer Vision 2022 388\u2013404 https:\/\/doi.org\/10.1007\/978-3-031-19833-5_23.","DOI":"10.1007\/978-3-031-19833-5_23"},{"key":"e_1_2_14_49_2","doi-asserted-by":"crossref","unstructured":"BarracoM. CorniaM. CascianelliS. BaraldiL. andCucchiaraR. The Unreasonable Effectiveness of CLIP Features for Image Captioning: an Experimental Analysis Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2022 4662\u20134670.","DOI":"10.1109\/CVPRW56347.2022.00512"},{"key":"e_1_2_14_50_2","doi-asserted-by":"crossref","unstructured":"HuangT. DongB. YangY.et al. 
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training Proceedings of the IEEE\/CVF International Conference on Computer Vision 2023 22157\u201322167.","DOI":"10.1109\/ICCV51070.2023.02025"},{"key":"e_1_2_14_51_2","first-page":"1","article-title":"Attention Is All You Need","volume":"30","author":"Vaswani A.","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_14_52_2","article-title":"Predicting the Generalization Gap in Deep Networks with Margin Distributions","author":"Jiang Y.","year":"2019","journal-title":"arXiv preprint arXiv:1810.00113"},{"key":"e_1_2_14_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2019.2917862"},{"key":"e_1_2_14_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/38.946629"},{"key":"e_1_2_14_55_2","unstructured":"ZhangH. CisseM. DauphinY. N. andLopez-PazD. Mixup: Beyond Empirical Risk Minimization Proceedings of the International Conference on Learning Representations 2018 1\u201313."},{"key":"e_1_2_14_56_2","article-title":"LSUN: Construction of a Large-Scale Image Dataset Using Deep Learning with Humans in the Loop","author":"Fisher Y.","year":"2015","journal-title":"arXiv preprint arXiv:1506.03365"},{"key":"e_1_2_14_57_2","unstructured":"GoodfellowI. J. ShlensJ. andSzegedyC. Explaining and Harnessing Adversarial Examples Proceedings of the International Conference on Learning Representations 2015 1\u201311."},{"key":"e_1_2_14_58_2","unstructured":"LiJ. LiD. XiongC. andHoiS. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Proceedings of the International Conference on Machine Learning 2022 12888\u201312900."},{"key":"e_1_2_14_59_2","unstructured":"YanS. LiO. CaiJ.et al. 
A Sanity Check for AI-Generated Image Detection Proceedings of the International Conference on Learning Representations 2025 1\u201319."}],"container-title":["International Journal of Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/8530953","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1155\/int\/8530953","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/8530953","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T17:58:12Z","timestamp":1772992692000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/int\/8530953"}},"subtitle":[],"editor":[{"given":"Beijing","family":"Chen","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,1]]},"references-count":59,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["10.1155\/int\/8530953"],"URL":"https:\/\/doi.org\/10.1155\/int\/8530953","archive":["Portico"],"relation":{},"ISSN":["0884-8173","1098-111X"],"issn-type":[{"value":"0884-8173","type":"print"},{"value":"1098-111X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1]]},"assertion":[{"value":"2025-01-31","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"8530953"}}