{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T16:54:48Z","timestamp":1758041688450,"version":"3.44.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"name":"Key Research and Development Plan Special Project of Henan Province Grant","award":["241111211400"],"award-info":[{"award-number":["241111211400"]}]},{"DOI":"10.13039\/501100017700","name":"Henan Province Science and Technology Research Project","doi-asserted-by":"crossref","award":["242102211077"],"award-info":[{"award-number":["242102211077"]}],"id":[{"id":"10.13039\/501100017700","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Henan Province University Key Scientific Research Project","award":["23A520008"],"award-info":[{"award-number":["23A520008"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Priv. Secur."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Indiscriminate data poisoning attacks are highly effective against unsupervised learning. However, recent studies show that contrastive learning is also susceptible to data poisoning attacks. As a form of data poisoning attack, the attacker adds poison to the clean pre-training dataset. This article proposes IDPA, an indiscriminate data poisoning attack targeting the encoder in contrastive learning. where the attacker\u2019s goal is to directly poison the pre-trained encoder. The feature vectors of any clean sample and the attacked sample from the attacker will exhibit high similarity, causing the downstream classifier to misclassify the clean sample as the samples designated by the attacker. Therefore, this article formulates IDPA as a dual optimization problem and defines two loss functions: the attack effectiveness loss and the model utility loss. 
These losses correspond to effectively poisoning the pre-trained encoder and to maintaining the accuracy of the downstream classifier, respectively. During training, the attack affects the contrastive learning algorithm, and predictions are made on multiple datasets. Experimental results show an attack success rate of 92%. This article also evaluates the effectiveness of IDPA on the CLIP model released by OpenAI, achieving an attack success rate of 88%.<\/jats:p>","DOI":"10.1145\/3757916","type":"journal-article","created":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T09:10:59Z","timestamp":1754039459000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["IDPA: Indiscriminate Data Poisoning Attacks Targeting Pre-trained Encoder Based on Contrastive Learning"],"prefix":"10.1145","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-5639-6825","authenticated-orcid":false,"given":"Aodi","family":"Hu","sequence":"first","affiliation":[{"name":"Henan University of Science and Technology","place":["Luoyang, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3061-7768","authenticated-orcid":false,"given":"Zhiyong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Henan University of Science and Technology","place":["Luoyang, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-1076-2995","authenticated-orcid":false,"given":"Gaoyuan","family":"Quan","sequence":"additional","affiliation":[{"name":"Henan University of Science and Technology","place":["Luoyang, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-9495-3592","authenticated-orcid":false,"given":"Xinxin","family":"Yue","sequence":"additional","affiliation":[{"name":"Information Engineering College, Henan University of Science and Technology","place":["Luoyang, 
China"]}]}],"member":"320","published-online":{"date-parts":[[2025,9,11]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"159","volume-title":"Proceedings of the 2021 IEEE European Symposium on Security and Privacy","author":"Aghakhani Hojjat","year":"2021","unstructured":"Hojjat Aghakhani, Dongyu Meng, Yu-Xiang Wang, Christopher Kruegel, and Giovanni Vigna. 2021. Bullseye polytope: A scalable clean-label poisoning attack with improved transferability. In Proceedings of the 2021 IEEE European Symposium on Security and Privacy. IEEE, 159\u2013178."},{"doi-asserted-by":"crossref","unstructured":"Marco Barreno Blaine Nelson Anthony D. Joseph and J. Doug Tygar. 2010. The security of machine learning. Machine Learning 81 2 (2010) 121\u201348.","key":"e_1_3_1_3_2","DOI":"10.1007\/s10994-010-5188-5"},{"doi-asserted-by":"publisher","key":"e_1_3_1_4_2","DOI":"10.1145\/1128817.1128824"},{"unstructured":"Battista Biggio Blaine Nelson and Pavel Laskov. 2012. Poisoning attacks against support vector machines. In Proceedings of the 29th International Conference on Machine Learning ICML 2012 Edinburgh Scotland UK June 26 - July 1 2012. icml.cc\/Omnipress. https:\/\/dblp.org\/rec\/conf\/icml\/BiggioNL12","key":"e_1_3_1_5_2"},{"doi-asserted-by":"crossref","unstructured":"Min Cao Shiping Li Juntao Li Liqiang Nie and Min Zhang. 2022. Image-text retrieval: A survey on recent research and development. In Proceedings of the Thirty-First International Joint Conference on Artifical Intelligence IJCAI 2022. 5410\u20135417.","key":"e_1_3_1_6_2","DOI":"10.24963\/ijcai.2022\/759"},{"key":"e_1_3_1_7_2","first-page":"1577","volume-title":"Proceedings of the 30th USENIX Security Symposium","author":"Carlini Nicholas","year":"2021","unstructured":"Nicholas Carlini. 2021. Poisoning the unlabeled dataset of \\(\\lbrace\\) Semi-Supervised \\(\\rbrace\\) learning. In Proceedings of the 30th USENIX Security Symposium. 1577\u20131592."},{"unstructured":"Nicholas Carlini and Andreas Terzis. 2021. 
Poisoning and backdooring contrastive learning. arXiv:2106.09667. Retrieved from https:\/\/arxiv.org\/abs\/2106.09667","key":"e_1_3_1_8_2"},{"doi-asserted-by":"crossref","unstructured":"Jian Chen Yuan Gao Gaoyang Liu Ahmed M. Abdelmoniem and Chen Wang. 2024. Manipulating pre-trained encoder for targeted poisoning attacks in contrastive learning. IEEE Transactions on Information Forensics and Security 19 (2024) 2412\u20132424.","key":"e_1_3_1_9_2","DOI":"10.1109\/TIFS.2024.3350389"},{"key":"e_1_3_1_10_2","first-page":"1597","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning. PMLR, 1597\u20131607."},{"key":"e_1_3_1_11_2","first-page":"9727","article-title":"Effective backdoor defense by exploiting sensitivity of poisoned samples","volume":"35","author":"Chen Weixin","year":"2022","unstructured":"Weixin Chen, Baoyuan Wu, and Haoqian Wang. 2022. Effective backdoor defense by exploiting sensitivity of poisoned samples. Advances in Neural Information Processing Systems 35 (2022), 9727\u20139737.","journal-title":"Advances in Neural Information Processing Systems"},{"unstructured":"Xinlei Chen Haoqi Fan Ross Girshick and Kaiming He. 2020. Improved baselines with momentum contrastive learning. arXiv:2003.04297. Retrieved from https:\/\/arxiv.org\/abs\/2003.04297","key":"e_1_3_1_12_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_13_2","DOI":"10.1109\/CVPR46437.2021.01549"},{"unstructured":"Valeriia Cherepanova Micah Goldblum Harrison Foley Shiyuan Duan John P. Dickerson Gavin Taylor and Tom Goldstein. 2021. LowKey: Leveraging adversarial attacks to protect social media users from facial recognition. In International Conference on Learning Representations. 
Retrieved from https:\/\/arxiv.org\/abs\/2101.07922","key":"e_1_3_1_14_2"},{"key":"e_1_3_1_15_2","first-page":"215","volume-title":"Proceedings of the 14th International Conference on Artificial Intelligence and Statistics","author":"Coates Adam","year":"2011","unstructured":"Adam Coates, Andrew Ng, and Honglak Lee. 2011. An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 215\u2013223."},{"unstructured":"Jonas Geiping Liam H. Fowl W. Ronny Huang Wojciech Czaja Gavin Taylor Michael Moeller and Tom Goldstein. 2021. Witches\u2019 brew: Industrial scale data poisoning via gradient matching. In International Conference on Learning Representations. Retrieved from https:\/\/arxiv.org\/abs\/2009.02276","key":"e_1_3_1_16_2"},{"unstructured":"Hao He Kaiwen Zha and Dina Katabi. 2023. Indiscriminate poisoning attacks on unsupervised contrastive learning. In The Eleventh International Conference on Learning Representations. Retrieved from https:\/\/arxiv.org\/abs\/2202.11202","key":"e_1_3_1_17_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_18_2","DOI":"10.1109\/CVPR42600.2020.00975"},{"doi-asserted-by":"publisher","key":"e_1_3_1_19_2","DOI":"10.1109\/CVPR.2016.90"},{"unstructured":"Geoffrey Hinton. 2015. Distilling the knowledge in a neural network. arXiv:1503.02531. Retrieved from https:\/\/arxiv.org\/abs\/1503.02531","key":"e_1_3_1_20_2"},{"unstructured":"Hanxun Huang Sarah Monazam Erfani Yige Li Xingjun Ma and James Bailey. 2025. Detecting backdoor samples in contrastive language image pretraining. In The Thirteenth International Conference on Learning Representations. Retrieved from https:\/\/arxiv.org\/abs\/2502.01385","key":"e_1_3_1_21_2"},{"unstructured":"Hanxun Huang Xingjun Ma Sarah Monazam Erfani James Bailey and Yisen Wang. 2021. Unlearnable examples: Making personal data unexploitable. 
In International Conference on Learning Representations. Retrieved from https:\/\/arxiv.org\/abs\/2101.04898","key":"e_1_3_1_22_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_23_2","DOI":"10.1109\/CVPR52729.2023.01122"},{"key":"e_1_3_1_24_2","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1109\/SP.2018.00057","volume-title":"Proceedings of the 2018 IEEE Symposium on Security and Privacy","author":"Jagielski Matthew","year":"2018","unstructured":"Matthew Jagielski, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, and Bo Li. 2018. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In Proceedings of the 2018 IEEE Symposium on Security and Privacy. IEEE, 19\u201335."},{"key":"e_1_3_1_25_2","first-page":"7961","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Jia Jinyuan","year":"2021","unstructured":"Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. Intrinsic certified robustness of bagging against data poisoning attacks. In Proceedings of the AAAI Conference on Artificial Intelligence. 7961\u20137969."},{"key":"e_1_3_1_26_2","first-page":"2043","volume-title":"Proceedings of the 2022 IEEE Symposium on Security and Privacy","author":"Jia Jinyuan","year":"2022","unstructured":"Jinyuan Jia, Yupei Liu, and Neil Zhenqiang Gong. 2022. Badencoder: Backdoor attacks to pre-trained encoders in self-supervised learning. In Proceedings of the 2022 IEEE Symposium on Security and Privacy. IEEE, 2043\u20132059."},{"key":"e_1_3_1_27_2","first-page":"16091","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Khorasgani Salar Hosseini","year":"2022","unstructured":"Salar Hosseini Khorasgani, Yuxuan Chen, and Florian Shkurti. 2022. Slic: Self-supervised learning with iterative clustering for human action videos. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
16091\u201316101."},{"unstructured":"Alex Krizhevsky and Geoffrey Hinton. 2009. Learning multiple layers of features from tiny images. Technical Report. University of Toronto.","key":"e_1_3_1_28_2"},{"unstructured":"Bo Li Yining Wang Aarti Singh and Yevgeniy Vorobeychik. 2016. Data poisoning attacks on factorization-based collaborative filtering. In Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NeurIPS 2016). 1893\u20131901.","key":"e_1_3_1_29_2"},{"key":"e_1_3_1_30_2","first-page":"3629","volume-title":"Proceedings of the 31st USENIX Security Symposium","author":"Liu Hongbin","year":"2022","unstructured":"Hongbin Liu, Jinyuan Jia, and Neil Zhenqiang Gong. 2022. \\(\\lbrace\\) PoisonedEncoder \\(\\rbrace\\) : Poisoning the unlabeled pre-training data in contrastive learning. In Proceedings of the 31st USENIX Security Symposium. 3629\u20133645."},{"doi-asserted-by":"publisher","key":"e_1_3_1_31_2","DOI":"10.1007\/978-3-030-00470-5_13"},{"unstructured":"Aleksander Madry Aleksandar Makelov Ludwig Schmidt Dimitris Tsipras and Adrian Vladu. 2018. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. Retrieved from https:\/\/arxiv.org\/abs\/1706.06083","key":"e_1_3_1_32_2"},{"doi-asserted-by":"publisher","key":"e_1_3_1_33_2","DOI":"10.1145\/3128572.3140451"},{"key":"e_1_3_1_34_2","first-page":"4","volume-title":"Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning","author":"Netzer Yuval","year":"2011","unstructured":"Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Baolin Wu, Andrew Y. Ng, et\u00a0al. 2011. Reading digits in natural images with unsupervised feature learning. In Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning. 
Granada, 4."},{"doi-asserted-by":"publisher","key":"e_1_3_1_35_2","DOI":"10.1007\/978-3-030-13453-2_1"},{"doi-asserted-by":"publisher","key":"e_1_3_1_36_2","DOI":"10.1007\/978-3-030-66415-2_4"},{"doi-asserted-by":"publisher","key":"e_1_3_1_37_2","DOI":"10.1016\/j.cose.2024.104225"},{"key":"e_1_3_1_38_2","first-page":"8748","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et\u00a0al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning. PMLR, 8748\u20138763."},{"doi-asserted-by":"publisher","key":"e_1_3_1_39_2","DOI":"10.1109\/CVPR.2018.00474"},{"doi-asserted-by":"publisher","key":"e_1_3_1_40_2","DOI":"10.1109\/ICCV.2017.74"},{"unstructured":"Ali Shafahi W. Ronny Huang Mahyar Najibi Octavian Suciu Christoph Studer Tudor Dumitras and Tom Goldstein. 2018. Poison frogs! targeted clean-label poisoning attacks on neural networks. Advances in Neural Information Processing Systems 31 (2018) 6106\u20136116.","key":"e_1_3_1_41_2"},{"unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations ICLR 2015 San Diego CA USA May 7-9 2015 Conference Track Proceedings. http:\/\/arxiv.org\/abs\/1409.1556","key":"e_1_3_1_42_2"},{"doi-asserted-by":"crossref","unstructured":"Johannes Stallkamp Marc Schlipsing Jan Salmen and Christian Igel. 2012. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. 
Neural Networks 32 4 (2012) 323\u2013332.","key":"e_1_3_1_43_2","DOI":"10.1016\/j.neunet.2012.02.016"},{"key":"e_1_3_1_44_2","first-page":"1299","volume-title":"Proceedings of the 27th USENIX Security Symposium","author":"Suciu Octavian","year":"2018","unstructured":"Octavian Suciu, Radu Marginean, Yigitcan Kaya, Hal Daume III, and Tudor Dumitras. 2018. When does machine learning \\(\\lbrace\\) FAIL \\(\\rbrace\\) ? generalized transferability for evasion and poisoning attacks. In Proceedings of the 27th USENIX Security Symposium. 1299\u20131316."},{"key":"e_1_3_1_45_2","article-title":"Spectral signatures in backdoor attacks","volume":"31","author":"Tran Brandon","year":"2018","unstructured":"Brandon Tran, Jerry Li, and Aleksander Madry. 2018. Spectral signatures in backdoor attacks. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"doi-asserted-by":"publisher","key":"e_1_3_1_46_2","DOI":"10.1109\/SP.2019.00031"},{"key":"e_1_3_1_47_2","first-page":"39299","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Yang Ziqing","year":"2023","unstructured":"Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, and Yang Zhang. 2023. Data poisoning attacks against multimodal encoders. In Proceedings of the International Conference on Machine Learning. PMLR, 39299\u201339313."},{"key":"e_1_3_1_48_2","first-page":"1667","volume-title":"Proceedings of the 32nd USENIX Security Symposium","author":"Zeng Yi","year":"2023","unstructured":"Yi Zeng, Minzhou Pan, Himanshu Jahagirdar, Ming Jin, Lingjuan Lyu, and Ruoxi Jia. 2023. \\(\\lbrace\\) Meta-Sift \\(\\rbrace\\) : How to sift out a clean subset in the presence of data poisoning?. In Proceedings of the 32nd USENIX Security Symposium. 
1667\u20131684."},{"key":"e_1_3_1_49_2","first-page":"11712","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Zhang Tong","year":"2022","unstructured":"Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, and Wen Zhao. 2022. Frequency-aware contrastive learning for neural machine translation. In Proceedings of the AAAI Conference on Artificial Intelligence. 11712\u201311720."},{"key":"e_1_3_1_50_2","first-page":"7614","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Zhu Chen","year":"2019","unstructured":"Chen Zhu, W. Ronny Huang, Hengduo Li, Gavin Taylor, Christoph Studer, and Tom Goldstein. 2019. Transferable clean-label poisoning attacks on deep neural nets. In Proceedings of the International Conference on Machine Learning. PMLR, 7614\u20137623."}],"container-title":["ACM Transactions on Privacy and Security"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3757916","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T12:34:04Z","timestamp":1757594044000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3757916"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,11]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3757916"],"URL":"https:\/\/doi.org\/10.1145\/3757916","relation":{},"ISSN":["2471-2566","2471-2574"],"issn-type":[{"type":"print","value":"2471-2566"},{"type":"electronic","value":"2471-2574"}],"subject":[],"published":{"date-parts":[[2025,9,11]]},"assertion":[{"value":"2025-02-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2025-07-28","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}