{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T18:09:10Z","timestamp":1757614150040,"version":"3.44.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Structured data-quality issues\u2014such as missing values correlated with demographics, culturally biased labels, or systemic selection biases\u2014routinely degrade the reliability of machine-learning pipelines. Regulators now increasingly demand evidence that high-stakes systems can withstand these realistic, interdependent errors, yet current robustness evaluations typically use random or overly simplistic corruptions, leaving worst-case scenarios unexplored.<\/jats:p>\n          <jats:p>We introduce Savage, a causally inspired framework that (i) formally models realistic data-quality issues through dependency graphs and flexible corruption templates, and (ii) systematically discovers corruption patterns that maximally degrade a target performance metric. Savage employs a bi-level optimization approach to efficiently identify vulnerable data subpopulations and fine-tune corruption severity, treating the full ML pipeline, including preprocessing and potentially non-differentiable models, as a black box. Extensive experiments across multiple datasets and ML tasks (data cleaning, fairness-aware learning, uncertainty quantification) demonstrate that even a small fraction (around 5%) of structured corruptions identified by Savage severely impacts model performance, far exceeding random or manually crafted errors, and invalidating core assumptions of existing techniques. 
Thus, Savage provides a practical tool for rigorous pipeline stress-testing, a benchmark for evaluating robustness methods, and actionable guidance for designing more resilient data workflows.<\/jats:p>","DOI":"10.14778\/3749646.3749721","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"4668-4681","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Stress-Testing ML Pipelines with Adversarial Data Corruption"],"prefix":"10.14778","volume":"18","author":[{"given":"Jiongli","family":"Zhu","sequence":"first","affiliation":[{"name":"University of California, San Diego, USA"}]},{"given":"Geyang","family":"Xu","sequence":"additional","affiliation":[{"name":"University of California, San Diego, USA"}]},{"given":"Felipe","family":"Lorenzi","sequence":"additional","affiliation":[{"name":"University of California, San Diego, USA"}]},{"given":"Boris","family":"Glavic","sequence":"additional","affiliation":[{"name":"University of Illinois, Chicago, USA"}]},{"given":"Babak","family":"Salimi","sequence":"additional","affiliation":[{"name":"University of California, San Diego, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2024. Regulation (EU) 2024\/1689 of the European Parliament and of the Council on Artificial Intelligence. https:\/\/artificialintelligenceact.eu\/article\/15\/. Article 15: Accuracy, Robustness and Cybersecurity."},{"key":"e_1_2_1_2_1","unstructured":"2025. Codebase for SAVAGE. https:\/\/github.com\/lodino\/savage"},{"key":"e_1_2_1_3_1","volume-title":"Rein: A comprehensive benchmark framework for data cleaning methods in ml pipelines. arXiv preprint arXiv:2302.04702","author":"Abdelaal Mohamed","year":"2023","unstructured":"Mohamed Abdelaal, Christian Hammacher, and Harald Schoening. 2023. 
Rein: A comprehensive benchmark framework for data cleaning methods in ml pipelines. arXiv preprint arXiv:2302.04702 (2023)."},{"key":"e_1_2_1_4_1","volume-title":"Algorithms for hyper-parameter optimization. Advances in neural information processing systems 24","author":"Bergstra James","year":"2011","unstructured":"James Bergstra, R\u00e9mi Bardenet, Yoshua Bengio, and Bal\u00e1zs K\u00e9gl. 2011. Algorithms for hyper-parameter optimization. Advances in neural information processing systems 24 (2011)."},{"key":"e_1_2_1_5_1","volume-title":"Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389","author":"Biggio Battista","year":"2012","unstructured":"Battista Biggio, Blaine Nelson, and Pavel Laskov. 2012. Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389 (2012)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3585385"},{"volume-title":"Practical machine learning with H2O: powerful, scalable techniques for deep learning and AI","author":"Cook Darren","key":"e_1_2_1_7_1","unstructured":"Darren Cook. 2016. Practical machine learning with H2O: powerful, scalable techniques for deep learning and AI. O'Reilly Media, Inc."},{"key":"e_1_2_1_8_1","volume-title":"NeurIPS ML Safety Workshop.","author":"Di Jimmy Z","year":"2022","unstructured":"Jimmy Z Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, and Ayush Sekhari. 2022. Hidden poison: Machine unlearning enables camouflaged poisoning attacks. In NeurIPS ML Safety Workshop."},{"key":"e_1_2_1_9_1","volume-title":"Zemel","author":"Dwork Cynthia","year":"2012","unstructured":"Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard S. Zemel. 2012. Fairness through awareness. In ITCS. ACM, 214\u2013226."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1080\/02664763.2021.1929090"},{"key":"e_1_2_1_11_1","unstructured":"Xinyun Chen et al. 2017. Targeted backdoor attacks on deep learning systems using data poisoning. 
arXiv preprint arXiv:1712.05526 (2017)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/3586589.3586850"},{"key":"e_1_2_1_13_1","volume-title":"Potential biases in machine learning algorithms using electronic health record data. JAMA internal medicine 178, 11","author":"Gianfrancesco Milena A","year":"2018","unstructured":"Milena A Gianfrancesco, Suzanne Tamang, Jinoos Yazdany, and Gabriela Schmajuk. 2018. Potential biases in machine learning algorithms using electronic health record data. JAMA internal medicine 178, 11 (2018), 1544\u20131547."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-4018-5"},{"key":"e_1_2_1_15_1","volume-title":"George Davey Smith, et al","author":"Griffith Gareth J","year":"2020","unstructured":"Gareth J Griffith, Tim T Morris, Matthew J Tudball, Annie Herbert, Giulia Mancano, Lindsey Pike, Gemma C Sharp, Jonathan Sterne, Tom M Palmer, George Davey Smith, et al. 2020. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nature communications 11, 1 (2020), 1\u201312."},{"key":"e_1_2_1_16_1","volume-title":"Julia Stoyanovich, and Sebastian Schelter.","author":"Guha Shubha","year":"2022","unstructured":"Shubha Guha, Falaah Arif Khan, Julia Stoyanovich, and Sebastian Schelter. 2022. Automated Data Cleaning Can Hurt Fairness in Machine Learning-based Decision Making. ICDE (2022)."},{"key":"e_1_2_1_17_1","volume-title":"Investigating Labeler Bias in Face Annotation for Machine Learning. arXiv preprint arXiv:2301.09902","author":"Haliburton Luke","year":"2023","unstructured":"Luke Haliburton, Sinksar Ghebremedhin, Robin Welsch, Albrecht Schmidt, and Sven Mayer. 2023. Investigating Labeler Bias in Face Annotation for Machine Learning. arXiv preprint arXiv:2301.09902 (2023)."},{"key":"e_1_2_1_18_1","unstructured":"Moritz Hardt, Eric Price, and Nati Srebro. 2016. Equality of Opportunity in Supervised Learning. In NIPS. 
3315\u20133323."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517841"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3460120.3485368"},{"key":"e_1_2_1_21_1","volume-title":"Data preprocessing techniques for classification without discrimination. Knowledge and information systems 33, 1","author":"Kamiran Faisal","year":"2012","unstructured":"Faisal Kamiran and Toon Calders. 2012. Data preprocessing techniques for classification without discrimination. Knowledge and information systems 33, 1 (2012), 1\u201333."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-011-0463-8"},{"key":"e_1_2_1_23_1","volume-title":"Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation. arXiv preprint arXiv:2409.07510","author":"Khan Falaah Arif","year":"2024","unstructured":"Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, and Julia Stoyanovich. 2024. Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation. arXiv preprint arXiv:2409.07510 (2024)."},{"key":"e_1_2_1_24_1","volume-title":"Boost-clean: Automated error detection and repair for machine learning. arXiv preprint arXiv:1711.01299","author":"Krishnan Sanjay","year":"2017","unstructured":"Sanjay Krishnan, Michael J Franklin, Ken Goldberg, and Eugene Wu. 2017. Boost-clean: Automated error detection and repair for machine learning. arXiv preprint arXiv:1711.01299 (2017)."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the AutoML Workshop at ICML","volume":"2020","author":"LeDell Erin","year":"2020","unstructured":"Erin LeDell and Sebastien Poirier. 2020. H2o automl: Scalable automatic machine learning. In Proceedings of the AutoML Workshop at ICML, Vol. 2020. 
ICML San Diego, CA, USA."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.2017.1307116"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589328"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00009"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2021.05.049"},{"key":"e_1_2_1_30_1","volume-title":"Explaining inference queries with bayesian optimization. arXiv preprint arXiv:2102.05308","author":"Lockhart Brandon","year":"2021","unstructured":"Brandon Lockhart, Jinglin Peng, Weiyuan Wu, Jiannan Wang, and Eugene Wu. 2021. Explaining inference queries with bayesian optimization. arXiv preprint arXiv:2102.05308 (2021)."},{"key":"e_1_2_1_31_1","volume-title":"Indiscriminate data poisoning attacks on neural networks. arXiv preprint arXiv:2204.09092","author":"Lu Yiwei","year":"2022","unstructured":"Yiwei Lu, Gautam Kamath, and Yaoliang Yu. 2022. Indiscriminate data poisoning attacks on neural networks. arXiv preprint arXiv:2204.09092 (2022)."},{"key":"e_1_2_1_32_1","volume-title":"International Conference on Machine Learning. PMLR, 22856\u201322879","author":"Lu Yiwei","year":"2023","unstructured":"Yiwei Lu, Gautam Kamath, and Yaoliang Yu. 2023. Exploring the limits of model-targeted indiscriminate data poisoning attacks. In International Conference on Machine Learning. PMLR, 22856\u201322879."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0180033"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.IS.2025.102549"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3128572.3140451"},{"key":"e_1_2_1_36_1","unstructured":"National Institute of Standards and Technology. 2024. Artificial Intelligence Risk Management Framework (1.0) - Generative AI Profile. Technical Report NIST AI 600-1. U.S. Dept. of Commerce. 
https:\/\/nvlpubs.nist.gov\/nistpubs\/ai\/NIST.AI.600-1.pdf"},{"key":"e_1_2_1_37_1","unstructured":"Judea Pearl. 2009. Causality. Cambridge University Press."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10549-022-06764-4"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517886"},{"key":"e_1_2_1_41_1","first-page":"815","article-title":"Sample selection for fair and robust training","volume":"34","author":"Roh Yuji","year":"2021","unstructured":"Yuji Roh, Kangwook Lee, Steven Whang, and Changho Suh. 2021. Sample selection for fair and robust training. Advances in Neural Information Processing Systems 34 (2021), 815\u2013827.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_42_1","volume-title":"International Conference on Machine Learning. PMLR, 29179\u201329209","author":"Roh Yuji","year":"2023","unstructured":"Yuji Roh, Kangwook Lee, Steven Euijong Whang, and Changho Suh. 2023. Improving fair training under correlation shifts. In International Conference on Machine Learning. PMLR, 29179\u201329209."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2588578"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the survey research methods section of the American Statistical Association","volume":"1","author":"Rubin Donald B","year":"1978","unstructured":"Donald B Rubin. 1978. Multiple imputations in sample surveys-a phenomenological Bayesian approach to nonresponse. In Proceedings of the survey research methods section of the American Statistical Association, Vol. 1. American Statistical Association Alexandria, VA, USA, 20\u201334."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457323"},{"key":"e_1_2_1_46_1","unstructured":"Sebastian Schelter, Tammo Rukat, and Felix Biessmann. 2021. 
JENGA-A Framework to Study the Impact of Data Errors on the Predictions of Machine Learning Models. In EDBT. 529\u2013534."},{"key":"e_1_2_1_47_1","volume-title":"International Conference on Machine Learning. PMLR, 9389\u20139398","author":"Schwarzschild Avi","year":"2021","unstructured":"Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, and Tom Goldstein. 2021. Just how toxic is data poisoning? a unified benchmark for backdoor and data poisoning attacks. In International Conference on Machine Learning. PMLR, 9389\u20139398."},{"key":"e_1_2_1_48_1","volume-title":"Poison frogs! targeted clean-label poisoning attacks on neural networks. Advances in neural information processing systems 31","author":"Shafahi Ali","year":"2018","unstructured":"Ali Shafahi, W Ronny Huang, Mahyar Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras, and Tom Goldstein. 2018. Poison frogs! targeted clean-label poisoning attacks on neural networks. Advances in neural information processing systems 31 (2018)."},{"key":"e_1_2_1_49_1","first-page":"2143","article-title":"Improvements in beam search","volume":"94","author":"Steinbiss Volker","year":"1994","unstructured":"Volker Steinbiss, Bach-Hiep Tran, and Hermann Ney. 1994. Improvements in beam search. In ICSLP, Vol. 94. 2143\u20132146.","journal-title":"ICSLP"},{"key":"e_1_2_1_50_1","unstructured":"Alexander Turner, Dimitris Tsipras, and Aleksander Madry. 2018. Clean-label backdoor attacks. (2018)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3582078"},{"key":"e_1_2_1_52_1","volume-title":"International Conference on Machine Learning. PMLR, 40578\u201340604","author":"Zaffran Margaux","year":"2023","unstructured":"Margaux Zaffran, Aymeric Dieuleveut, Julie Josse, and Yaniv Romano. 2023. Conformal prediction with missing values. In International Conference on Machine Learning. 
PMLR, 40578\u201340604."},{"volume-title":"ICML (3) (JMLR Workshop and Conference Proceedings)","author":"Zemel Richard S.","key":"e_1_2_1_53_1","unstructured":"Richard S. Zemel, Yu Wu, Kevin Swersky, Toniann Pitassi, and Cynthia Dwork. 2013. Learning Fair Representations. In ICML (3) (JMLR Workshop and Conference Proceedings), Vol. 28. JMLR.org, 325\u2013333."},{"key":"e_1_2_1_54_1","first-page":"18","article-title":"Overcoming Data Biases: Towards Enhanced Accuracy and Reliability in Machine Learning","volume":"47","author":"Zhu Jiongli","year":"2024","unstructured":"Jiongli Zhu and Babak Salimi. 2024. Overcoming Data Biases: Towards Enhanced Accuracy and Reliability in Machine Learning. IEEE Data Eng. Bull. 47, 1 (2024), 18\u201335.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_55_1","unstructured":"Jiongli Zhu, Geyang Xu, Felipe Lorenzi, Boris Glavic, and Babak Salimi. 2025. Stress-Testing ML Pipelines with Adversarial Data Corruption (extended version). Technical Report. https:\/\/github.com\/lodino\/savage\/blob\/main\/techreport\/techreport.pdf"}],"container-title":["Proceedings of the VLDB 
Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749721","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T03:24:22Z","timestamp":1757042662000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3749646.3749721"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":55,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749721"],"URL":"https:\/\/doi.org\/10.14778\/3749646.3749721","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}