{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T11:23:31Z","timestamp":1780053811075,"version":"3.54.0"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2023,5,26]],"date-time":"2023-05-26T00:00:00Z","timestamp":1685059200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["NRF-2022R1A2C2004382"],"award-info":[{"award-number":["NRF-2022R1A2C2004382"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Google Research Award"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,5,26]]},"abstract":"<jats:p>As machine learning becomes prevalent, mitigating any unfairness present in the training data becomes critical. Among the various notions of fairness, this paper focuses on the well-known individual fairness, which states that similar individuals should be treated similarly. While individual fairness can be improved when training a model (in-processing), we contend that fixing the data before model training (pre-processing) is a more fundamental solution. In particular, we show that label flipping is an effective pre-processing technique for improving individual fairness.<\/jats:p>\n          <jats:p>Our system iFlipper solves the optimization problem of minimally flipping labels given a limit to the individual fairness violations, where a violation occurs when two similar examples in the training data have different labels. We first prove that the problem is NP-hard. We then propose an approximate linear programming algorithm and provide theoretical guarantees on how close its result is to the optimal solution in terms of the number of label flips. We also propose techniques for making the linear programming solution more optimal without exceeding the violations limit. Experiments on real datasets show that iFlipper significantly outperforms other pre-processing baselines in terms of individual fairness and accuracy on unseen test sets. In addition, iFlipper can be combined with in-processing techniques for even better results.<\/jats:p>","DOI":"10.1145\/3588688","type":"journal-article","created":{"date-parts":[[2023,5,30]],"date-time":"2023-05-30T17:42:05Z","timestamp":1685468525000},"page":"1-26","source":"Crossref","is-referenced-by-count":12,"title":["iFlipper: Label Flipping for Individual Fairness"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9862-8773","authenticated-orcid":false,"given":"Hantian","family":"Zhang","sequence":"first","affiliation":[{"name":"Georgia Institute of Technology, Atlanta, GA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9307-7757","authenticated-orcid":false,"given":"Ki Hyun","family":"Tae","sequence":"additional","affiliation":[{"name":"KAIST, Daejeon, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-7583-0586","authenticated-orcid":false,"given":"Jaeyoung","family":"Park","sequence":"additional","affiliation":[{"name":"KAIST, Daejeon, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-3202-3767","authenticated-orcid":false,"given":"Xu","family":"Chu","sequence":"additional","affiliation":[{"name":"Georgia Institute of Technology, Atlanta, GA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6419-931X","authenticated-orcid":false,"given":"Steven Euijong","family":"Whang","sequence":"additional","affiliation":[{"name":"KAIST, Daejeon, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,5,30]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"A reductions approach to fair classification. arXiv preprint arXiv:1803.02453","author":"Agarwal Alekh","year":"2018","unstructured":"Alekh Agarwal, Alina Beygelzimer, Miroslav Dud'ik, John Langford, and Hanna Wallach. 2018. A reductions approach to fair classification. arXiv preprint arXiv:1803.02453 (2018)."},{"key":"e_1_2_2_2_1","unstructured":"Alexandr Andoni Piotr Indyk Thijs Laarhoven Ilya Razenshteyn and Ludwig Schmidt. 2015. Practical and Optimal LSH for Angular Distance. In NeurIPS. 1225--1233."},{"key":"e_1_2_2_3_1","unstructured":"J. Angwin J. Larson S. Mattu and L. Kirchner. 2016. Machine bias: There's software used across the country to predict future criminals. And its biased against blacks."},{"key":"e_1_2_2_4_1","first-page":"671","article-title":"Big data's disparate impact","volume":"104","author":"Barocas Solon","year":"2016","unstructured":"Solon Barocas and Andrew D Selbst. 2016. Big data's disparate impact. Calif. L. Rev., Vol. 104 (2016), 671.","journal-title":"Calif. L. Rev."},{"key":"e_1_2_2_5_1","volume-title":"John T. Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang.","author":"Bellamy Rachel K. E.","year":"2018","unstructured":"Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John T. Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. 2018. AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias. CoRR, Vol. abs\/1810.01943 (2018). arxiv: 1810.01943"},{"key":"e_1_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Reuben Binns. 2020. On the apparent conflict between individual and group fairness. In FAT*. 514--524.","DOI":"10.1145\/3351095.3372864"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0166-218X(03)00358-5"},{"key":"e_1_2_2_8_1","unstructured":"Joy Buolamwini and Timnit Gebru. 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification. In FAT. PMLR 77--91."},{"key":"e_1_2_2_9_1","volume-title":"Karthikeyan Natesan Ramamurthy, and Kush R Varshney","author":"Calmon Flavio","year":"2017","unstructured":"Flavio Calmon, Dennis Wei, Bhanukiran Vinzamuri, Karthikeyan Natesan Ramamurthy, and Kush R Varshney. 2017. Optimized pre-processing for discrimination prevention. In NeurIPS. 3992--4001."},{"key":"e_1_2_2_10_1","doi-asserted-by":"crossref","unstructured":"Allison JB Chaney Brandon M Stewart and Barbara E Engelhardt. 2018. How algorithmic confounding in recommendation systems increases homogeneity and decreases utility. In RecSys. 224--232.","DOI":"10.1145\/3240323.3240370"},{"key":"e_1_2_2_11_1","volume-title":"Entity Resolution, and Duplicate Detection","author":"Christen Peter","unstructured":"Peter Christen. 2012. Data Matching - Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer."},{"key":"e_1_2_2_12_1","first-page":"157","article-title":"V12. 1: User's Manual for CPLEX","volume":"46","author":"IBM","year":"2009","unstructured":"Cplex, IBM ILOG. 2009. V12. 1: User's Manual for CPLEX. International Business Machines Corporation, Vol. 46, 53 (2009), 157.","journal-title":"International Business Machines Corporation"},{"key":"e_1_2_2_13_1","volume-title":"Amazon scraps secret AI recruiting tool that showed bias against women","author":"Dastin Jeffrey","year":"2018","unstructured":"Jeffrey Dastin. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. Reuters (2018)."},{"key":"e_1_2_2_14_1","unstructured":"Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository. http:\/\/archive.ics.uci.edu\/ml"},{"key":"e_1_2_2_15_1","doi-asserted-by":"crossref","unstructured":"Cynthia Dwork Moritz Hardt Toniann Pitassi Omer Reingold and Richard Zemel. 2012. Fairness through awareness. In ITCS. 214--226.","DOI":"10.1145\/2090236.2090255"},{"key":"e_1_2_2_16_1","doi-asserted-by":"crossref","unstructured":"Michael Feldman Sorelle A. Friedler John Moeller Carlos Scheidegger and Suresh Venkatasubramanian. 2015. Certifying and Removing Disparate Impact. In KDD. 259--268.","DOI":"10.1145\/2783258.2783311"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3433949"},{"key":"e_1_2_2_18_1","unstructured":"Gurobi Optimization LLC. 2022. Gurobi Optimizer Reference Manual. https:\/\/www.gurobi.com"},{"key":"e_1_2_2_19_1","unstructured":"Moritz Hardt Eric Price and Nati Srebro. 2016. Equality of Opportunity in Supervised Learning. In NeurIPS. 3315--3323."},{"key":"e_1_2_2_20_1","volume-title":"Metric Learning for Individual Fairness. arxiv","author":"Ilvento Christina","year":"1906","unstructured":"Christina Ilvento. 2020. Metric Learning for Individual Fairness. arxiv: 1906.00250 [cs.LG]"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-011-0463-8"},{"key":"e_1_2_2_22_1","unstructured":"Ron Kohavi. 1996. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. In KDD. 202--207."},{"key":"e_1_2_2_23_1","unstructured":"Matt Kusner Joshua Loftus Chris Russell and Ricardo Silva. 2017. Counterfactual fairness. In NeurIPS. 4069--4079."},{"key":"e_1_2_2_24_1","doi-asserted-by":"crossref","unstructured":"Preethi Lahoti Krishna P. Gummadi and Gerhard Weikum. 2019a. iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making. In ICDE. 1334--1345.","DOI":"10.1109\/ICDE.2019.00121"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/3372716.3372723"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2014.03.001"},{"key":"e_1_2_2_27_1","unstructured":"MOSEK ApS. 2019. MOSEK Optimizer API for Python. https:\/\/docs.mosek.com\/latest\/pythonapi\/index.html"},{"key":"e_1_2_2_28_1","first-page":"7097","article-title":"Two Simple Ways to Learn Individual Fairness Metrics from Data","volume":"119","author":"Mukherjee Debarghya","year":"2020","unstructured":"Debarghya Mukherjee, Mikhail Yurochkin, Moulinath Banerjee, and Yuekai Sun. 2020. Two Simple Ways to Learn Individual Fairness Metrics from Data. In ICML, Vol. 119. 7097--7107.","journal-title":"ICML"},{"key":"e_1_2_2_29_1","volume-title":"FAccT","volume":"1170","author":"Narayanan Arvind","year":"2018","unstructured":"Arvind Narayanan. 2018. Translation tutorial: 21 fairness definitions and their politics. In FAccT, Vol. 1170."},{"key":"e_1_2_2_30_1","volume-title":"PyTorch: An Imperative Style","author":"Paszke Adam","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In NeurIPS. 8024--8035."},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_2_32_1","unstructured":"Felix Petersen Debarghya Mukherjee Yuekai Sun and Mikhail Yurochkin. 2021. Post-processing for Individual Fairness. In ICLR."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3157794.3157797"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056442"},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"Babak Salimi Luke Rodriguez Bill Howe and Dan Suciu. 2019. Interventional fairness: Causal database repair for algorithmic fairness. In SIGMOD. 793--810.","DOI":"10.1145\/3299869.3319901"},{"key":"e_1_2_2_36_1","first-page":"5907","article-title":"SELFIE: Refurbishing Unclean Samples for Robust Deep Learning","volume":"97","author":"Song Hwanjun","year":"2019","unstructured":"Hwanjun Song, Minseok Kim, and Jae-Gil Lee. 2019. SELFIE: Refurbishing Unclean Samples for Robust Deep Learning. In ICML, Vol. 97. 5907--5915.","journal-title":"ICML"},{"key":"e_1_2_2_37_1","volume-title":"Learning from Noisy Labels with Deep Neural Networks: A Survey. CoRR","author":"Song Hwanjun","year":"2020","unstructured":"Hwanjun Song, Minseok Kim, Dongmin Park, and Jae-Gil Lee. 2020. Learning from Noisy Labels with Deep Neural Networks: A Survey. CoRR, Vol. abs\/2007.08199 (2020)."},{"key":"e_1_2_2_38_1","unstructured":"Ki Hyun Tae. 2022. iFlipper Github repository. https:\/\/github.com\/khtae8250\/iFlipper."},{"key":"e_1_2_2_39_1","unstructured":"Alexander Vargo Fan Zhang Mikhail Yurochkin and Yuekai Sun. 2021. Individually Fair Gradient Boosting. In ICLR."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3194770.3194776"},{"key":"e_1_2_2_41_1","volume-title":"An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision. arXiv preprint arXiv:1910.10255","author":"Wang Hanchen","year":"2019","unstructured":"Hanchen Wang, Nina Grgic-Hlaca, Preethi Lahoti, Krishna P Gummadi, and Adrian Weller. 2019. An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision. arXiv preprint arXiv:1910.10255 (2019)."},{"key":"e_1_2_2_42_1","volume-title":"Yuji Roh, and Geon Heo.","author":"Whang Steven Euijong","year":"2021","unstructured":"Steven Euijong Whang, Ki Hyun Tae, Yuji Roh, and Geon Heo. 2021. Responsible AI Challenges in End-to-end Machine Learning. IEEE Data Eng. Bull. (2021)."},{"key":"e_1_2_2_43_1","volume-title":"LSAC national longitudinal bar passage study","author":"Wightman Linda F","unstructured":"Linda F Wightman and Henry Ramsey. 1998. LSAC national longitudinal bar passage study. Law School Admission Council."},{"key":"e_1_2_2_44_1","unstructured":"Mikhail Yurochkin Amanda Bower and Yuekai Sun. 2020. Training individually fair ML models with sensitive subspace robustness. In ICLR."},{"key":"e_1_2_2_45_1","unstructured":"Mikhail Yurochkin and Yuekai Sun. 2021. SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness. In ICLR."},{"key":"e_1_2_2_46_1","volume-title":"Manuel Gomez Rodriguez, and Krishna P Gummadi","author":"Zafar Muhammad Bilal","year":"2017","unstructured":"Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi. 2017a. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In WWW. 1171--1180."},{"key":"e_1_2_2_47_1","volume-title":"Fairness Constraints: Mechanisms for Fair Classification. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research","volume":"970","author":"Zafar Muhammad Bilal","unstructured":"Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rogriguez, and Krishna P. Gummadi. 2017b. Fairness Constraints: Mechanisms for Fair Classification. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 54), Aarti Singh and Jerry Zhu (Eds.). PMLR, 962--970. https:\/\/proceedings.mlr.press\/v54\/zafar17a.html"},{"key":"e_1_2_2_48_1","first-page":"325","article-title":"Learning Fair Representations","volume":"28","author":"Zemel Rich","year":"2013","unstructured":"Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. 2013. Learning Fair Representations. In ICML, Vol. 28. 325--333.","journal-title":"ICML"},{"key":"e_1_2_2_49_1","doi-asserted-by":"crossref","unstructured":"Hantian Zhang Xu Chu Abolfazl Asudeh and Shamkant B Navathe. 2021. OmniFair: A Declarative System for Model-Agnostic Group Fairness in Machine Learning. In SIGMOD. 2076--2088.","DOI":"10.1145\/3448016.3452787"},{"key":"e_1_2_2_50_1","volume-title":"Jaeyoung Park, Xu Chu, and Steven Euijong Whang.","author":"Zhang Hantian","year":"2022","unstructured":"Hantian Zhang, Ki Hyun Tae, Jaeyoung Park, Xu Chu, and Steven Euijong Whang. 2022. iFlipper: Label Flipping for Individual Fairness. https:\/\/github.com\/khtae8250\/iFlipper\/blob\/main\/techreport.pdf."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3588688","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3588688","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:47:13Z","timestamp":1750178833000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3588688"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,26]]},"references-count":50,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,5,26]]}},"alternative-id":["10.1145\/3588688"],"URL":"https:\/\/doi.org\/10.1145\/3588688","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,26]]}}}