{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T16:40:01Z","timestamp":1758904801681,"version":"3.44.0"},"reference-count":76,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"DOI":"10.13039\/501100003977","name":"Israel Science Foundation","doi-asserted-by":"crossref","award":["2707\/22 1702\/24"],"award-info":[{"award-number":["2707\/22 1702\/24"]}],"id":[{"id":"10.13039\/501100003977","id-type":"DOI","asserted-by":"crossref"}]},{"name":"The Scharf-Ullman Endowment"},{"name":"The Alon Scholarship"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,9,22]]},"abstract":"<jats:p>The dire need to protect sensitive data has led to various flavors of privacy definitions. Among these, differential privacy (DP) is considered one of the most rigorous and secure notions of privacy, enabling data analysis while preserving the privacy of data contributors. One of the fundamental tasks of data analysis is clustering, which is meant to unravel hidden patterns within complex datasets. However, interpreting clustering results poses significant challenges, and often necessitates an extensive analytical process. Interpreting clustering results under DP is even more challenging, as analysts are provided with noisy responses to queries, and longer, manual exploration sessions require additional noise to meet privacy constraints. 
While increasing attention has been given to clustering explanation frameworks that aim at assisting analysts by automatically uncovering the characteristics of each cluster, such frameworks may also disclose sensitive information within the dataset, leading to a breach in privacy.<\/jats:p>\n          <jats:p>To address these challenges, we present DPClustX, a framework that provides explanations for black-box clustering results while satisfying DP. DPClustX takes as input the sensitive dataset alongside privately computed clustering labels, and outputs a global explanation, emphasizing prominent characteristics of each cluster while guaranteeing DP. We perform an extensive experimental analysis of DPClustX on real data, showing that it provides insightful and accurate explanations even under tight privacy constraints.<\/jats:p>","DOI":"10.1145\/3749161","type":"journal-article","created":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T17:17:03Z","timestamp":1758647823000},"page":"1-27","source":"Crossref","is-referenced-by-count":0,"title":["Differentially Private Explanations for Clusters"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3764-1958","authenticated-orcid":false,"given":"Amir","family":"Gilad","sequence":"first","affiliation":[{"name":"Hebrew University, Tel Aviv, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8566-8821","authenticated-orcid":false,"given":"Tova","family":"Milo","sequence":"additional","affiliation":[{"name":"Tel Aviv University, Tel Aviv, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9165-2025","authenticated-orcid":false,"given":"Kathy","family":"Razmadze","sequence":"additional","affiliation":[{"name":"Tel Aviv University, Tel Aviv, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8968-4848","authenticated-orcid":false,"given":"Ron","family":"Zadicario","sequence":"additional","affiliation":[{"name":"Tel Aviv University, Tel Aviv, 
Israel"}]}],"member":"320","published-online":{"date-parts":[[2025,9,23]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2024. DPClustX Git Repository. https:\/\/github.com\/ronzadi\/DPClustX"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1186\/S40537-017-0110-7"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3226070"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2012.80"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457259"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3086464"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","unstructured":"John Clore Krzysztof Cios Jon DeShazo and Beata Strack. 2014. Diabetes 130-US Hospitals for Years 1999-2008. UCI Machine Learning Repository. DOI: https:\/\/doi.org\/10.24432\/C5230J.","DOI":"10.24432\/C5230J"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639329"},{"volume-title":"Mathematical methods of statistics","author":"Cram\u00e9r Harald","key":"e_1_2_1_9_1","unstructured":"Harald Cram\u00e9r. 1999. Mathematical methods of statistics. Vol. 43. Princeton university press."},{"key":"e_1_2_1_10_1","volume-title":"International Conference on Machine Learning. PMLR, 4794-4815","author":"Dasgupta Sanjoy","year":"2022","unstructured":"Sanjoy Dasgupta, Nave Frost, and Michal Moshkovitz. 2022. Framework for evaluating faithfulness of local explanations. In International Conference on Machine Learning. PMLR, 4794-4815."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/3565838.3565841"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294996.3295115"},{"key":"e_1_2_1_13_1","volume-title":"International Conference on Machine Learning. PMLR, 2597-2606","author":"Dong Jinshuo","year":"2020","unstructured":"Jinshuo Dong, David Durfee, and Ryan Rogers. 2020. Optimal differential privacy composition for exponential mechanisms. 
In International Conference on Machine Learning. PMLR, 2597-2606."},{"key":"e_1_2_1_14_1","unstructured":"D. Durfee and R. Rogers. 2021. One-shot DP top-k mechanisms. DifferentialPrivacy.org. https:\/\/differentialprivacy.org\/one-shot-top-k\/."},{"key":"e_1_2_1_15_1","volume-title":"Advances in Neural Information Processing Systems","volume":"32","author":"Durfee David","year":"2019","unstructured":"David Durfee and Ryan M Rogers. 2019. Practical differentially private top-k selection with pay-what-you-get composition. Advances in Neural Information Processing Systems, Vol. 32 (2019)."},{"volume-title":"International colloquium on automata, languages, and programming","author":"Dwork Cynthia","key":"e_1_2_1_16_1","unstructured":"Cynthia Dwork. 2006. Differential privacy. In International colloquium on automata, languages, and programming. Springer, 1-12."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3294052.3322188"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/11681878_14"},{"key":"e_1_2_1_19_1","first-page":"211","volume-title":"Foundations and Trends\u00ae in Theoretical Computer Science","volume":"9","author":"Dwork Cynthia","year":"2014","unstructured":"Cynthia Dwork, Aaron Roth, et al., 2014. The algorithmic foundations of differential privacy. Foundations and Trends\u00ae in Theoretical Computer Science, Vol. 9, 3-4 (2014), 211-407."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2660267.2660348"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611977073.103"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376651"},{"key":"e_1_2_1_23_1","first-page":"28929","article-title":"Nearly-tight and oblivious algorithms for explainable clustering","volume":"34","author":"Gamlath Buddhima","year":"2021","unstructured":"Buddhima Gamlath, Xinrui Jia, Adam Polak, and Ola Svensson. 2021. 
Nearly-tight and oblivious algorithms for explainable clustering. Advances in Neural Information Processing Systems, Vol. 34 (2021), 28929-28939.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3300092"},{"key":"e_1_2_1_25_1","first-page":"4040","article-title":"Differentially private clustering: Tight approximation ratios","volume":"33","author":"Ghazi Badih","year":"2020","unstructured":"Badih Ghazi, Ravi Kumar, and Pasin Manurangsi. 2020. Differentially private clustering: Tight approximation ratios. Advances in Neural Information Processing Systems, Vol. 33 (2020), 4040-4054.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1536414.1536464"},{"key":"e_1_2_1_27_1","volume-title":"Differentially Private Explanations for Clusters. arXiv preprint arXiv:2506.05900","author":"Gilad Amir","year":"2025","unstructured":"Amir Gilad, Tova Milo, Kathy Razmadze, and Ron Zadicario. 2025. Differentially Private Explanations for Clusters. arXiv preprint arXiv:2506.05900 (2025)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973075.90"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5827"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920970"},{"volume-title":"Knowledge Discovery and Measures of Interest","author":"Hilderman Robert J","key":"e_1_2_1_31_1","unstructured":"Robert J Hilderman and Howard J Hamilton. 2013. Knowledge Discovery and Measures of Interest. Vol. 638. Springer Science & Business Media."},{"key":"e_1_2_1_32_1","volume-title":"P\u00f3l Mac Aonghusa, and Killian Levacher","author":"Holohan Naoise","year":"2019","unstructured":"Naoise Holohan, Stefano Braghin, P\u00f3l Mac Aonghusa, and Killian Levacher. 2019. Diffprivlib: the IBM differential privacy library. 
arXiv preprint arXiv:1907.02444 (2019)."},{"key":"e_1_2_1_33_1","volume-title":"Interpretable Clustering: A Survey. arXiv preprint arXiv:2409.00743","author":"Hu Lianyu","year":"2024","unstructured":"Lianyu Hu, Mudi Jiang, Junjie Dong, Xinying Liu, and Zengyou He. 2024. Interpretable Clustering: A Survey. arXiv preprint arXiv:2409.00743 (2024)."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/EuroSP48549.2020.00041"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3187009.3177733"},{"volume-title":"Proceedings of the 36th International Conference on Machine Learning. PMLR.","author":"Jordon James","key":"e_1_2_1_36_1","unstructured":"James Jordon, Jinsung Yoon, and Mihaela van der Schaar. 2019. Differentially private model personalization. In Proceedings of the 36th International Conference on Machine Learning. PMLR."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342274"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939874"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.14778\/3494124.3494151"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463721"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.61115"},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Yuyu Luo Xuedi Qin Nan Tang and Guoliang Li. 2018. DeepEye: Towards Automatic Data Visualization. ICDE.","DOI":"10.1109\/ICDE.2018.00019"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611538"},{"key":"e_1_2_1_44_1","volume-title":"International Conference on Machine Learning. PMLR, 7358-7367","author":"Makarychev Konstantin","year":"2021","unstructured":"Konstantin Makarychev and Liren Shan. 2021. Near-optimal algorithms for explainable k-medians and k-means. In International Conference on Machine Learning. 
PMLR, 7358-7367."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3519935.3520056"},{"key":"e_1_2_1_46_1","volume-title":"Winning the NIST Contest: A scalable and general approach to differentially private synthetic data. arXiv preprint arXiv:2108.04978","author":"McKenna Ryan","year":"2021","unstructured":"Ryan McKenna, Gerome Miklau, and Daniel Sheldon. 2021. Winning the NIST Contest: A scalable and general approach to differentially private synthetic data. arXiv preprint arXiv:2108.04978 (2021)."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2007.66"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559850"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.24432\/C5VP42"},{"key":"e_1_2_1_50_1","volume-title":"International Conference on Machine Learning (ICML 2021), Workshop on Socially Responsible Machine Learning.","author":"Mochaourab Rami","year":"2021","unstructured":"Rami Mochaourab, Sugandh Sinha, Stanley Greenstein, and Panagiotis Papapetrou. 2021. Robust counterfactual explanations for privacy-preserving SVM. In International Conference on Machine Learning (ICML 2021), Workshop on Socially Responsible Machine Learning."},{"key":"e_1_2_1_51_1","volume-title":"International conference on machine learning. PMLR, 7055-7065","author":"Moshkovitz Michal","year":"2020","unstructured":"Michal Moshkovitz, Sanjoy Dasgupta, Cyrus Rashtchian, and Nave Frost. 2020. Explainable k-means and k-medians clustering. In International conference on machine learning. 
PMLR, 7055-7065."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3351095.3372850"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2018.06.003"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i10.17099"},{"key":"e_1_2_1_55_1","volume-title":"Zhao Ren, Thanh Toan Nguyen, Phi Le Nguyen, Hongzhi Yin, and Quoc Viet Hung Nguyen.","author":"Nguyen Thanh Tam","year":"2024","unstructured":"Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Thanh Toan Nguyen, Phi Le Nguyen, Hongzhi Yin, and Quoc Viet Hung Nguyen. 2024. A survey of privacy-preserving model explanations: Privacy risks, attacks, and countermeasures. arXiv preprint arXiv:2404.00673 (2024)."},{"key":"e_1_2_1_56_1","unstructured":"Stack Overflow. 2018. Stack Overflow Annual Developer Survey. https:\/\/survey.stackoverflow.co."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.51594\/csitrj.v5i3.911"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3531146.3533235"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556576"},{"key":"e_1_2_1_60_1","first-page":"12187","article-title":"Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses","volume":"33","author":"Rawal Kaivalya","year":"2020","unstructured":"Kaivalya Rawal and Himabindu Lakkaraju. 2020. Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses. Advances in Neural Information Processing Systems, Vol. 33 (2020), 12187-12198.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_61_1","doi-asserted-by":"crossref","unstructured":"Sunita Sarawagi Rakesh Agrawal and Nimrod Megiddo. 1998. Discovery-driven exploration of OLAP data cubes. 
In EDBT.","DOI":"10.1007\/BFb0100984"},{"key":"e_1_2_1_62_1","volume-title":"Advances in Neural Information Processing Systems","volume":"31","author":"Stemmer Uri","year":"2018","unstructured":"Uri Stemmer and Haim Kaplan. 2018. Differentially private k-means with constant multiplicative error. Advances in Neural Information Processing Systems, Vol. 31 (2018)."},{"key":"e_1_2_1_63_1","volume-title":"Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records. BioMed research international","author":"Strack Beata","year":"2014","unstructured":"Beata Strack, Jonathan P DeShazo, Chris Gennings, Juan L Olmo, Sebastian Ventura, Krzysztof J Cios, and John N Clore. 2014. Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records. BioMed research international, Vol. 2014, 1 (2014), 781670."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/2857705.2857708"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133201"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035922"},{"key":"e_1_2_1_67_1","volume-title":"Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12. CoRR","author":"Tang Jun","year":"2017","unstructured":"Jun Tang, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, and XiaoFeng Wang. 2017b. Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12. CoRR, Vol. abs\/1709.02753 (2017). 
arXiv:1709.02753 http:\/\/arxiv.org\/abs\/1709.02753"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.14778\/3561261.3561271"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3583780.3614734"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767846"},{"key":"e_1_2_1_71_1","volume-title":"Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate research","author":"Willmott Cort J","year":"2005","unstructured":"Cort J Willmott and Kenji Matsuura. 2005. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate research, Vol. 30, 1 (2005), 79-82."},{"key":"e_1_2_1_72_1","volume-title":"Voyager: Exploratory analysis via faceted browsing of visualization recommendations. TVCG","author":"Wongsuphasawat Kanit","year":"2016","unstructured":"Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, Jock Mackinlay, Bill Howe, and Jeffrey Heer. 2016. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. TVCG (2016)."},{"key":"e_1_2_1_73_1","volume-title":"DPCube: Differentially private histogram release through multidimensional partitioning. arXiv preprint arXiv:1202.5358","author":"Xiao Yonghui","year":"2012","unstructured":"Yonghui Xiao, Li Xiong, Liyue Fan, and Slawomir Goryczka. 2012. DPCube: Differentially private histogram release through multidimensional partitioning. 
arXiv preprint arXiv:1202.5358 (2012)."},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-013-0309-y"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.14778\/3538598.3538603"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3134428"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3749161","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T16:21:59Z","timestamp":1758903719000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3749161"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,22]]},"references-count":76,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,9,22]]}},"alternative-id":["10.1145\/3749161"],"URL":"https:\/\/doi.org\/10.1145\/3749161","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2025,9,22]]}}}