{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T17:55:56Z","timestamp":1773510956998,"version":"3.50.1"},"reference-count":30,"publisher":"Privacy Enhancing Technologies Symposium Advisory Board","issue":"1","license":[{"start":{"date-parts":[[2021,11,20]],"date-time":"2021-11-20T00:00:00Z","timestamp":1637366400000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Many data analysis operations can be expressed as a GROUP BY query on an unbounded set of partitions, followed by a per-partition aggregation. To make such a query differentially private, adding noise to each aggregation is not enough: we also need to make sure that the set of partitions released is also differentially private.<\/jats:p>\n               <jats:p>This problem is not new, and it was recently formally introduced as <jats:italic>differentially private set union<\/jats:italic> [14]. In this work, we continue this area of study, and focus on the common setting where each user is associated with a single partition. In this setting, we propose a simple, <jats:italic>optimal<\/jats:italic> differentially private mechanism that maximizes the number of released partitions. We discuss implementation considerations, as well as the possible extension of this approach to the setting where each user contributes to a fixed, small number of partitions.<\/jats:p>","DOI":"10.2478\/popets-2022-0017","type":"journal-article","created":{"date-parts":[[2021,11,21]],"date-time":"2021-11-21T02:42:17Z","timestamp":1637462537000},"page":"339-352","source":"Crossref","is-referenced-by-count":6,"title":["Differentially private partition selection"],"prefix":"10.56553","volume":"2022","author":[{"given":"Damien","family":"Desfontaines","sequence":"first","affiliation":[{"name":"Tumult Labs"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James","family":"Voss","sequence":"additional","affiliation":[{"name":"Google"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bryant","family":"Gipson","sequence":"additional","affiliation":[{"name":"Google"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chinmoy","family":"Mandayam","sequence":"additional","affiliation":[{"name":"Google"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"35752","published-online":{"date-parts":[[2021,11,20]]},"reference":[{"key":"2022062314355987849_j_popets-2022-0017_ref_001","doi-asserted-by":"crossref","unstructured":"[1] G. Acs, C. Castelluccia, and R. Chen. Differentially private histogram publishing through lossy compression. In 2012 IEEE 12th International Conference on Data Mining, pages 1\u201310. IEEE, 2012.10.1109\/ICDM.2012.80","DOI":"10.1109\/ICDM.2012.80"},{"key":"2022062314355987849_j_popets-2022-0017_ref_002","unstructured":"[2] B. Balle and Y.-X. Wang. Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In International Conference on Machine Learning, pages 394\u2013403. PMLR, 2018."},{"key":"2022062314355987849_j_popets-2022-0017_ref_003","doi-asserted-by":"crossref","unstructured":"[3] J. Bater, X. He, W. Ehrich, A. Machanavajjhala, and J. Rogers. Shrinkwrap: Differentially-private query processing in private data federations. arXiv preprint arXiv:1810.01816, 2018.","DOI":"10.14778\/3291264.3291274"},{"key":"2022062314355987849_j_popets-2022-0017_ref_004","doi-asserted-by":"crossref","unstructured":"[4] G. Cormode, C. Procopiuc, D. Srivastava, and T. T. Tran. Differentially private summaries for sparse data. In Proceedings of the 15th International Conference on Database Theory, pages 299\u2013311, 2012.10.1145\/2274576.2274608","DOI":"10.1145\/2274576.2274608"},{"key":"2022062314355987849_j_popets-2022-0017_ref_005","doi-asserted-by":"crossref","unstructured":"[5] G. Cormode, M. Procopiuc, D. Srivastava, and T. T. Tran. Differentially private publication of sparse data. arXiv preprint arXiv:1103.0825, 2011.10.1145\/2274576.2274608","DOI":"10.1145\/2274576.2274608"},{"key":"2022062314355987849_j_popets-2022-0017_ref_006","doi-asserted-by":"crossref","unstructured":"[6] B. Ding, M. Winslett, J. Han, and Z. Li. Differentially private data cubes: optimizing noise sources and consistency. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 217\u2013228, 2011.10.1145\/1989323.1989347","DOI":"10.1145\/1989323.1989347"},{"key":"2022062314355987849_j_popets-2022-0017_ref_007","doi-asserted-by":"crossref","unstructured":"[7] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265\u2013284. Springer, 2006.10.1007\/11681878_14","DOI":"10.1007\/11681878_14"},{"key":"2022062314355987849_j_popets-2022-0017_ref_008","doi-asserted-by":"crossref","unstructured":"[8] C. Dwork, M. Naor, O. Reingold, G. N. Rothblum, and S. Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 381\u2013390, 2009.10.1145\/1536414.1536467","DOI":"10.1145\/1536414.1536467"},{"key":"2022062314355987849_j_popets-2022-0017_ref_009","doi-asserted-by":"crossref","unstructured":"[9] C. Dwork and A. Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211\u2013407, 2014.10.1561\/0400000042","DOI":"10.1561\/0400000042"},{"key":"2022062314355987849_j_popets-2022-0017_ref_010","doi-asserted-by":"crossref","unstructured":"[10] P. Flajolet, \u00c9. Fusy, O. Gandouet, and F. Meunier. Hyper-loglog: the analysis of a near-optimal cardinality estimation algorithm. In Discrete Mathematics and Theoretical Computer Science, pages 137\u2013156. Discrete Mathematics and Theoretical Computer Science, 2007.10.46298\/dmtcs.3545","DOI":"10.46298\/dmtcs.3545"},{"key":"2022062314355987849_j_popets-2022-0017_ref_011","unstructured":"[11] Q. Geng, W. Ding, R. Guo, and S. Kumar. Tight analysis of privacy and utility tradeoff in approximate differential privacy. In International Conference on Artificial Intelligence and Statistics, pages 89\u201399, 2020."},{"key":"2022062314355987849_j_popets-2022-0017_ref_012","doi-asserted-by":"crossref","unstructured":"[12] Q. Geng and P. Viswanath. Optimal noise adding mechanisms for approximate differential privacy. IEEE Transactions on Information Theory, 62(2):952\u2013969, Feb 2016.10.1109\/TIT.2015.2504972","DOI":"10.1109\/TIT.2015.2504972"},{"key":"2022062314355987849_j_popets-2022-0017_ref_013","doi-asserted-by":"crossref","unstructured":"[13] A. Ghosh, T. Roughgarden, and M. Sundararajan. Universally utility-maximizing privacy mechanisms. SIAM Journal on Computing, 41(6):1673\u20131693, 2012.10.1137\/09076828X","DOI":"10.1137\/09076828X"},{"key":"2022062314355987849_j_popets-2022-0017_ref_014","doi-asserted-by":"crossref","unstructured":"[14] S. Gopi, P. Gulhane, J. Kulkarni, J. H. Shen, M. Shokouhi, and S. Yekhanin. Differentially private set union. arXiv preprint arXiv:2002.09745, 2020.","DOI":"10.29012\/jpc.780"},{"key":"2022062314355987849_j_popets-2022-0017_ref_015","unstructured":"[15] M. Hay, V. Rastogi, G. Miklau, and D. Suciu. Boosting the accuracy of differentially-private histograms through consistency. arXiv preprint arXiv:0904.0942, 2009."},{"key":"2022062314355987849_j_popets-2022-0017_ref_016","doi-asserted-by":"crossref","unstructured":"[16] N. Holohan, D. J. Leith, and O. Mason. Differential privacy in metric spaces: Numerical, categorical and functional data under the one roof. Information Sciences, 305:256\u2013268, 2015.","DOI":"10.1016\/j.ins.2015.01.021"},{"key":"2022062314355987849_j_popets-2022-0017_ref_017","doi-asserted-by":"crossref","unstructured":"[17] S. Inusah and T. J. Kozubowski. A discrete analogue of the laplace distribution. Journal of statistical planning and inference, 136(3):1090\u20131102, 2006.10.1016\/j.jspi.2004.08.014","DOI":"10.1016\/j.jspi.2004.08.014"},{"key":"2022062314355987849_j_popets-2022-0017_ref_018","doi-asserted-by":"crossref","unstructured":"[18] N. Johnson, J. P. Near, and D. Song. Towards practical differential privacy for sql queries. Proceedings of the VLDB Endowment, 11(5):526\u2013539, 2018.10.1145\/3187009.3177733","DOI":"10.1145\/3187009.3177733"},{"key":"2022062314355987849_j_popets-2022-0017_ref_019","unstructured":"[19] H. Kaplan, Y. Mansour, and U. Stemmer. The sparse vector technique, revisited. arXiv preprint arXiv:2010.00917, 2020."},{"key":"2022062314355987849_j_popets-2022-0017_ref_020","doi-asserted-by":"crossref","unstructured":"[20] D. Kifer and A. Machanavajjhala. No free lunch in data privacy. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 193\u2013204, 2011.10.1145\/1989323.1989345","DOI":"10.1145\/1989323.1989345"},{"key":"2022062314355987849_j_popets-2022-0017_ref_021","doi-asserted-by":"crossref","unstructured":"[21] A. Korolova, K. Kenthapadi, N. Mishra, and A. Ntoulas. Releasing search queries and clicks privately. In Proceedings of the 18th international conference on World wide web, pages 171\u2013180, 2009.10.1145\/1526709.1526733","DOI":"10.1145\/1526709.1526733"},{"key":"2022062314355987849_j_popets-2022-0017_ref_022","doi-asserted-by":"crossref","unstructured":"[22] I. Kotsogiannis, Y. Tao, X. He, M. Fanaeepour, A. Machanavajjhala, M. Hay, and G. Miklau. Privatesql: a differentially private sql query engine. Proceedings of the VLDB Endowment, 12(11):1371\u20131384, 2019.","DOI":"10.14778\/3342263.3342274"},{"key":"2022062314355987849_j_popets-2022-0017_ref_023","doi-asserted-by":"crossref","unstructured":"[23] J. Lee and C. W. Clifton. Top-k frequent itemsets via differentially private fp-trees. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 931\u2013940, 2014.10.1145\/2623330.2623723","DOI":"10.1145\/2623330.2623723"},{"key":"2022062314355987849_j_popets-2022-0017_ref_024","unstructured":"[24] H. Li, L. Xiong, and X. Jiang. Differentially private synthesization of multi-dimensional data using copula functions. In Advances in database technology: proceedings. International conference on extending database technology, volume 2014, page 475. NIH Public Access, 2014."},{"key":"2022062314355987849_j_popets-2022-0017_ref_025","unstructured":"[25] M. Lyu, D. Su, and N. Li. Understanding the sparse vector technique for differential privacy. arXiv preprint arXiv:1603.01699, 2016."},{"key":"2022062314355987849_j_popets-2022-0017_ref_026","doi-asserted-by":"crossref","unstructured":"[26] R. J. Wilson, C. Y. Zhang, W. Lam, D. Desfontaines, D. Simmons-Marengo, and B. Gipson. Differentially private SQL with bounded user contribution. arXiv preprint arXiv:1909.01917, 2019.","DOI":"10.2478\/popets-2020-0025"},{"key":"2022062314355987849_j_popets-2022-0017_ref_027","doi-asserted-by":"crossref","unstructured":"[27] X. Xiao, G. Wang, and J. Gehrke. Differential privacy via wavelet transforms. IEEE Transactions on knowledge and data engineering, 23(8):1200\u20131214, 2010.10.1109\/TKDE.2010.247","DOI":"10.1109\/TKDE.2010.247"},{"key":"2022062314355987849_j_popets-2022-0017_ref_028","unstructured":"[28] Y. Xiao, L. Xiong, L. Fan, and S. Goryczka. Dpcube: differentially private histogram release through multidimensional partitioning. arXiv preprint arXiv:1202.5358, 2012."},{"key":"2022062314355987849_j_popets-2022-0017_ref_029","doi-asserted-by":"crossref","unstructured":"[29] J. Xu, Z. Zhang, X. Xiao, Y. Yang, G. Yu, and M. Winslett. Differentially private histogram publication. The VLDB Journal, 22(6):797\u2013822, 2013.10.1007\/s00778-013-0309-y","DOI":"10.1007\/s00778-013-0309-y"},{"key":"2022062314355987849_j_popets-2022-0017_ref_030","doi-asserted-by":"crossref","unstructured":"[30] J. Zhang, G. Cormode, C. M. Procopiuc, D. Srivastava, and X. Xiao. Privbayes: Private data release via bayesian networks. ACM Transactions on Database Systems (TODS), 42(4):1\u201341, 2017.","DOI":"10.1145\/3134428"}],"container-title":["Proceedings on Privacy Enhancing Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.sciendo.com\/pdf\/10.2478\/popets-2022-0017","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,20]],"date-time":"2022-07-20T16:31:54Z","timestamp":1658334714000},"score":1,"resource":{"primary":{"URL":"https:\/\/petsymposium.org\/popets\/2022\/popets-2022-0017.php"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,20]]},"references-count":30,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,11,20]]},"published-print":{"date-parts":[[2022,1,1]]}},"alternative-id":["10.2478\/popets-2022-0017"],"URL":"https:\/\/doi.org\/10.2478\/popets-2022-0017","relation":{},"ISSN":["2299-0984"],"issn-type":[{"value":"2299-0984","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,20]]}}}