{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T17:43:12Z","timestamp":1757612592068,"version":"3.44.0"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,6]]},"abstract":"<jats:p>\n            Vertical federated learning (VFL) trains models when multiple databases (a.k.a participants) hold different features of the same set of samples. By quantifying each participant's contribution to model training,\n            <jats:italic toggle=\"yes\">data valuation<\/jats:italic>\n            can prevent hitch-riders and reward the instrumental parties. However, vertical federated data valuation (VFDV) is challenging because it needs to be accurate and efficient while protecting participant data privacy. In this paper, we propose a method meeting all three requirements by using\n            <jats:italic toggle=\"yes\">projection<\/jats:italic>\n            and\n            <jats:italic toggle=\"yes\">sampling<\/jats:italic>\n            for\n            <jats:italic toggle=\"yes\">mutual information<\/jats:italic>\n            estimation (thus dubbed PS-MI). In particular, we first show that the utility of a participant set (a.k.a a\n            <jats:italic toggle=\"yes\">coalition<\/jats:italic>\n            ) can be expressed as the mutual information (MI) between their features and the target labels. MI is favorable because it does not depend on the model to train (i.e.,\n            <jats:italic toggle=\"yes\">model-agnostic<\/jats:italic>\n            ) and can be estimated via\n            <jats:italic toggle=\"yes\">k<\/jats:italic>\n            -nearest neighbor (KNN). 
To run KNN, instead of using costly homomorphic encryption to protect data privacy, we apply simple\n            <jats:italic toggle=\"yes\">random projection<\/jats:italic>\n            to participant features before distance computation. We prove that random projection ensures differential privacy and preserves unbiased distance estimates. Since the contribution of a participant involves many coalitions, we adopt\n            <jats:italic toggle=\"yes\">stratified sampling<\/jats:italic>\n            to reduce the number of coalitions while controlling estimation variance. To further improve efficiency, we incorporate optimizations including using locality sensitive hashing (LSH) to prune kNN candidates, batching kNN candidate checking for multiple coalitions, and adaptive early termination for utility evaluation. We compare PS-MI with 5 state-of-the-art VFDV methods. The results show that PS-MI yields higher accuracy and shorter running time than the baselines, and the maximum speedup can be 592\u00d7.\n          <\/jats:p>","DOI":"10.14778\/3748191.3748215","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T13:50:16Z","timestamp":1756993816000},"page":"3559-3572","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["PS-MI: Accurate, Efficient, and Private Data Valuation in Vertical Federated Learning"],"prefix":"10.14778","volume":"18","author":[{"given":"Xiaokai","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Computer Science, Wuhan University"}]},{"given":"Xiao","family":"Yan","sequence":"additional","affiliation":[{"name":"Institute for Math &amp; AI, Wuhan University"}]},{"given":"Fangcheng","family":"Fu","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence, Shanghai Jiao Tong University"}]},{"given":"Ziwen","family":"Fu","sequence":"additional","affiliation":[{"name":"School of Cyber Science and 
Engineering, Wuhan University"}]},{"given":"Tieyun","family":"Qian","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University"}]},{"given":"Yuanyuan","family":"Zhu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University"}]},{"given":"Qinbo","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University"}]},{"given":"Bin","family":"Cui","sequence":"additional","affiliation":[{"name":"School of Computer Science, Peking University"}]},{"given":"Jiawei","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wuhan University"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"2024","article-title":"PaddleFL: Federated Deep Learning in PaddlePaddle. https:\/\/github.com\/PaddlePaddle\/PaddleFL","year":"2023","unstructured":"Baidu. 2023. PaddleFL: Federated Deep Learning in PaddlePaddle. https:\/\/github.com\/PaddlePaddle\/PaddleFL. Accessed: 2024-12.","journal-title":"Accessed"},{"key":"e_1_2_1_2_1","volume-title":"TenSEAL: a library for encrypted tensor operations using homomorphic encryption. arXiv preprint arXiv:2104.03152","author":"Benaissa Ayoub","year":"2021","unstructured":"Ayoub Benaissa, Bilal Retiat, Bogdan Cebere, and Alaa Eddine Belfedhal. 2021. TenSEAL: a library for encrypted tensor operations using homomorphic encryption. arXiv preprint arXiv:2104.03152 (2021)."},{"key":"e_1_2_1_3_1","volume-title":"International Conference on Machine Learning (ICML). PMLR, 3757\u20133781","author":"Castiglia Timothy","year":"2023","unstructured":"Timothy Castiglia, Yi Zhou, Shiqiang Wang, Swanand Kadhe, Nathalie Baracaldo, and Stacy Patterson. 2023. Less-vfl: communication-efficient feature selection for vertical federated learning. In International Conference on Machine Learning (ICML). 
PMLR, 3757\u20133781."},{"key":"e_1_2_1_4_1","volume-title":"International Conference on Machine Learning (ICML) (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.)","volume":"139","author":"Catav Amnon","year":"2021","unstructured":"Amnon Catav, Boyang Fu, Yazeed Zoabi, Ahuva Libi Weiss Meilik, Noam Shomron, Jason Ernst, Sriram Sankararaman, and Ran Gilad-Bachrach. 2021. Marginal contribution feature importance - an axiomatic approach for explaining data. In International Conference on Machine Learning (ICML) (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.), Vol. 139. PMLR, 1324\u20131335."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS). 1243\u20131255","author":"Chen Hao","year":"2017","unstructured":"Hao Chen, Kim Laine, and Peter Rindal. 2017. Fast private set intersection from homomorphic encryption. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS). 1243\u20131255."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/3659437.3659459"},{"key":"e_1_2_1_7_1","volume-title":"Understanding global feature contributions with additive importance measures. Advances in Neural Information Processing Systems (NIPS)","author":"Covert Ian C.","year":"2020","unstructured":"Ian C. Covert, Scott Lundberg, and Su-In Lee. 2020. Understanding global feature contributions with additive importance measures. Advances in Neural Information Processing Systems (NIPS) (2020)."},{"key":"e_1_2_1_8_1","unstructured":"D. Dua and C. Graff. 2017. UCI machine learning repository. UCI Machine Learning Repository. 
Available online: https:\/\/archive.ics.uci.edu\/ml."},{"volume-title":"Advances in Cryptology-Eurocrypt 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques","author":"Dwork Cynthia","key":"e_1_2_1_9_1","unstructured":"Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology-Eurocrypt 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 486\u2013503."},{"key":"e_1_2_1_10_1","volume-title":"Theory of Cryptography: Third Theory of Cryptography Conference. 265\u2013284","author":"Dwork Cynthia","year":"2006","unstructured":"Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography: Third Theory of Cryptography Conference. 265\u2013284."},{"key":"e_1_2_1_11_1","unstructured":"Xiaokai Zhou et al. 2025. Supplementary. https:\/\/drive.google.com\/file\/d\/1oWaU8utMGjc2qBwwccej26XlWosHNn7G\/view?usp=sharing."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). 216\u2013226","author":"Fagin Ronald","year":"1996","unstructured":"Ronald Fagin. 1996. Combining fuzzy information from multiple systems. In Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). 216\u2013226."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the VLDB Endowment (VLDB) 15","author":"Fu Fangcheng","year":"2022","unstructured":"Fangcheng Fu, Xupeng Miao, Jiawei Jiang, Huanran Xue, and Bin Cui. 2022. Towards communication-efficient vertical federated learning training via cache-enabled local updates. 
Proceedings of the VLDB Endowment (VLDB) 15, 10 (2022)."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD). Association for Computing Machinery, 563\u2013576","author":"Fu Fangcheng","year":"2021","unstructured":"Fangcheng Fu, Yingxia Shao, Lele Yu, Jiawei Jiang, Huanran Xue, Yangyu Tao, and Bin Cui. 2021. VF2Boost: very fast vertical federated gradient boosting for cross-enterprise learning. In Proceedings of the International Conference on Management of Data (SIGMOD). Association for Computing Machinery, 563\u2013576."},{"key":"e_1_2_1_15_1","volume-title":"ProjPert: projection-based perturbation for label protection in split learning based vertical federated learning","author":"Fu Fangcheng","year":"2024","unstructured":"Fangcheng Fu, Xuanyu Wang, Jiawei Jiang, Huanran Xue, and Bin Cui. 2024. ProjPert: projection-based perturbation for label protection in split learning based vertical federated learning. IEEE Transactions on Knowledge and Data Engineering (TKDE) (2024)."},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD). 1316\u20131330","author":"Fu Fangcheng","year":"2022","unstructured":"Fangcheng Fu, Huanran Xue, Yong Cheng, Yangyu Tao, and Bin Cui. 2022. Blindfl: vertical federated machine learning without peeking into your data. In Proceedings of the International Conference on Management of Data (SIGMOD). 1316\u20131330."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD)","volume":"1","author":"Fu Rui","year":"2023","unstructured":"Rui Fu, Yuncheng Wu, Quanqing Xu, and Meihui Zhang. 2023. FEAST: a communication-efficient federated feature selection framework for relational data. In Proceedings of the International Conference on Management of Data (SIGMOD), Vol. 1. 
1\u201328."},{"key":"e_1_2_1_18_1","volume-title":"Data valuation for vertical federated learning: an information-theoretic approach. arXiv preprint arXiv:2112.08364","author":"Han Xiao","year":"2021","unstructured":"Xiao Han, Leye Wang, and Junjie Wu. 2021. Data valuation for vertical federated learning: an information-theoretic approach. arXiv preprint arXiv:2112.08364 (2021)."},{"key":"e_1_2_1_19_1","volume-title":"Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677","author":"Hardy Stephen","year":"2017","unstructured":"Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, and Brian Thorne. 2017. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677 (2017)."},{"key":"e_1_2_1_20_1","first-page":"1","article-title":"Efficient modality selection in multimodal learning","volume":"25","author":"He Yifei","year":"2024","unstructured":"Yifei He, Runxiang Cheng, Gargi Balasubramaniam, Yao-Hung Hubert Tsai, and Han Zhao. 2024. Efficient modality selection in multimodal learning. Journal of Machine Learning Research (JMLR) 25, 47 (2024), 1\u201339.","journal-title":"Journal of Machine Learning Research (JMLR)"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2024.3402355"},{"key":"e_1_2_1_22_1","volume-title":"2023 19th International Conference on Mobility, Sensing and Networking (MSN). 455\u2013462","author":"Huang Jiahui","year":"2023","unstructured":"Jiahui Huang, Lan Zhang, Anran Li, Haoran Cheng, Jiexin Xu, and Hongmei Song. 2023. Adaptive and efficient participant selection in vertical federated learning. In 2023 19th International Conference on Mobility, Sensing and Networking (MSN). 
455\u2013462."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342637"},{"key":"e_1_2_1_24_1","first-page":"2088","article-title":"Vf-ps: how to select important participants in vertical federated learning, efficiently and securely","volume":"35","author":"Jiang Jiawei","year":"2022","unstructured":"Jiawei Jiang, Lukas Burkhalter, Fangcheng Fu, Bolin Ding, Bo Du, Anwar Hithnawi, Bo Li, and Ce Zhang. 2022. Vf-ps: how to select important participants in vertical federated learning, efficiently and securely? Advances in Neural Information Processing Systems (NIPS) 35 (2022), 2088\u20132101.","journal-title":"Advances in Neural Information Processing Systems (NIPS)"},{"key":"e_1_2_1_25_1","first-page":"994","article-title":"Cafe: catastrophic data leakage in vertical federated learning","volume":"34","author":"Jin Xiao","year":"2021","unstructured":"Xiao Jin, Pin-Yu Chen, Chia-Yi Hsu, Chia-Mu Yu, and Tianyi Chen. 2021. Cafe: catastrophic data leakage in vertical federated learning. Advances in Neural Information Processing Systems (NIPS) 34 (2021), 994\u20131006.","journal-title":"Advances in Neural Information Processing Systems (NIPS)"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1090\/conm\/026\/737400"},{"key":"e_1_2_1_27_1","unstructured":"Kaggle. 2024. Kaggle: your home for data science. Kaggle. Available online: https:\/\/www.kaggle.com\/datasets."},{"key":"e_1_2_1_28_1","volume-title":"Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Article 1477","author":"Kolpaczki Patrick","year":"2025","unstructured":"Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik, and Eyke H\u00fcllermeier. 2025. 
Approximating the shapley value without marginal contributions. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Article 1477 (2025), 10 pages."},{"key":"e_1_2_1_30_1","volume-title":"Estimating mutual information. Physical Review E\u2014Statistical, Nonlinear, and Soft Matter Physics 69, 6","author":"Kraskov Alexander","year":"2004","unstructured":"Alexander Kraskov, Harald St\u00f6gbauer, and Peter Grassberger. 2004. Estimating mutual information. Physical Review E\u2014Statistical, Nonlinear, and Soft Matter Physics 69, 6 (2004), 066138."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM) (CIKM '24)","author":"Li Juan","year":"2024","unstructured":"Juan Li, Rui Deng, Tianzi Zang, Mingqi Kong, and Kun Zhu. 2024. Efficient and Secure Contribution Estimation in Vertical Federated Learning. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM) (CIKM '24). Association for Computing Machinery, 1205\u20131214."},{"key":"e_1_2_1_32_1","volume-title":"International Conference on Machine Learning (ICML). PMLR","author":"Li Songze","year":"2023","unstructured":"Songze Li, Duanyi Yao, and Jin Liu. 2023. Fedvs: straggler-resilient and privacy-preserving vertical federated learning for split models. In International Conference on Machine Learning (ICML). PMLR, 20296\u201320311."},{"volume-title":"Advances in Neural Information Processing Systems (NIPS) (New Orleans, LA, USA)","author":"Li Weida","key":"e_1_2_1_33_1","unstructured":"Weida Li and Yaoliang Yu. 2024. Robust data valuation with weighted banzhaf values. In Advances in Neural Information Processing Systems (NIPS) (New Orleans, LA, USA). 
Curran Associates Inc., Article 2637, 35 pages."},{"key":"e_1_2_1_34_1","first-page":"2","article-title":"OpBoost: a vertical federated tree boosting framework based on order-preserving desensitization","volume":"16","author":"Li Xiaochen","year":"2022","unstructured":"Xiaochen Li, Yuke Hu, Weiran Liu, Hanwen Feng, Li Peng, Yuan Hong, Kui Ren, and Zhan Qin. 2022. OpBoost: a vertical federated tree boosting framework based on order-preserving desensitization. Proceedings of the VLDB Endowment (VLDB) 16, 2 (oct 2022), 202\u2013215.","journal-title":"Proceedings of the VLDB Endowment (VLDB)"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the VLDB Endowment (VLDB) 15","author":"Li Zitao","year":"2021","unstructured":"Zitao Li, Bolin Ding, Ce Zhang, Ninghui Li, and Jingren Zhou. 2021. Federated matrix factorization with privacy guarantee. Proceedings of the VLDB Endowment (VLDB) 15, 4 (2021)."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583146"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2024.3352628"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00023"},{"key":"e_1_2_1_39_1","unstructured":"Sasan Maleki. 2015. Addressing the computational issues of the Shapley value with applications in the smart grid. Ph.D. Dissertation. University of Southampton."},{"key":"e_1_2_1_40_1","volume-title":"Bounding the estimation error of sampling-based Shapley value approximation. arXiv preprint arXiv:1306.4265","author":"Maleki Sasan","year":"2013","unstructured":"Sasan Maleki, Long Tran-Thanh, Greg Hines, Talal Rahwan, and Alex Rogers. 2013. Bounding the estimation error of sampling-based Shapley value approximation. arXiv preprint arXiv:1306.4265 (2013)."},{"key":"e_1_2_1_41_1","volume-title":"SecureML: A System for Scalable Privacy-Preserving Machine Learning. In 2017 IEEE Symposium on Security and Privacy (SP). 
19\u201338","author":"Mohassel Payman","year":"2017","unstructured":"Payman Mohassel and Yupeng Zhang. 2017. SecureML: A System for Scalable Privacy-Preserving Machine Learning. In 2017 IEEE Symposium on Security and Privacy (SP). 19\u201338."},{"volume-title":"Breakthroughs in statistics: Methodology and distribution","author":"Neyman Jerzy","key":"e_1_2_1_42_1","unstructured":"Jerzy Neyman. 1992. On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. In Breakthroughs in statistics: Methodology and distribution. Springer, 123\u2013150."},{"key":"e_1_2_1_43_1","first-page":"2024","article-title":"PySyft: A library for answering questions using data you cannot see. https:\/\/github.com\/OpenMined\/PySyft","year":"2023","unstructured":"OpenMined. 2023. PySyft: A library for answering questions using data you cannot see. https:\/\/github.com\/OpenMined\/PySyft. Accessed: 2024-12.","journal-title":"Accessed"},{"key":"e_1_2_1_44_1","volume-title":"Estimation of R\u00e9nyi entropy and mutual information based on generalized nearest-neighbor graphs. Advances in Neural Information Processing Systems (NIPS) 23","author":"P\u00e1l D\u00e1vid","year":"2010","unstructured":"D\u00e1vid P\u00e1l, Barnab\u00e1s P\u00f3czos, and Csaba Szepesv\u00e1ri. 2010. Estimation of R\u00e9nyi entropy and mutual information based on generalized nearest-neighbor graphs. Advances in Neural Information Processing Systems (NIPS) 23 (2010)."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3154794"},{"key":"e_1_2_1_46_1","first-page":"49","article-title":"Sur l'approximation des fonctions convexes d'ordre sup\u00e9rieur","volume":"10","author":"Popoviciu Tiberiu","year":"1935","unstructured":"Tiberiu Popoviciu. 1935. Sur l'approximation des fonctions convexes d'ordre sup\u00e9rieur. 
Mathematica (Cluj) 10 (1935), 49\u201354.","journal-title":"Mathematica (Cluj)"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS)","author":"Raghuraman Srinivasan","year":"2022","unstructured":"Srinivasan Raghuraman and Peter Rindal. 2022. Blazing fast PSI from improved OKVS and subfield VOLE. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS) (Los Angeles, CA, USA) (CCS '22). Association for Computing Machinery, New York, NY, USA, 2505\u20132517."},{"volume-title":"The Shapley value: essays in honor of Lloyd S. Shapley","author":"Roth Alvin E","key":"e_1_2_1_48_1","unstructured":"Alvin E Roth. 1988. The Shapley value: essays in honor of Lloyd S. Shapley. Cambridge University Press."},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the VLDB Endowment (VLDB)","author":"Shastri Supreeth","year":"2019","unstructured":"Supreeth Shastri, Vinay Banakar, Melissa Wasserman, Arun Kumar, and Vijay Chidambaram. 2019. Understanding and benchmarking the impact of gdpr on database systems. Proceedings of the VLDB Endowment (VLDB) (2019), 1064\u20131077."},{"key":"e_1_2_1_50_1","volume-title":"IEEE Conference on Computer Communications (INFOCOM). 1\u20136.","author":"Suimon Takumi","year":"2024","unstructured":"Takumi Suimon, Yuki Koizumi, Junji Takemasa, and Toru Hasegawa. 2024. A data reconstruction attack against vertical federated learning based on knowledge transfer. In IEEE Conference on Computer Communications (INFOCOM). 1\u20136."},{"key":"e_1_2_1_51_1","volume-title":"Split learning for health: distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564","author":"Vepakomma Praneeth","year":"2018","unstructured":"Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, and Ramesh Raskar. 2018. Split learning for health: distributed deep learning without sharing raw patient data. 
arXiv preprint arXiv:1812.00564 (2018)."},{"volume-title":"The EU general data protection regulation (gdpr): a practical guide","author":"Voigt Paul","key":"e_1_2_1_52_1","unstructured":"Paul Voigt and Axel von dem Bussche. 2017. The EU general data protection regulation (gdpr): a practical guide. Springer Publishing Company, Incorporated."},{"key":"e_1_2_1_53_1","volume-title":"IEEE International Conference on Big Data (Big Data). 2597\u20132604","author":"Wang Guan","year":"2019","unstructured":"Guan Wang, Charlie Xiaoqian Dang, and Ziye Zhou. 2019. Measure contribution of participants in federated learning. In IEEE International Conference on Big Data (Big Data). 2597\u20132604."},{"key":"e_1_2_1_54_1","volume-title":"2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 911\u2013923","author":"Wang Junhao","year":"2022","unstructured":"Junhao Wang, Lan Zhang, Anran Li, Xuanke You, and Haoran Cheng. 2022. Efficient participant contribution evaluation for horizontal and vertical federated learning. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 911\u2013923."},{"volume-title":"A principled approach to data valuation for federated learning","author":"Wang Tianhao","key":"e_1_2_1_55_1","unstructured":"Tianhao Wang, Johannes Rausch, Ce Zhang, Ruoxi Jia, and Dawn Song. 2020. A principled approach to data valuation for federated learning. Springer International Publishing, 153\u2013167."},{"key":"e_1_2_1_56_1","first-page":"2024","article-title":"FATE: Federated AI Technology Enabler. https:\/\/github.com\/FederatedAI\/FATE","year":"2019","unstructured":"Webank. 2019. FATE: Federated AI Technology Enabler. https:\/\/github.com\/FederatedAI\/FATE. 
Accessed: 2024-12.","journal-title":"Accessed"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407811"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.14778\/3603581.3603588"},{"key":"e_1_2_1_59_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD)","volume":"1","author":"Xiang Zihang","year":"2023","unstructured":"Zihang Xiang, Tianhao Wang, Wanyu Lin, and Di Wang. 2023. Practical differentially private and byzantine-resilient federated learning. In Proceedings of the International Conference on Management of Data (SIGMOD), Vol. 1. 1\u201326."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2017.2737966"},{"key":"e_1_2_1_61_1","volume-title":"Finding the piste: towards understanding privacy leaks in vertical federated learning systems","author":"Xu Xiangrui","year":"2024","unstructured":"Xiangrui Xu, Wei Wang, Zheng Chen, Bin Wang, Chao Li, Li Duan, Zhen Han, and Yufei Han. 2024. Finding the piste: towards understanding privacy leaks in vertical federated learning systems. IEEE Transactions on Dependable and Secure Computing (TDSC) (2024)."},{"key":"e_1_2_1_62_1","volume-title":"International Conference on Machine Learning (ICML), Hal Daum\u00e9 III and Aarti Singh (Eds.)","volume":"119","author":"Yoon Jinsung","year":"2020","unstructured":"Jinsung Yoon, Sercan Arik, and Tomas Pfister. 2020. Data valuation using reinforcement learning. In International Conference on Machine Learning (ICML), Hal Daum\u00e9 III and Aarti Singh (Eds.), Vol. 119. 10842\u201310851."},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the International Conference on Management of Data (SIGMOD)","volume":"1","author":"Zhang Jiayao","year":"2023","unstructured":"Jiayao Zhang, Qiheng Sun, Jinfei Liu, Li Xiong, Jian Pei, and Kui Ren. 2023. Efficient sampling approaches to Shapley Value approximation. 
In Proceedings of the International Conference on Management of Data (SIGMOD), Vol. 1."},{"key":"e_1_2_1_64_1","volume-title":"2023 IEEE 39th International Conference on Data Engineering (ICDE). 639\u2013652","author":"Zhang Jiayao","year":"2023","unstructured":"Jiayao Zhang, Haocheng Xia, Qiheng Sun, Jinfei Liu, Li Xiong, Jian Pei, and Kui Ren. 2023. Dynamic Shapley value computation. In 2023 IEEE 39th International Conference on Data Engineering (ICDE). 639\u2013652."},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). Association for Computing Machinery, 4431\u20134442","author":"Zhao Fangyuan","year":"2024","unstructured":"Fangyuan Zhao, Zitao Li, Xuebin Ren, Bolin Ding, Shusen Yang, and Yaliang Li. 2024. VertiMRF: differentially private vertical federated data synthesis. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). Association for Computing Machinery, 4431\u20134442."},{"key":"e_1_2_1_66_1","volume-title":"Proceedings of the VLDB Endowment (VLDB) 16","author":"Zheng Shuyuan","year":"2023","unstructured":"Shuyuan Zheng, Yang Cao, and Masatoshi Yoshikawa. 2023. Secure shapley value for cross-silo federated learning. Proceedings of the VLDB Endowment (VLDB) 16, 7 (2023)."},{"key":"e_1_2_1_67_1","volume-title":"International Conference on Database Systems for Advanced Applications (DASFAA). 409\u2013424","author":"Zhou Xiaokai","year":"2024","unstructured":"Xiaokai Zhou, Xiao Yan, Xinyan Li, Hao Huang, Quanqing Xu, Qinbo Zhang, Yen Jerome, Zhaohui Cai, and Jiawei Jiang. 2024. VFDV-IM: an efficient and securely vertical federated data valuation. In International Conference on Database Systems for Advanced Applications (DASFAA). 
409\u2013424."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3748191.3748215","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T13:51:38Z","timestamp":1756993898000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3748191.3748215"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6]]},"references-count":67,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,6]]}},"alternative-id":["10.14778\/3748191.3748215"],"URL":"https:\/\/doi.org\/10.14778\/3748191.3748215","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2025,6]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}