{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T19:02:32Z","timestamp":1774983752772,"version":"3.50.1"},"reference-count":77,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T00:00:00Z","timestamp":1739145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,2,10]]},"abstract":"<jats:p>The Shapley value has been extensively used in many fields as the unique metric to fairly evaluate player contributions in cooperative settings. Since the exact computation of Shapley values is \\#P-hard in the task-agnostic setting, many studies have been developed to utilize the Monte Carlo method for Shapley value estimation. The existing methods estimate the Shapley values directly. In this paper, we explore a novel idea-inferring the Shapley values by estimating the differences between them. Technically, we estimate a differential matrix consisting of pairwise Shapley value differences to reduce the variance of the estimated Shapley values. We develop a least-squares optimization solution to derive the Shapley values from the differential matrix, minimizing the estimator variances. Additionally, we devise a Monte Carlo method for efficient estimation of the differential matrix and introduce two stratified Monte Carlo methods for further variance reduction. Our experimental results on real and synthetic data sets demonstrate the effectiveness and efficiency of the differential-matrix-based sampling approaches.<\/jats:p>","DOI":"10.1145\/3709725","type":"journal-article","created":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T15:45:06Z","timestamp":1739288706000},"page":"1-28","source":"Crossref","is-referenced-by-count":1,"title":["Shapley Value Estimation based on Differential Matrix"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-2653-2811","authenticated-orcid":false,"given":"Junyuan","family":"Pang","sequence":"first","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2200-8711","authenticated-orcid":false,"given":"Jian","family":"Pei","sequence":"additional","affiliation":[{"name":"Duke University, Durham, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8317-6217","authenticated-orcid":false,"given":"Haocheng","family":"Xia","sequence":"additional","affiliation":[{"name":"University of Illinois Urbana-Champaign, Urbana, IL, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-6633-6883","authenticated-orcid":false,"given":"Xiang","family":"Li","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2921-2827","authenticated-orcid":false,"given":"Jinfei","family":"Liu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2025,2,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3328526.3329589"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.48786\/EDBT.2024.60"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00300"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3615952.3615954"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3301286"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651606"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COR.2017.01.019"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COR.2008.04.004"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3300078"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3320231"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/3659437.3659459"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence","author":"Cohen Shay B.","year":"2005","unstructured":"Shay B. Cohen, Eytan Ruppin, and Gideon Dror. 2005. Feature Selection Based on the Shapley Value. In IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30 - August 5, 2005, Leslie Pack Kaelbling and Alessandro Saffiotti (Eds.). Professional Book Center, 665--670. http:\/\/ijcai.org\/Proceedings\/05\/Papers\/0763.pdf"},{"key":"e_1_2_1_13_1","volume-title":"Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. In The 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021, April 13--15","volume":"3465","author":"Covert Ian","year":"2021","unstructured":"Ian Covert and Su-In Lee. 2021. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. In The 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021, April 13--15, 2021, Virtual Event (Proceedings of Machine Learning Research, Vol. 130), Arindam Banerjee and Kenji Fukumizu (Eds.). PMLR, 3457--3465. http:\/\/proceedings.mlr.press\/v130\/covert21a.html"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3520172"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1287\/MOOR.19.2.257"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517912"},{"key":"e_1_2_1_17_1","volume-title":"Fair and Efficient Contribution Valuation for Vertical Federated Learning. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=sLQb8q0sUi","author":"Fan Zhenan","year":"2024","unstructured":"Zhenan Fan, Huang Fang, Xinglu Wang, Zirui Zhou, Jian Pei, Michael Friedlander, and Yong Zhang. 2024. Fair and Efficient Contribution Valuation for Vertical Federated Learning. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=sLQb8q0sUi"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE53745.2022.00228"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00180"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517855"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.24432\/C56C76"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3-031--47843--725"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13--18","volume":"3544","author":"Ghorbani Amirata","year":"2020","unstructured":"Amirata Ghorbani, Michael P. Kim, and James Zou. 2020. A Distributional Framework For Data Valuation. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13--18 July 2020, Virtual Event (Proceedings of Machine Learning Research, Vol. 119). PMLR, 3535--3544. http:\/\/proceedings.mlr.press\/v119\/ghorbani20a.html"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9--15","volume":"2251","author":"Ghorbani Amirata","year":"2019","unstructured":"Amirata Ghorbani and James Y. Zou. 2019. Data Shapley: Equitable Valuation of Data for Machine Learning. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9--15 June 2019, Long Beach, California, USA (Proceedings of Machine Learning Research, Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, 2242--2251. http:\/\/proceedings.mlr.press\/v97\/ghorbani19c.html"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611540.3611606"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00096"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1963.10500830"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-0865-5_26"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342637"},{"key":"e_1_2_1_30_1","volume-title":"Towards Efficient Data Valuation Based on the Shapley Value. In The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16--18","volume":"1176","author":"Jia Ruoxi","year":"2019","unstructured":"Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve G''urel, Bo Li, Ce Zhang, Dawn Song, and Costas J. Spanos. 2019b. Towards Efficient Data Valuation Based on the Shapley Value. In The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16--18 April 2019, Naha, Okinawa, Japan (Proceedings of Machine Learning Research, Vol. 89), Kamalika Chaudhuri and Masashi Sugiyama (Eds.). PMLR, 1167--1176. http:\/\/proceedings.mlr.press\/v89\/jia19a.html"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651142"},{"key":"e_1_2_1_32_1","volume-title":"The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=qxGXjWxabq","author":"Karla\u0161 Bojan","year":"2024","unstructured":"Bojan Karla\u0161, David Dao, Matteo Interlandi, Sebastian Schelter, Wentao Wu, and Ce Zhang. 2024. Data Debugging with Shapley Importance over Machine Learning Pipelines. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=qxGXjWxabq"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651593"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/AAAI.V38I12.29225"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611540.3611639"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00168"},{"key":"e_1_2_1_37_1","volume-title":"The Twelfth International Conference on Learning Representations, ICLR 2024","author":"Li Weida","year":"2024","unstructured":"Weida Li and Yaoliang Yu. 2024. Faster Approximation of Probabilistic and Distributional Values via Least Squares. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7--11, 2024. OpenReview.net. https:\/\/openreview.net\/forum?id=lvSMIsztka"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3623278.3624771"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.20.3.370"},{"key":"e_1_2_1_40_1","volume-title":"Aircraft landing fees: a game theory approach. The Bell Journal of Economics","author":"Littlechild Stephen C","year":"1977","unstructured":"Stephen C Littlechild and GF Thompson. 1977. Aircraft landing fees: a game theory approach. The Bell Journal of Economics (1977), 186--204."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476311.3476335"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/3447689.3447700"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPICS.ICDT.2020.20"},{"key":"e_1_2_1_44_1","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Scott","year":"2017","unstructured":"Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 4765--4774. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/8a20a8621978632d76c43dfd28b67767-Abstract.html"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3626246.3654680"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/3551793.3551829"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639311"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW53142.2021.00023"},{"key":"e_1_2_1_49_1","volume-title":"Addressing the computational issues of the Shapley value with applications in the smart grid. Ph.,D. Dissertation","author":"Maleki Sasan","unstructured":"Sasan Maleki. 2015. Addressing the computational issues of the Shapley value with applications in the smart grid. Ph.,D. Dissertation. University of Southampton, UK. http:\/\/eprints.soton.ac.uk\/383963\/"},{"key":"e_1_2_1_50_1","volume-title":"Values of large games, IV: Evaluating the electoral college by Montecarlo techniques","author":"Mann Irwin","unstructured":"Irwin Mann and Lloyd S Shapley. 1960. Values of large games, IV: Evaluating the electoral college by Montecarlo techniques. Rand Corporation."},{"key":"e_1_2_1_51_1","article-title":"Sampling Permutations for Shapley Value Estimation","volume":"23","author":"Mitchell Rory","year":"2022","unstructured":"Rory Mitchell, Joshua Cooper, Eibe Frank, and Geoffrey Holmes. 2022. Sampling Permutations for Shapley Value Estimation. J. Mach. Learn. Res., Vol. 23 (2022), 43:1--43:46. http:\/\/jmlr.org\/papers\/v23\/21-0439.html","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00185"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.24432\/C55C7W"},{"key":"e_1_2_1_54_1","unstructured":"Art B. Owen. 2013a. Monte Carlo theory methods and examples. https:\/\/artowen.su.domains\/mc\/."},{"key":"e_1_2_1_55_1","volume-title":"Game theory","author":"Owen Guillermo","unstructured":"Guillermo Owen. 2013b. Game theory. Emerald Group Publishing."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE53745.2022.00300"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3375395.3387664"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511528446"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3555041.3589682"},{"key":"e_1_2_1_60_1","doi-asserted-by":"crossref","unstructured":"Lloyd S. Shapley. 1953. A value for n-person games. (1953).","DOI":"10.1515\/9781400881970-018"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/BIGDATA47090.2019.9006327"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599500"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COR.2020.105094"},{"key":"e_1_2_1_64_1","volume-title":"Interpret Federated Learning with Shapley Values. CoRR","author":"Wang Guan","year":"2019","unstructured":"Guan Wang. 2019. Interpret Federated Learning with Shapley Values. CoRR, Vol. abs\/1905.04519 (2019). showeprint[arXiv]1905.04519 http:\/\/arxiv.org\/abs\/1905.04519"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE53745.2022.00073"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.14778\/3648160.3648172"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2311.05346"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","unstructured":"WIlliam Wolberg. 1992. Breast Cancer Wisconsin (Original). UCI Machine Learning Repository. DOI: https:\/\/doi.org\/10.24432\/C5HP4Z.","DOI":"10.24432\/C5HP4Z"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COR.2023.106305"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.14778\/3654621.3654638"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611531"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588728"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00055"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.14778\/3538598.3538604"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.14778\/3587136.3587141"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE53745.2022.00208"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.14778\/3489496.3489514"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709725","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3709725","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:22:57Z","timestamp":1774981377000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709725"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,10]]},"references-count":77,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,2,10]]}},"alternative-id":["10.1145\/3709725"],"URL":"https:\/\/doi.org\/10.1145\/3709725","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,10]]}}}