{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T08:54:43Z","timestamp":1775638483352,"version":"3.50.1"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>In this paper, we introduce a novel approach to computing the contribution of input tuples to the result of the query, quantified by the Banzhaf and Shapley values. In contrast to prior algorithmic work that focuses on Select-Project-Join-Union queries, ours is the first practical approach for queries with aggregates. It relies on two novel optimizations that are essential for its practicality and significantly improve the runtime performance already for queries without aggregates. The first optimization exploits the observation that many input tuples have the same contribution to the query result, so it is enough to compute the contribution of one of them. The second optimization uses the gradient of the query lineage to compute the contributions of all tuples with the same complexity as for one of them. Experiments with a million instances over 3 databases show that our approach achieves up to 3 orders of magnitude runtime improvements over the state-of-the-art for queries without aggregates, and that it is practical for aggregate queries.<\/jats:p>","DOI":"10.14778\/3749646.3749670","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"3996-4008","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Advancing Fact Attribution for Query Answering: Aggregate Queries and Novel Algorithms"],"prefix":"10.14778","volume":"18","author":[{"given":"Omer","family":"Abramovich","sequence":"first","affiliation":[{"name":"Tel Aviv University, Tel Aviv, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Deutch","sequence":"additional","affiliation":[{"name":"Tel Aviv University, Tel Aviv, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nave","family":"Frost","sequence":"additional","affiliation":[{"name":"eBay Research, Netanya, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ahmet","family":"Kara","sequence":"additional","affiliation":[{"name":"OTH Regensburg, Regensburg, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dan","family":"Olteanu","sequence":"additional","affiliation":[{"name":"University of Zurich, Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Foundations of Databases","author":"Abiteboul Serge","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Vol. 8. Addison-Wesley Reading. http:\/\/webdam.inria.fr\/Alice\/"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654926"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Omer Abramovich Daniel Deutch Nave Frost Ahmet Kara and Dan Olteanu. 2025. Advancing Fact Attribution for Query Answering: Aggregate Queries and Novel Algorithms. arXiv:2506.16923 [cs.DB] https:\/\/arxiv.org\/abs\/2506.16923","DOI":"10.14778\/3749646.3749670"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","unstructured":"Yael Amsterdamer Daniel Deutch and Val Tannen. 2011. Provenance for Aggregate Queries. In PODS. 153\u2013164. 10.1145\/1989284.1989302","DOI":"10.1145\/1989284.1989302"},{"key":"e_1_2_1_5_1","first-page":"51","article-title":"GProM - A Swiss Army Knife for Your Provenance Needs","volume":"41","author":"Arab Bahareh Sadat","year":"2018","unstructured":"Bahareh Sadat Arab, Su Feng, Boris Glavic, Seokki Lee, Xing Niu, and Qitian Zeng. 2018. GProM - A Swiss Army Knife for Your Provenance Needs. IEEE Data Eng. Bull. 41, 1 (2018), 51\u201362. http:\/\/sites.computer.org\/debull\/A18mar\/p51.pdf","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_6_1","first-page":"317","article-title":"Weighted Voting Doesn't Work: A Mathematical Analysis","volume":"19","author":"John F","year":"1965","unstructured":"John F Banzhaf III. 1965. Weighted Voting Doesn't Work: A Mathematical Analysis. Rutgers Law Review 19, 2 (1965), 317\u2013343. https:\/\/heinonline.org\/HOL\/LandingPage?handle=hein.journals\/rutlr19&div=19&id=&page=","journal-title":"Rutgers Law Review"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(83)90110-X"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3615952.3615954"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","unstructured":"Meghyn Bienvenu Diego Figueira and Pierre Lafourcade. 2024. Shapley Value Computation in Ontology-Mediated Query Answering. In KR. Article 15 11 pages. 10.24963\/kr.2024\/15","DOI":"10.24963\/kr.2024\/15"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000006"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/765568.765570"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","unstructured":"Daniel Deutch Nave Frost Benny Kimelfeld and Mika\u00ebl Monet. 2022. Computing the Shapley Value of Facts in Query Answering. In SIGMOD. 1570\u20131583. 10.1145\/3514221.3517912","DOI":"10.1145\/3514221.3517912"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/2140436.2140445"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3119110"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000074"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0507655102"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651142"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651593"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPICS.ICDT.2023.11"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.46298\/lmcs-17(3:22)2021"},{"key":"e_1_2_1_21_1","unstructured":"Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In NeurIPS. 4765\u20134774. http:\/\/papers.nips.cc\/paper\/7062-a-unified-approach-to-interpreting-model-predictions.pdf"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639311"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/1880172.1880176"},{"key":"e_1_2_1_24_1","unstructured":"Alexandra Meliou Wolfgang Gatterbauer and Dan Suciu. 2011. Bringing Provenance to Its Full Potential Using Causal Reasoning. In TaPP. https:\/\/www.usenix.org\/conference\/tapp11\/bringing-provenance-its-full-potential-using-causal-reasoning"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733004.2733070"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.2307\/2981392"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","unstructured":"Alon Reshef Benny Kimelfeld and Ester Livshits. 2020. The Impact of Negation on the Complexity of the Shapley Value in Conjunctive Queries. In PODS. 285\u2013297. 10.1145\/3375395.3387664","DOI":"10.1145\/3375395.3387664"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","unstructured":"Benedek Rozemberczki Lauren Watson P\u00e9ter Bayer Hao-Tsung Yang Oliver Kiss Sebastian Nilsson and Rik Sarkar. 2022. The Shapley Value in Machine Learning. In IJCAI. 5572\u20135579. 10.24963\/IJCAI.2022\/778","DOI":"10.24963\/IJCAI.2022\/778"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3236253"},{"key":"e_1_2_1_30_1","first-page":"39","volume-title":"A Value for n-Person Games. Contributions to the Theory of Games 2, 28","author":"Shapley Lloyd S","year":"1953","unstructured":"Lloyd S Shapley. 1953. A Value for n-Person Games. Contributions to the Theory of Games 2, 28 (1953), 307\u2013317. http:\/\/www.library.fa.ru\/files\/Roth2.pdf#page=39"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.NEUNET.2018.06.006"},{"key":"e_1_2_1_32_1","first-page":"6388","article-title":"Data Banzhaf: A Robust Data Valuation Framework for Machine Learning","volume":"206","author":"Wang Jiachen T.","year":"2023","unstructured":"Jiachen T. Wang and Ruoxi Jia. 2023. Data Banzhaf: A Robust Data Valuation Framework for Machine Learning. In AISTATS, Vol. 206. 6388\u20136421. https:\/\/proceedings.mlr.press\/v206\/wang23e.html","journal-title":"AISTATS"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749670","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T03:06:01Z","timestamp":1757041561000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3749646.3749670"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":32,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749670"],"URL":"https:\/\/doi.org\/10.14778\/3749646.3749670","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}