{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,24]],"date-time":"2025-08-24T01:35:40Z","timestamp":1755999340580,"version":"3.44.0"},"reference-count":17,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2024,5,10]],"date-time":"2024-05-10T00:00:00Z","timestamp":1715299200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000780","name":"European Union","doi-asserted-by":"crossref","award":["P2022KHTX7"],"award-info":[{"award-number":["P2022KHTX7"]}],"id":[{"id":"10.13039\/501100000780","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000266","name":"EPSRC","doi-asserted-by":"crossref","award":["EP\/S003800\/1"],"award-info":[{"award-number":["EP\/S003800\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,5,10]]},"abstract":"<jats:p>Datalog is a powerful rule-based language that allows us to express complex recursive queries and has found numerous applications over the years. Explaining why a result to a Datalog query is obtained is an essential task towards explainable and transparent data-intensive applications that rely on Datalog. A standard way of explaining a query result is the so-called why-provenance, which provides information about the witnesses to a query result in the form of subsets of the input database that as a whole can be used to derive that result. To our surprise, despite the fact that the notion of why-provenance for Datalog queries has been around for decades and intensively studied, its computational complexity remains unexplored. Our goal is to fill this gap in the why-provenance literature. Towards this end, we pinpoint the data complexity of why-provenance for Datalog queries and key subclasses thereof. The takeaway of our work is that why-provenance for recursive queries, even if the recursion is limited to be linear, is an intractable problem, whereas for non-recursive queries is highly tractable.<\/jats:p>","DOI":"10.1145\/3651146","type":"journal-article","created":{"date-parts":[[2024,5,14]],"date-time":"2024-05-14T08:32:13Z","timestamp":1715675533000},"page":"1-16","source":"Crossref","is-referenced-by-count":2,"title":["The Complexity of Why-Provenance for Datalog Queries"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0921-4040","authenticated-orcid":false,"given":"Marco","family":"Calautti","sequence":"first","affiliation":[{"name":"University of Milano, Milan, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3485-9887","authenticated-orcid":false,"given":"Ester","family":"Livshits","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4779-3469","authenticated-orcid":false,"given":"Andreas","family":"Pieris","sequence":"additional","affiliation":[{"name":"University of Edinburgh &amp; University of Cyprus, Edinburgh, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8338-5660","authenticated-orcid":false,"given":"Markus","family":"Schneider","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2024,5,14]]},"reference":[{"volume-title":"Foundations of Databases","author":"Abiteboul Serge","key":"e_1_2_1_1_1","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley."},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Camille Bourgaux Pierre Bourhis Liat Peterfreund and Micha\u00eb l Thomazo. 2022. Revisiting Semiring Provenance for Datalog. In KR.","DOI":"10.24963\/kr.2022\/10"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Peter Buneman Sanjeev Khanna and Wang Chiew Tan. 2001. Why and Where: A Characterization of Data Provenance. In ICDT. 316--330.","DOI":"10.1007\/3-540-44503-X_20"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Marco Calautti Ester Livshits Andreas Pieris and Markus Schneider. 2024. Computing the Why-Provenance for Datalog Queries via SAT Solvers. In AAAI.","DOI":"10.1609\/aaai.v38i9.28914"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-0000(74)80046-2"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Carlos Viegas Dam\u00e1 sio Anastasia Analyti and Grigoris Antoniou. 2013. Justifications for Logic Programming. In LPNMR. 530--542.","DOI":"10.1007\/978-3-642-40564-8_53"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/502807.502810"},{"key":"e_1_2_1_8_1","unstructured":"Daniel Deutch Tova Milo Sudeepa Roy and Val Tannen. 2014. Circuits for Datalog Provenance. In ICDT. 201--212."},{"key":"e_1_2_1_9_1","volume-title":"Markus Kr\u00f6 tzsch, and Stephan Mennicke","author":"Elhalawati Ali","year":"2022","unstructured":"Ali Elhalawati, Markus Kr\u00f6 tzsch, and Stephan Mennicke. 2022. An Existential Rule Framework for Computing Why-Provenance On-Demand for Datalog. In RuleMLRR."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Javier Esparza Michael Luttenberger and Maximilian Schlund. 2014. FPsolve: A Generic Solver for Fixpoint Equations over Semirings. In CIAA. 1--15.","DOI":"10.1007\/978-3-319-08846-4_1"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-011-9327-6"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Todd J. Green Gregory Karvounarakis and Val Tannen. 2007. Provenance semirings. In PODS. 31--40.","DOI":"10.1145\/1265530.1265535"},{"key":"e_1_2_1_13_1","volume-title":"Green and Val Tannen","author":"Todd","year":"2017","unstructured":"Todd J. Green and Val Tannen. 2017. The Semiring Framework for Database Provenance. In PODS. ACM, 93--99."},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Mahmoud Abo Khamis Hung Q. Ngo Reinhard Pichler Dan Suciu and Yisu Remy Wang. 2022. Convergence of Datalog over (Pre-) Semirings. In PODS. ACM 105--117.","DOI":"10.1145\/3517804.3524140"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-018-0518-5"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Moshe Y. Vardi. 1995. On the Complexity of Bounded-Variable Queries. In PODS. 266--276.","DOI":"10.1145\/212433.212474"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379446"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3651146","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3651146","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T21:39:24Z","timestamp":1755898764000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3651146"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,10]]},"references-count":17,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,5,10]]}},"alternative-id":["10.1145\/3651146"],"URL":"https:\/\/doi.org\/10.1145\/3651146","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2024,5,10]]}}}