{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T20:36:33Z","timestamp":1780346193016,"version":"3.54.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2020,4,17]],"date-time":"2020-04-17T00:00:00Z","timestamp":1587081600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Australian Government through the ARC Discovery Project","award":["DP180104030"],"award-info":[{"award-number":["DP180104030"]}]},{"name":"AWS"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Program. Lang. Syst."],"published-print":{"date-parts":[[2020,6,30]]},"abstract":"<jats:p>\n            Logic programming languages such as Datalog have become popular as Domain Specific Languages (DSLs) for solving large-scale, real-world problems, in particular, static program analysis and network analysis. The logic specifications that model analysis problems process millions of tuples of data and contain hundreds of highly recursive rules. As a result, they are notoriously difficult to debug. While the database community has proposed several data provenance techniques that address the\n            <jats:italic>Declarative Debugging Challenge<\/jats:italic>\n            for Databases, in the cases of analysis problems, these state-of-the-art techniques do not scale.\n          <\/jats:p>\n          <jats:p>\n            In this article, we introduce a novel bottom-up Datalog evaluation strategy for debugging: Our provenance evaluation strategy relies on a new provenance lattice that includes proof annotations and a new fixed-point semantics for semi-na\u00efve evaluation. A debugging query mechanism allows arbitrary provenance queries, constructing partial proof trees of tuples with minimal height. We integrate our technique into Souffl\u00e9, a Datalog engine that synthesizes C++ code, and achieve high performance by using specialized parallel data structures. Experiments are conducted with D\n            <jats:sc>OOP<\/jats:sc>\n            \/DaCapo, producing proof annotations for tens of millions of output tuples. We show that our method has a runtime overhead of 1.31\u00d7 on average while being more flexible than existing state-of-the-art techniques.\n          <\/jats:p>","DOI":"10.1145\/3379446","type":"journal-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T07:56:36Z","timestamp":1588578996000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["Debugging Large-scale Datalog"],"prefix":"10.1145","volume":"42","author":[{"given":"David","family":"Zhao","sequence":"first","affiliation":[{"name":"University of Sydney"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Pavle","family":"Suboti\u0107","sequence":"additional","affiliation":[{"name":"Amazon"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bernhard","family":"Scholz","sequence":"additional","affiliation":[{"name":"University of Sydney"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2020,4,17]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Retrieved","year":"2017","unstructured":"2017. souffle-lang\/souffle: Souffl\u00e9 is a variant of Datalog for tool designers crafting analyses in Horn clauses. Souffl\u00e9 synthesizes a native parallel C++ program from a logic specification. Retrieved October 19, 2017 from http:\/\/souffle-lang.github.io\/."},{"key":"e_1_2_1_2_1","volume-title":"Foundations of Databases","author":"Abiteboul Serge","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-46663-6_7"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2769056"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742796"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-57530-8_7"},{"key":"e_1_2_1_7_1","first-page":"5","article-title":"An introduction to ULDBs and the trio system","volume":"29","author":"Benjelloun Omar","year":"2006","unstructured":"Omar Benjelloun, Anish Das Sarma, Chris Hayworth, and Jennifer Widom. 2006. An introduction to ULDBs and the trio system. IEEE Data Eng. Bull. 29 (2006), 5\u201316.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1639949.1640108"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/645504.656274"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88594-8_8"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2790449.2790522"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3106740"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000006"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/2994509.2994530"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824039"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-018-0496-7"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","unstructured":"Daniel Deutch Tova Milo Sudeepa Roy and Val Tannen. 2014. Circuits for datalog provenance. In ICDT. 201--212. DOI:https:\/\/doi.org\/10.5441\/002\/icdt.2014.22","DOI":"10.5441\/002"},{"key":"e_1_2_1_18_1","unstructured":"Lodek Drabent and Simin Nadjm-Tehrani. 1989. Algorithmic debugging with assertions. In Meta-programming in Logic Programming. Citeseer."},{"key":"e_1_2_1_19_1","volume-title":"Datalog disassembly. arXiv preprint arXiv:1906.03969","author":"Flores-Montoya Antonio","year":"2019","unstructured":"Antonio Flores-Montoya and Eric Schulte. 2019. Datalog disassembly. arXiv preprint arXiv:1906.03969 (2019)."},{"key":"e_1_2_1_20_1","first-page":"4","article-title":"Partial evaluation of computation process\u2014An approach to a compiler-compiler","volume":"12","author":"Futamura Yoshihiko","year":"1999","unstructured":"Yoshihiko Futamura. 1999. Partial evaluation of computation process\u2014An approach to a compiler-compiler. High. Order Symbol. Comput. 12, 4 (Dec. 1999), 381--391.","journal-title":"High. Order Symbol. Comput."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2009.15"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41660-6_16"},{"key":"e_1_2_1_23_1","volume-title":"Gigahorse: Thorough, declarative decompilation of smart contracts (unpublished).","author":"Grech Neville","year":"2019","unstructured":"Neville Grech, Lexi Brent, Bernhard Scholz, and Yannis Smaragdakis. 2019. Gigahorse: Thorough, declarative decompilation of smart contracts (unpublished)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276486"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Sergio Greco and Cristian Molinaro. 2015. Datalog and Logic Databases. Morgan 8 Claypool.","DOI":"10.1007\/978-3-031-01854-1"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the IJCSLP. 294--309","author":"Greco Sergio","year":"1998","unstructured":"Sergio Greco and Carlo Zaniolo. 1998. Greedy algorithms in datalog with choice and negation. In Proceedings of the IJCSLP. 294--309."},{"key":"e_1_2_1_27_1","volume-title":"\u00b5Z\u2014An efficient engine for fixed points with constraints","author":"Hoder Kry\u0161tof","unstructured":"Kry\u0161tof Hoder, Nikolaj Bj\u00f8rner, and Leonardo de Moura. 2011. \u00b5Z\u2014An efficient engine for fixed points with constraints. In Computer Aided Verification, Ganesh Gopalakrishnan and Shaz Qadeer (Eds.). Springer, Berlin, 457--462."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989456"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-41540-6_23"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3293883.3295719"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-57802-1_7"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32925-8_12"},{"key":"e_1_2_1_33_1","volume-title":"Efficiently computing provenance graphs for queries with negation. CoRR abs\/1701.05699","author":"Lee Seokki","year":"2017","unstructured":"Seokki Lee, Sven K\u00f6hler, Bertram Lud\u00e4scher, and Boris Glavic. 2017. Efficiently computing provenance graphs for queries with negation. CoRR abs\/1701.05699 (2017). http:\/\/arxiv.org\/abs\/1701.05699."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3236233"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526790"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908096"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2786805.2786851"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 14th Conference on USENIX Security Symposium","volume":"14","author":"Ou Xinming","unstructured":"Xinming Ou, Sudhakar Govindavajhala, and Andrew W. Appel. 2005. MulVAL: A logic-based network security analyzer. In Proceedings of the 14th Conference on USENIX Security Symposium, Volume 14 (SSYM\u201905). USENIX Association, Berkeley, CA, 8--8."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192417"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the International Logic Programming Symposium. MIT Press, 321--336","author":"Ramakrishnan Raghu","unstructured":"Raghu Ramakrishnan and S. Sudarshan. 1991. Top-down vs. bottom-up revisited. In Proceedings of the International Logic Programming Symposium. MIT Press, 321--336."},{"key":"e_1_2_1_41_1","volume-title":"Algorithmic Program DeBugging","author":"Shapiro Ehud Y.","unstructured":"Ehud Y. Shapiro. 1983. Algorithmic Program DeBugging. MIT Press, Cambridge, MA."},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Josep Silva. 2007. A comparative study of algorithmic debugging strategies. In Logic-Based Program Synthesis and Transformation Germ\u00e1n Puebla (Ed.). Springer Berlin 143--159.","DOI":"10.1007\/978-3-540-71410-1_11"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250748"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1094811.1094817"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance (TaPP\u201915)","author":"Stamatogiannakis Manolis","year":"2015","unstructured":"Manolis Stamatogiannakis, Paul Groth, and Herbert Bos. 2015. Decoupling provenance capture and analysis from execution. In Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance (TaPP\u201915). USENIX Association."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/3282495.3282500"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/73721.73736"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/11575467_8"},{"key":"e_1_2_1_49_1","volume-title":"Proceedings of the 2nd Biennial Conference on Innovative Data Systems Research (CIDR\u201905)","author":"Widom Jennifer","year":"2005","unstructured":"Jennifer Widom. 2005. Trio: A system for integrated management of data, accuracy, and lineage. In Proceedings of the 2nd Biennial Conference on Innovative Data Systems Research (CIDR\u201905). 262--276."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133881"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594327"},{"key":"e_1_2_1_52_1","volume-title":"Large-Scale Provenance for Souffle","author":"Zhao David","unstructured":"David Zhao. 2017. Large-Scale Provenance for Souffle. University of Sydney."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807234"}],"container-title":["ACM Transactions on Programming Languages and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3379446","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3379446","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:22Z","timestamp":1750197742000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3379446"}},"subtitle":["A Scalable Provenance Evaluation Strategy"],"short-title":[],"issued":{"date-parts":[[2020,4,17]]},"references-count":53,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,6,30]]}},"alternative-id":["10.1145\/3379446"],"URL":"https:\/\/doi.org\/10.1145\/3379446","relation":{},"ISSN":["0164-0925","1558-4593"],"issn-type":[{"value":"0164-0925","type":"print"},{"value":"1558-4593","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,17]]},"assertion":[{"value":"2019-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-04-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}