{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T09:53:41Z","timestamp":1773482021437,"version":"3.50.1"},"reference-count":74,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA1","license":[{"start":{"date-parts":[[2024,4,29]],"date-time":"2024-04-29T00:00:00Z","timestamp":1714348800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["No.62032010, No.62232001, No.62202220, and No.6217220"],"award-info":[{"award-number":["No.62032010, No.62232001, No.62202220, and No.6217220"]}]},{"name":"Fundamental Research Funds for the Central Universities","award":["No.2023300180"],"award-info":[{"award-number":["No.2023300180"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2024,4,29]]},"abstract":"<jats:p>Datalog is a popular and widely-used declarative logic programming language. Datalog engines apply many cross-rule optimizations; bugs in them can cause incorrect results. To detect such optimization bugs, we propose an automated testing approach called Incremental Rule Evaluation (IRE), which synergistically tackles the test oracle and test case generation problem. The core idea behind the test oracle is to compare the results of an optimized program and a program without cross-rule optimization; any difference indicates a bug in the Datalog engine. Our core insight is that, for an optimized, incrementally-generated Datalog program, we can evaluate all rules individually by constructing a reference program to disable the optimizations that are performed among multiple rules. Incrementally generating test cases not only allows us to apply the test oracle for every new rule generated\u2014we also can ensure that every newly added rule generates a non-empty result with a given probability and eschew recomputing already-known facts. We implemented IRE as a tool named Deopt, and evaluated Deopt on four mature Datalog engines, namely Souffl\u00e9, CozoDB, \u03bcZ, and DDlog, and discovered a total of 30 bugs. Of these, 13 were logic bugs, while the remaining were crash and error bugs. Deopt can detect all bugs found by queryFuzz, a state-of-the-art approach. Out of the bugs identified by Deopt, queryFuzz might be unable to detect 5. Our incremental test case generation approach is efficient; for example, for test cases containing 60 rules, our incremental approach can produce 1.17\u00d7 (for DDlog) to 31.02\u00d7 (for Souffl\u00e9) as many valid test cases with non-empty results as the naive random method. We believe that the simplicity and the generality of the approach will lead to its wide adoption in practice.<\/jats:p>","DOI":"10.1145\/3649815","type":"journal-article","created":{"date-parts":[[2024,4,29]],"date-time":"2024-04-29T17:53:50Z","timestamp":1714413230000},"page":"110-136","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Finding Cross-Rule Optimization Bugs in Datalog Engines"],"prefix":"10.1145","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2848-6108","authenticated-orcid":false,"given":"Chi","family":"Zhang","sequence":"first","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4794-1652","authenticated-orcid":false,"given":"Linzhang","family":"Wang","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8303-2099","authenticated-orcid":false,"given":"Manuel","family":"Rigger","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2024,4,29]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/50202.50218"},{"key":"e_1_2_2_2_1","volume-title":"Foundations of databases. 8","author":"Abiteboul Serge","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of databases. 8, Addison-Wesley Reading."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2012.04.008"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3088515.3088522"},{"key":"e_1_2_2_5_1","volume-title":"Building a\u00a0Join Optimizer for\u00a0Souffl\u00e9","author":"Arch Samuel","unstructured":"Samuel Arch, Xiaowen Hu, David Zhao, Pavle Suboti\u0107, and Bernhard Scholz. 2022. Building a\u00a0Join Optimizer for\u00a0Souffl\u00e9. In Logic-Based Program Synthesis and Transformation, Alicia Villanueva (Ed.). Springer International Publishing, Cham. 83\u2013102. isbn:978-3-031-16767-6"},{"key":"e_1_2_2_6_1","volume-title":"Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 1371\u20131382","author":"Aref Molham","year":"2015","unstructured":"Molham Aref, Balder ten Cate, Todd J Green, Benny Kimelfeld, Dan Olteanu, Emir Pasalic, Todd L Veldhuizen, and Geoffrey Washburn. 2015. Design and implementation of the LogicBlox system. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 1371\u20131382."},{"key":"e_1_2_2_7_1","volume-title":"International Conference on Computer Aided Verification. 231\u2013241","author":"Backes John","year":"2019","unstructured":"John Backes, Sam Bayless, Byron Cook, Catherine Dodge, Andrew Gacek, Alan J Hu, Temesghen Kahsai, Bill Kocik, Evgenii Kotelnikov, and Jure Kukovec. 2019. Reachability analysis for AWS-based networks. In International Conference on Computer Aided Verification. 231\u2013241."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/0743-1066(87)90004-5"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/6012.15399"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/16894.16859"},{"key":"e_1_2_2_11_1","volume-title":"Proceedings of the ACM on Programming Languages, 4, OOPSLA","author":"Bembenek Aaron","year":"2020","unstructured":"Aaron Bembenek, Michael Greenberg, and Stephen Chong. 2020. Formulog: Datalog for SMT-based static analysis. Proceedings of the ACM on Programming Languages, 4, OOPSLA (2020), 1\u201331."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3191315.3191322"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1640089.1640108"},{"key":"e_1_2_2_14_1","volume-title":"Vandal: A scalable security analysis framework for smart contracts. arXiv preprint arXiv:1809.03981.","author":"Brent Lexi","year":"2018","unstructured":"Lexi Brent, Anton Jurisevic, Michael Kong, Eric Liu, Francois Gauthier, Vincent Gramoli, Ralph Holz, and Bernhard Scholz. 2018. Vandal: A scalable security analysis framework for smart contracts. arXiv preprint arXiv:1809.03981."},{"key":"e_1_2_2_15_1","volume-title":"Strong and weak constraints in disjunctive datalog","author":"Buccafurri Francesco","unstructured":"Francesco Buccafurri, Nicola Leone, and Pasquale Rullo. 1997. Strong and weak constraints in disjunctive datalog. In Logic Programming And Nonmonotonic Reasoning, J\u00fcrgen Dix, Ulrich Furbach, and Anil Nerode (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 2\u201317. isbn:978-3-540-69249-2"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.877512"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88594-8_8"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2790449.2790522"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.43410"},{"key":"e_1_2_2_20_1","unstructured":"Tsong Y Chen Shing C Cheung and Shiu Ming Yiu. 2020. Metamorphic testing: a new approach for generating next test cases. arXiv preprint arXiv:2002.12543."},{"key":"e_1_2_2_21_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3143561","article-title":"Metamorphic testing: A review of challenges and opportunities","volume":"51","author":"Chen Tsong Yueh","year":"2018","unstructured":"Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, TH Tse, and Zhi Quan Zhou. 2018. Metamorphic testing: A review of challenges and opportunities. ACM Computing Surveys (CSUR), 51, 1 (2018), 1\u201327.","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"e_1_2_2_22_1","volume-title":"Mendelzon","author":"Consens Mariano P.","year":"1990","unstructured":"Mariano P. Consens and Alberto O. Mendelzon. 1990. Low complexity aggregation in graphlog and Datalog. In ICDT \u201990, Serge Abiteboul and Paris C. Kanellakis (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 379\u2013394. isbn:978-3-540-46682-6"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2391229.2391230"},{"key":"e_1_2_2_24_1","unstructured":"Edsger Wybe Dijkstra. 1970. Notes on structured programming."},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00120"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.761663"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000017"},{"key":"e_1_2_2_28_1","volume-title":"Boon Thau Loo, and Wenchao Zhou.","author":"Green Todd J","year":"2013","unstructured":"Todd J Green, Shan Shan Huang, Boon Thau Loo, and Wenchao Zhou. 2013. Datalog and recursive query processing. Foundations and Trends\u00ae in Databases, 5, 2 (2013), 105\u2013195."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/11785477_2"},{"key":"e_1_2_2_30_1","volume-title":"International Conference on Computer Aided Verification. 457\u2013462","author":"Hoder Kry\u0161tof","year":"2011","unstructured":"Kry\u0161tof Hoder, Nikolaj Bj\u00f8rner, and Leonardo de Moura. 2011. \u03bc Z\u2013an efficient engine for fixed points with constraints. In International Conference on Computer Aided Verification. 457\u2013462."},{"key":"e_1_2_2_31_1","volume-title":"The Choice Construct in the Souffl\u00e9 Language. In Asian Symposium on Programming Languages and Systems. 163\u2013181","author":"Hu Xiaowen","year":"2021","unstructured":"Xiaowen Hu, Joshua Karp, David Zhao, Abdul Zreika, Xi Wu, and Bernhard Scholz. 2021. The Choice Construct in the Souffl\u00e9 Language. In Asian Symposium on Programming Languages and Systems. 163\u2013181."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453483.3454070"},{"key":"e_1_2_2_33_1","unstructured":"Ziyang Hu. 2023. CozoDB: Hippocampus for AI with Embedded Datalog. https:\/\/www.cozodb.org\/ Accessed: 2023-05-16"},{"key":"e_1_2_2_34_1","volume-title":"Proceedings of the 32nd USENIX Security Symposium (Security\u201923)","author":"Jiang Zu-Ming","year":"2023","unstructured":"Zu-Ming Jiang, Jia-Ju Bai, and Zhendong Su. 2023. DynSQL: Stateful Fuzzing for Database Management Systems with Complex and Valid SQL Query Generation. In Proceedings of the 32nd USENIX Security Symposium (Security\u201923)."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-41540-6_23"},{"key":"e_1_2_2_36_1","volume-title":"Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores. 31\u201340","author":"Jordan Herbert","year":"2019","unstructured":"Herbert Jordan, Pavle Suboti\u0107, David Zhao, and Bernhard Scholz. 2019. Brie: A specialized trie for concurrent datalog. In Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores. 31\u201340."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3293883.3295719"},{"key":"e_1_2_2_38_1","doi-asserted-by":"crossref","first-page":"e5643","DOI":"10.1002\/cpe.5643","article-title":"Specializing parallel data structures for Datalog","volume":"34","author":"Jordan Herbert","year":"2022","unstructured":"Herbert Jordan, Pavle Suboti\u0107, David Zhao, and Bernhard Scholz. 2022. Specializing parallel data structures for Datalog. Concurrency and Computation: Practice and Experience, 34, 2 (2022), e5643.","journal-title":"Concurrency and Computation: Practice and Experience"},{"key":"e_1_2_2_39_1","doi-asserted-by":"crossref","DOI":"10.1561\/9781638280439","volume-title":"Modern Datalog Engines. Foundations and Trends\u00ae in Databases, 12, 1","author":"Ketsman Bas","year":"2022","unstructured":"Bas Ketsman and Paraschos Koutris. 2022. Modern Datalog Engines. Foundations and Trends\u00ae in Databases, 12, 1 (2022), 1\u201368."},{"key":"e_1_2_2_40_1","volume-title":"Workshop on Information Systems and Artificial Intelligence. 118\u2013138","author":"Werner","unstructured":"Werner Kie\u00df ling and Ulrich G\u00fcntzer. 1994. Database reasoning\u2014a deductive framework for solving large and complex problems by means of subsumption. In Workshop on Information Systems and Artificial Intelligence. 118\u2013138."},{"key":"e_1_2_2_41_1","doi-asserted-by":"crossref","unstructured":"Sven K\u00f6hler Bertram Lud\u00e4scher and Yannis Smaragdakis. 2012. Declarative datalog debugging for mere mortals. In International Datalog 2.0 Workshop. 111\u2013122.","DOI":"10.1007\/978-3-642-32925-8_12"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065167.1065169"},{"key":"e_1_2_2_43_1","volume-title":"31st USENIX Security Symposium (USENIX Security 22)","author":"Liang Yu","year":"2022","unstructured":"Yu Liang, Song Liu, and Hong Hu. 2022. Detecting Logical Bugs of $DBMS$ with Coverage-based Guidance. In 31st USENIX Security Symposium (USENIX Security 22). 4309\u20134326."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1592761.1592785"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908096"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3468573"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598052"},{"key":"e_1_2_2_48_1","first-page":"100","article-title":"Differential testing for software","volume":"10","author":"McKeeman William M","year":"1998","unstructured":"William M McKeeman. 1998. Differential testing for software. Digital Technical Journal, 10, 1 (1998), 100\u2013107.","journal-title":"Digital Technical Journal"},{"key":"e_1_2_2_49_1","volume-title":"International conference on inductive logic programming. 1\u201322","author":"Mooney Raymond J","year":"1996","unstructured":"Raymond J Mooney. 1996. Inductive logic programming for natural language processing. In International conference on inductive logic programming. 1\u201322."},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-78800-3_24"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2019.00015"},{"key":"e_1_2_2_52_1","volume-title":"A survey of deductive database systems. The journal of logic programming, 23, 2","author":"Ramakrishnan Raghu","year":"1995","unstructured":"Raghu Ramakrishnan and Jeffrey D Ullman. 1995. A survey of deductive database systems. The journal of logic programming, 23, 2 (1995), 125\u2013149."},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2345156.2254104"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409710"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3428279"},{"key":"e_1_2_2_56_1","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Rigger Manuel","year":"2020","unstructured":"Manuel Rigger and Zhendong Su. 2020. Testing database engines via pivoted query synthesis. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 667\u2013682."},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/298514.298558"},{"key":"e_1_2_2_58_1","first-page":"4","article-title":"Differential Datalog","volume":"2","author":"Ryzhyk Leonid","year":"2019","unstructured":"Leonid Ryzhyk and Mihai Budiu. 2019. Differential Datalog. Datalog, 2 (2019), 4\u20135.","journal-title":"Datalog"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/191839.191927"},{"key":"e_1_2_2_60_1","volume-title":"Commercial-Grade Static Analyzers in Datalog. SAS","author":"Scholz Bernhard","year":"2022","unstructured":"Bernhard Scholz. 2022. Commercial-Grade Static Analyzers in Datalog. SAS 2022."},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2892208.2892226"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2016.2532875"},{"key":"e_1_2_2_63_1","unstructured":"Andreas Seltenreich. 2023. SQLsmith. https:\/\/github.com\/anse1\/sqlsmith Accessed: 2023-01-25"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376673"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915229"},{"key":"e_1_2_2_66_1","first-page":"618","article-title":"Massive stochastic testing of SQL","volume":"98","author":"Slutz Donald R","year":"1998","unstructured":"Donald R Slutz. 1998. Massive stochastic testing of SQL. In VLDB. 98, 618\u2013622.","journal-title":"VLDB."},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.14778\/3282495.3282500"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243780"},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/73721.73736"},{"key":"e_1_2_2_70_1","volume-title":"Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. 1\u201310","author":"Gelder Allen Van","year":"1989","unstructured":"Allen Van Gelder. 1989. The alternating fixpoint of logic programs with negation. In Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. 1\u201310."},{"key":"e_1_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996859"},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","unstructured":"Chi Zhang Linzhang Wang and Manuel Rigger. 2024. Artifact for \"Finding Cross-Rule Optimization Bugs in Datalog Engines\". https:\/\/doi.org\/10.5281\/zenodo.10609061 10.5281\/zenodo.10609061","DOI":"10.5281\/zenodo.10609061"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379446"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372297.3417260"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3649815","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3649815","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:54:06Z","timestamp":1750287246000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3649815"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,29]]},"references-count":74,"journal-issue":{"issue":"OOPSLA1","published-print":{"date-parts":[[2024,4,29]]}},"alternative-id":["10.1145\/3649815"],"URL":"https:\/\/doi.org\/10.1145\/3649815","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,29]]},"assertion":[{"value":"2024-04-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}