{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:13:21Z","timestamp":1779174801944,"version":"3.51.4"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T00:00:00Z","timestamp":1686614400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,6,13]]},"abstract":"<jats:p>We investigate how to efficiently compute the difference result of two (or multiple) conjunctive queries, which is the last operator in relational algebra to be unraveled. The standard approach in practical database systems is to materialize the results for every input query as a separate set, and then compute the difference of two (or multiple) sets. This approach is bottlenecked by the complexity of evaluating every input query individually, which could be very expensive, particularly when there are only a few results in the difference. In this paper, we introduce a new approach by exploiting the structural property of input queries and rewriting the original query by pushing the difference operator down as much as possible. We show that for a large class of difference queries, this approach can lead to a linear-time algorithm, in terms of the input size and (final) output size, i.e., the number of query results that survive from the difference operator. We complete this result by showing the hardness of computing the remaining difference queries in linear time. Although a linear-time algorithm is hard to achieve in general, we also provide some heuristics that can provably improve the standard approach. At last, we compare our approach with standard SQL engines over graph and benchmark datasets. The experiment results demonstrate order-of-magnitude speedups achieved by our approach over the vanilla SQL engine.<\/jats:p>","DOI":"10.1145\/3589298","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T20:26:45Z","timestamp":1687292805000},"page":"1-26","source":"Crossref","is-referenced-by-count":10,"title":["Computing the Difference of Conjunctive Queries Efficiently"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7890-665X","authenticated-orcid":false,"given":"Xiao","family":"Hu","sequence":"first","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0959-5536","authenticated-orcid":false,"given":"Qichen","family":"Wang","sequence":"additional","affiliation":[{"name":"Hong Kong Baptist University, Kowloon, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"DuckDB. https:\/\/duckdb.org\/."},{"key":"e_1_2_2_2_1","unstructured":"MySQL. https:\/\/www.mysql.com\/."},{"key":"e_1_2_2_3_1","unstructured":"Oracle. https:\/\/www.oracle.com\/."},{"key":"e_1_2_2_4_1","unstructured":"PostgreSQL. https:\/\/www.postgre.org\/."},{"key":"e_1_2_2_5_1","unstructured":"SNAP. https:\/\/snap.stanford.edu\/snap\/."},{"key":"e_1_2_2_6_1","unstructured":"SparkSQL. https:\/\/spark.apache.org\/sql\/."},{"key":"e_1_2_2_7_1","unstructured":"SQLite. https:\/\/www.sqlite.org\/."},{"key":"e_1_2_2_8_1","unstructured":"TPC-DS. https:\/\/www.tpc.org\/tpcds\/."},{"key":"e_1_2_2_9_1","unstructured":"TPC-H. https:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_2_2_10_1","unstructured":"https:\/\/arxiv.org\/abs\/2302.13140."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2014.53"},{"key":"e_1_2_2_12_1","volume-title":"Foundations of databases","author":"Abiteboul Serge","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of databases. Vol. 8. Addison-Wesley Reading."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3517804.3524156"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2902251.2902280"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1514894.1514909"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/2392389.2392412"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2402.322389"},{"key":"e_1_2_2_18_1","volume-title":"Virginia Vassilevska Williams, and Uri Zwick","author":"Bj\u00f6rklund Andreas","year":"2014","unstructured":"Andreas Bj\u00f6rklund, Rasmus Pagh, Virginia Vassilevska Williams, and Uri Zwick. 2014. Listing triangles. In International Colloquium on Automata, Languages, and Programming. Springer, 223--234."},{"key":"e_1_2_2_19_1","unstructured":"Johann Brault-Baron. 2012. A Negative Conjunctive Query is Easy if and only if it is Beta-Acyclic. In Computer Science Logic (CSL'12)-26th International Workshop\/21st Annual Conference of the EACSL. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik."},{"key":"e_1_2_2_20_1","unstructured":"Johann Brault-Baron. 2013. De la pertinence de l'\u00e9num\u00e9ration: complexit\u00e9 en logiques propositionnelle et du premier ordre. Ph. D. Dissertation. Universit\u00e9 de Caen."},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3294052.3319700"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(99)00220-0"},{"key":"e_1_2_2_23_1","volume-title":"Proceedings of the 21st International Conference on Database Theory (ICDT'18)","volume":"98","author":"Christoph Berkholz","year":"2018","unstructured":"Berkholz Christoph, Keppeler Jens, and Schweikardt Nicole. 2018. Answering UCQs under Updates and in the presence of integrity constraints. In Proceedings of the 21st International Conference on Database Theory (ICDT'18), Vol. 98. 1--8."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3380607"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2402.322390"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.2001.1809"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1568318.1568320"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517893"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3547305.3547326"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2902251.2902293"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/275487.275511"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3452021.3458308"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180143"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/263661.263664"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2021.09.004"},{"key":"e_1_2_2_36_1","volume-title":"Proceedings of the fourteenth annual ACM symposium on Theory of computing. 137--146","author":"Vardi Moshe Y","year":"1982","unstructured":"Moshe Y Vardi. 1982. The complexity of relational query languages. In Proceedings of the fourteenth annual ACM symposium on Theory of computing. 137--146."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517830"},{"key":"e_1_2_2_38_1","first-page":"82","article-title":"Algorithms for acyclic database schemes","volume":"81","author":"Yannakakis Mihalis","year":"1981","unstructured":"Mihalis Yannakakis. 1981. Algorithms for acyclic database schemes. In VLDB, Vol. 81. 82--94.","journal-title":"VLDB"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589298","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589298","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:13Z","timestamp":1750178773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589298"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,13]]},"references-count":38,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6,13]]}},"alternative-id":["10.1145\/3589298"],"URL":"https:\/\/doi.org\/10.1145\/3589298","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,13]]}}}