{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:04:21Z","timestamp":1750309461384,"version":"3.41.0"},"reference-count":21,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T00:00:00Z","timestamp":1739750400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"FWO","award":["G062721N, G055219N"],"award-info":[{"award-number":["G062721N, G055219N"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Logic"],"published-print":{"date-parts":[[2025,4,30]]},"abstract":"<jats:p>Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query. This property is referred to as parallel-correctness. Another key problem is to detect whether the data reshuffle step can be avoided when evaluating subsequent queries. The latter problem is referred to as transfer of parallel-correctness. This article extends the study of parallel-correctness and transfer of parallel-correctness of conjunctive queries to incorporate bag semantics. We provide semantical characterizations for both problems, obtain complexity bounds, and discuss the relationship with their set semantics counterparts. Finally, we revisit both problems under a modified distribution model that takes advantage of a linear order on compute nodes and obtain tight complexity bounds.<\/jats:p>","DOI":"10.1145\/3712291","type":"journal-article","created":{"date-parts":[[2025,1,15]],"date-time":"2025-01-15T15:15:16Z","timestamp":1736954116000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Parallel-Correctness and Transferability for Conjunctive Queries under Bag Semantics"],"prefix":"10.1145","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4032-0709","authenticated-orcid":false,"given":"Bas","family":"Ketsman","sequence":"first","affiliation":[{"name":"Vrije Universiteit Brussel, Brussel, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7143-1903","authenticated-orcid":false,"given":"Frank","family":"Neven","sequence":"additional","affiliation":[{"name":"UHasselt, Data Science Institute, ACSL, Hasselt, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7212-4625","authenticated-orcid":false,"given":"Brecht","family":"Vandevoort","sequence":"additional","affiliation":[{"name":"UHasselt, Data Science Institute, ACSL, Hasselt, Belgium"}]}],"member":"320","published-online":{"date-parts":[[2025,2,17]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"4:1","article-title":"GYM: A multiround distributed join algorithm","author":"Afrati Foto N.","year":"2017","unstructured":"Foto N. Afrati, Manas R. Joglekar, Christopher R\u00e9, Semih Salihoglu, and Jeffrey D. Ullman. 2017. GYM: A multiround distributed join algorithm. In ICDT, 4:1\u20134:18.","journal-title":"ICDT"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.14778\/2535570.2488334"},{"key":"e_1_3_2_4_2","first-page":"99","article-title":"Optimizing joins in a map-reduce environment","author":"Afrati Foto N.","year":"2010","unstructured":"Foto N. Afrati and Jeffrey D. Ullman. 2010. Optimizing joins in a map-reduce environment. In EDBT, 99\u2013110.","journal-title":"EDBT"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/2949741.2949750"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3106412"},{"key":"e_1_3_2_7_2","first-page":"212","article-title":"Skew in parallel query processing","author":"Beame Paul","year":"2014","unstructured":"Paul Beame, Paraschos Koutris, and Dan Suciu. 2014. Skew in parallel query processing. In PODS. ACM, 212\u2013223.","journal-title":"PODS"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/153850.153856"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-008-0122-1"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/93597.98724"},{"key":"e_1_3_2_11_2","first-page":"9:1","article-title":"Parallel-correctness and containment for conjunctive queries with union and negation","author":"Geck Gaetano","year":"2016","unstructured":"Gaetano Geck, Bas Ketsman, Frank Neven, and Thomas Schwentick. 2016. Parallel-correctness and containment for conjunctive queries with union and negation. In ICDT, 9:1\u20139:17.","journal-title":"ICDT"},{"key":"e_1_3_2_12_2","first-page":"13:1","volume-title":"ICDT","volume":"155","author":"Geck Gaetano","year":"2020","unstructured":"Gaetano Geck, Frank Neven, and Thomas Schwentick. 2020. Distribution constraints: The chase for distributed data. In ICDT. LIPIcs, Vol. 155, Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik, 13:1\u201313:19."},{"key":"e_1_3_2_13_2","unstructured":"Hadoop. 2024. Retrieved from https:\/\/hadoop.apache.org\/"},{"issue":"5","key":"e_1_3_2_14_2","first-page":"965","article-title":"Distribution policies for datalog","volume":"64","author":"Ketsman Bas","year":"2020","unstructured":"Bas Ketsman, Aws Albarghouthi, and Paraschos Koutris. 2020. Distribution policies for datalog. TOCS 64, 5 (2020), 965\u2013998.","journal-title":"TOCS"},{"key":"e_1_3_2_15_2","first-page":"18:1","volume-title":"ICDT","volume":"98","author":"Ketsman Bas","year":"2018","unstructured":"Bas Ketsman, Frank Neven, and Brecht Vandevoort. 2018. Parallel-correctness and transferability for conjunctive queries under bag semantics. In ICDT. LIPIcs, Vol. 98, Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik, 18:1\u201318:16."},{"key":"e_1_3_2_16_2","first-page":"417","article-title":"A worst-case optimal multi-round algorithm for parallel computation of conjunctive queries","author":"Ketsman Bas","year":"2017","unstructured":"Bas Ketsman and Dan Suciu. 2017. A worst-case optimal multi-round algorithm for parallel computation of conjunctive queries. In PODS, 417\u2013428.","journal-title":"PODS"},{"key":"e_1_3_2_17_2","first-page":"223","article-title":"Parallel evaluation of conjunctive queries","author":"Koutris Paraschos","year":"2011","unstructured":"Paraschos Koutris and Dan Suciu. 2011. Parallel evaluation of conjunctive queries. In PODS, 223\u2013234.","journal-title":"PODS"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989444"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564757"},{"key":"e_1_3_2_20_2","first-page":"22:1","volume-title":"ICDT","volume":"186","author":"Sundarmurthy Bruhathi","year":"2021","unstructured":"Bruhathi Sundarmurthy, Paraschos Koutris, and Jeffrey F. Naughton. 2021. Locality-aware distribution schemes. In ICDT. LIPIcs, Vol. 186, Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik, 22:1\u201322:25."},{"key":"e_1_3_2_21_2","first-page":"331","article-title":"The complexity of querying indefinite data about linearly ordered domains","author":"Meyden Ron van der","year":"1992","unstructured":"Ron van der Meyden. 1992. The complexity of querying indefinite data about linearly ordered domains. In PODS, 331\u2013345.","journal-title":"PODS"},{"key":"e_1_3_2_22_2","first-page":"13","article-title":"Shark: SQL and rich analytics at scale","author":"Xin Reynold S.","year":"2013","unstructured":"Reynold S. Xin, Josh Rosen, Matei Zaharia, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2013. Shark: SQL and rich analytics at scale. In SIGMOD, 13\u201324.","journal-title":"SIGMOD"}],"container-title":["ACM Transactions on Computational Logic"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712291","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3712291","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:10:28Z","timestamp":1750295428000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712291"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,17]]},"references-count":21,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,4,30]]}},"alternative-id":["10.1145\/3712291"],"URL":"https:\/\/doi.org\/10.1145\/3712291","relation":{},"ISSN":["1529-3785","1557-945X"],"issn-type":[{"type":"print","value":"1529-3785"},{"type":"electronic","value":"1557-945X"}],"subject":[],"published":{"date-parts":[[2025,2,17]]},"assertion":[{"value":"2022-10-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}