{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T22:20:34Z","timestamp":1772835634734,"version":"3.50.1"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"8","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,4]]},"abstract":"<jats:p>Acyclic join queries can be evaluated instance-optimally using Yannakakis' algorithm, which avoids needlessly large intermediate results through semi-join passes. Recent work proposes to address the significant hidden constant factors arising from a naive implementation of Yannakakis by decomposing the hash join operator into two suboperators, called Lookup and Expand. We present a novel method for integrating Lookup and Expand plans in interpreted environments, like column stores, formalizing them using Nested Semijoin Algebra (NSA) and implementing them through a shredding approach. We characterize the class of NSA expressions that can be evaluated instance-optimally as those that are 2-phase: no 'shrinking' operator is applied after an unnest (i.e., expand). We introduce Shredded Yannakakis (SYA), an evaluation algorithm for acyclic joins that, starting from a binary join plan, transforms it into a 2-phase NSA plan, and then evaluates it through the shredding technique. We show that SYA is provably robust (i.e., never produces large intermediate results) and without regret (i.e., is never worse than the binary join plan under a suitable cost model) on the class of well-behaved binary join plans. Our experiments on a suite of 1,849 queries show that SYA improves performance for 85.3% of the queries with speedups up to 62.5x, while remaining competitive on the other queries.<\/jats:p>","DOI":"10.14778\/3742728.3742737","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T13:32:53Z","timestamp":1756906373000},"page":"2413-2426","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores"],"prefix":"10.14778","volume":"18","author":[{"given":"Liese","family":"Bekkers","sequence":"first","affiliation":[{"name":"UHasselt, Data Science Institute"}]},{"given":"Frank","family":"Neven","sequence":"additional","affiliation":[{"name":"UHasselt, Data Science Institute"}]},{"given":"Stijn","family":"Vansummeren","sequence":"additional","affiliation":[{"name":"UHasselt, Data Science Institute"}]},{"given":"Yisu Remy","family":"Wang","sequence":"additional","affiliation":[{"name":"University of California, Los Angeles"}]}],"member":"320","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3129246"},{"key":"e_1_2_1_2_1","volume-title":"21st International Workshop, CSL 2007, 16th Annual Conference of the EACSL, Lausanne, Switzerland, September 11\u201315, 2007, Proceedings (Lecture Notes in Computer Science","volume":"222","author":"Bagan Guillaume","year":"2007","unstructured":"Guillaume Bagan, Arnaud Durand, and Etienne Grandjean. 2007. On Acyclic Conjunctive Queries and Constant Delay Enumeration. In Computer Science Logic, 21st International Workshop, CSL 2007, 16th Annual Conference of the EACSL, Lausanne, Switzerland, September 11\u201315, 2007, Proceedings (Lecture Notes in Computer Science, Vol. 4646), Jacques Duparc and Thomas A. Henzinger (Eds.). Springer, 208\u2013222. 10.1007\/978-3-540-74915-8_18"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 13th Annual ACM Symposium on Theory of Computing, May 11\u201313","author":"Beeri Catriel","year":"1981","unstructured":"Catriel Beeri, Ronald Fagin, David Maier, Alberto O. Mendelzon, Jeffrey D. Ullman, and Mihalis Yannakakis. 1981. Properties of Acyclic Database Schemes. In Proceedings of the 13th Annual ACM Symposium on Theory of Computing, May 11\u201313, 1981, Milwaukee, Wisconsin, USA. ACM, 355\u2013362. 10.1145\/800076.802489"},{"key":"e_1_2_1_4_1","unstructured":"Liese Bekkers Frank Neven Stijn Vansummeren and Yisu Remy Wang. 2025. Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores. Technical Report. Full paper version available at https:\/\/arxiv.org\/abs\/2411.04042."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/3681954.3681995"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(95)00024-Q"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529339"},{"key":"e_1_2_1_8_1","volume-title":"International Conference on Management of Data, SIGMOD 2014","author":"Cheney James","year":"2014","unstructured":"James Cheney, Sam Lindley, and Philip Wadler. 2014. Query shredding: efficient relational evaluation of queries over nested multisets. In International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22\u201327, 2014, Curtis E. Dyreson, Feifei Li, and M. Tamer \u00d6zsu (Eds.). ACM, 1027\u20131038. 10.1145\/2588555.2612186"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(99)00301-1"},{"key":"e_1_2_1_10_1","volume-title":"VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7\u201310","author":"Deutsch Alin","year":"1999","unstructured":"Alin Deutsch, Lucian Popa, and Val Tannen. 1999. Physical Data Independence, Constraints, and Optimization with Universal Plans. In VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7\u201310, 1999, Edinburgh, Scotland, UK, Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie (Eds.). Morgan Kaufmann, 459\u2013470. http:\/\/www.vldb.org\/conf\/1999\/P44.pdf"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2402.322390"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"1891","DOI":"10.14778\/3407790.3407797","article-title":"Adopting Worst-Case Optimal Joins in Relational Database Systems","volume":"13","author":"Freitag Michael J.","year":"2020","unstructured":"Michael J. Freitag, Maximilian Bandle, Tobias Schmidt, Alfons Kemper, and Thomas Neumann. 2020. Adopting Worst-Case Optimal Joins in Relational Database Systems. Proc. VLDB Endow. 13, 11 (2020), 1891\u20131904. http:\/\/www.vldb.org\/pvldb\/vol13\/p1891-freitag.pdf","journal-title":"Proc. VLDB Endow."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016","author":"Gottlob Georg","year":"2016","unstructured":"Georg Gottlob, Gianluigi Greco, Nicola Leone, and Francesco Scarcello. 2016. Hypertree Decompositions: Questions and Answers. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, Tova Milo and Wang-Chiew Tan (Eds.). ACM, 57\u201374. 10.1145\/2902251.2902309"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.02723"},{"key":"e_1_2_1_15_1","unstructured":"M. H. Graham. 1979. On the universal relation. Technical Report. University of Toronto Toronto Ontario Canada."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/3503585.3503586"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017","author":"Idris Muhammad","year":"2017","unstructured":"Muhammad Idris, Mart\u00edn Ugarte, and Stijn Vansummeren. 2017. The Dynamic Yannakakis Algorithm: Compact and Efficient Query Processing Under Updates. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14\u201319, 2017, Semih Salihoglu, Wenchao Zhou, Rada Chirkova, Jun Yang, and Dan Suciu (Eds.). ACM, 1259\u20131274. 10.1145\/3035918.3064027"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/S00778-019-00590-9"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016","author":"Khamis Mahmoud Abo","year":"2016","unstructured":"Mahmoud Abo Khamis, Hung Q. Ngo, and Atri Rudra. 2016. FAQ: Questions Asked Frequently. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, Tova Milo and Wang-Chiew Tan (Eds.). ACM, 13\u201328. 10.1145\/2902251.2902280"},{"key":"e_1_2_1_20_1","volume-title":"Modular Analytic Query Engine. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS 2024","author":"Lamb Andrew","year":"2024","unstructured":"Andrew Lamb, Yijie Shen, Dani\u00ebl Heres, Jayjeet Chakraborty, Mehmet Ozan Kabak, Liang-Chi Hsieh, and Chao Sun. 2024. Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS 2024, Santiago AA, Chile, June 9\u201315, 2024, Pablo Barcel\u00f3, Nayat S\u00e1nchez Pi, Alexandra Meliou, and S. Sudarshan (Eds.). ACM, 5\u201317. 10.1145\/3626246.3653368"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850594"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/S00778-017-0480-7"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 2022 International Conference on Management of Data","author":"Mancini Riccardo","year":"2022","unstructured":"Riccardo Mancini, Srinivas Karthik, Bikash Chandra, Vasilis Mageirakos, and Anastasia Ailamaki. 2022. Efficient Massively Parallel Join Optimization for Large Queries. In Proceedings of the 2022 International Conference on Management of Data (Philadelphia, PA, USA) (SIGMOD '22). Association for Computing Machinery, New York, NY, USA, 122\u2013135. 10.1145\/3514221.3517871"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342643"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002938.2002940"},{"key":"e_1_2_1_26_1","volume-title":"Companion of the 43rd Symposium on Principles of Database Systems, PODS 2024","author":"Neumann Thomas","year":"2024","unstructured":"Thomas Neumann. 2024. Closing the Gap between Theory and Practice in Query Optimization. In Companion of the 43rd Symposium on Principles of Database Systems, PODS 2024, Santiago, Chile, June 9\u201315, 2024. ACM, 4. 10.1145\/3635138.3654765"},{"key":"e_1_2_1_27_1","volume-title":"Freitag","author":"Neumann Thomas","year":"2020","unstructured":"Thomas Neumann and Michael J. Freitag. 2020. Umbra: A Disk-Based System with In-Memory Performance. In 10th Conference on Innovative Data Systems Research, CIDR 2020, Amsterdam, The Netherlands, January 12\u201315, 2020, Online Proceedings. www.cidrdb.org. http:\/\/cidrdb.org\/cidr2020\/papers\/p29-neumann-cidr20.pdf"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 2018 International Conference on Management of Data","author":"Neumann Thomas","year":"2018","unstructured":"Thomas Neumann and Bernhard Radke. 2018. Adaptive Optimization of Very Large Join Queries. In Proceedings of the 2018 International Conference on Management of Data (Houston, TX, USA) (SIGMOD '18). Association for Computing Machinery, New York, NY, USA, 677\u2013692. 10.1145\/3183713.3183733"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","author":"Ngo Hung Q.","year":"2018","unstructured":"Hung Q. Ngo. 2018. Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Houston, TX, USA, June 10\u201315, 2018, Jan Van den Bussche and Marcelo Arenas (Eds.). ACM, 111\u2013124. 10.1145\/3196959.3196990"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180143"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2656335"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 2019 International Conference on Management of Data, SIGMOD Conference 2019","author":"Raasveldt Mark","year":"2019","unstructured":"Mark Raasveldt and Hannes M\u00fchleisen. 2019. DuckDB: an Embeddable Analytical Database. In Proceedings of the 2019 International Conference on Management of Data, SIGMOD Conference 2019, Amsterdam, The Netherlands, June 30-July 5, 2019, Peter A. Boncz, Stefan Manegold, Anastasia Ailamaki, Amol Deshpande, and Tim Kraska (Eds.). ACM, 1981\u20131984. 10.1145\/3299869.3320212"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/3430915.3442441"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 17th International Conference on Data Engineering, April 2\u20136, 2001","author":"Stocker Konrad","year":"2001","unstructured":"Konrad Stocker, Donald Kossmann, Reinhard Braumandl, and Alfons Kemper. 2001. Integrating Semi-Join-Reducers into State of the Art Query Processors. In Proceedings of the 17th International Conference on Data Engineering, April 2\u20136, 2001, Heidelberg, Germany, Dimitrios Georgakopoulos and Alexander Buchmann (Eds.). IEEE Computer Society, 575\u2013584. 10.1109\/ICDE.2001.914872"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1137\/0213035"},{"key":"e_1_2_1_36_1","first-page":"269","article-title":"Nested Relational","volume":"3","author":"Thomas Stan J.","year":"1986","unstructured":"Stan J. Thomas and Patrick C. Fischer. 1986. Nested Relational Structures. Adv. Comput. Res. 3 (1986), 269\u2013307.","journal-title":"Structures. Adv. Comput. Res."},{"key":"e_1_2_1_37_1","volume-title":"Proc. 17th International Conference on Database Theory (ICDT)","author":"Veldhuizen Todd L.","year":"2014","unstructured":"Todd L. Veldhuizen. 2014. Triejoin: A Simple, Worst-Case Optimal Join Algorithm. In Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24\u201328, 2014, Nicole Schweikardt, Vassilis Christophides, and Vincent Leroy (Eds.). OpenProceedings.org, 96\u2013106. 10.5441\/002\/ICDT.2014.13"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 25\u201328","author":"Wong Limsoon","year":"1993","unstructured":"Limsoon Wong. 1993. Normal Forms and Conservative Properties for Query Languages over Collection Types. In Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 25\u201328, 1993, Washington, DC, USA, Catriel Beeri (Ed.). ACM Press, 26\u201336. 10.1145\/153850.153853"},{"key":"e_1_2_1_39_1","volume-title":"7th International Conference, September 9\u201311","author":"Yannakakis Mihalis","year":"1981","unstructured":"Mihalis Yannakakis. 1981. Algorithms for Acyclic Database Schemes. In Very Large Data Bases, 7th International Conference, September 9\u201311, 1981, Cannes, France, Proceedings. IEEE Computer Society, 82\u201394."},{"key":"e_1_2_1_40_1","volume-title":"COMPSAC 1979","author":"Yu C. T.","year":"1979","unstructured":"C. T. Yu and M. Z. Ozsoyoglu. 1979. An algorithm for tree-query membership of a distributed query. In The IEEE Computer Society's Third International Computer Software and Applications Conference, COMPSAC 1979, 6\u20138 November, 1979, Chicago, Illinois, USA. IEEE, 306\u2013312. 10.1109\/CMPSAC.1979.762509"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3742728.3742737","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T13:36:57Z","timestamp":1756906617000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3742728.3742737"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4]]},"references-count":40,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,4]]}},"alternative-id":["10.14778\/3742728.3742737"],"URL":"https:\/\/doi.org\/10.14778\/3742728.3742737","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,4]]},"assertion":[{"value":"2025-09-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}