{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T00:41:20Z","timestamp":1722991280179},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"9","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,5]]},"abstract":"<jats:p>Window function expressions (WFEs) became part of the SQL:2003 standard, and since then, they have often been implemented in database systems (DBS). They are especially essential to OLAP DBSs, and people use them daily. Even though WFEs are a heavily used part of the SQL language, the amount of research done on their optimization in the last two decades is not significant.<\/jats:p>\n          <jats:p>WFE does not extend the expressive power of the SQL language, but it makes writing SQL queries easier and more transparent. DBSs always compile SQL queries with WFE using a sequence of partition-sort-compute operators, which we call a linear strategy. Plans resulting from the linear strategy are robust and, in many cases, efficient.<\/jats:p>\n          <jats:p>This article introduces an alternative strategy using a self-join, which is not considered in the current DBSs. We call it the self-join strategy, and it is based on an SQL query transformation where the result query uses a self-join query plan to compute WFE. One output of this work is a tool that can automatically perform such SQL query transformations.<\/jats:p>\n          <jats:p>We created a microbenchmark showing that the self-join strategy is more effective than the linear strategy in many cases. We also performed a cost-based experiment to evaluate the query optimizers' ability to select an appropriate strategy. The article's main aim is to show that usage of the self-join strategy for queries with WFE is beneficial if selected in a cost-based manner.<\/jats:p>","DOI":"10.14778\/3665844.3665848","type":"journal-article","created":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T22:19:07Z","timestamp":1722982747000},"page":"2162-2174","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Window Function Expression: Let the Self-Join Enter"],"prefix":"10.14778","volume":"17","author":[{"given":"Radim","family":"Ba\u010da","sequence":"first","affiliation":[{"name":"VSB - Technical Universitty of Ostrava, Ostrava, Czech Republic"}]}],"member":"320","published-online":{"date-parts":[[2024,8,6]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"1026","article-title":"Cost-based Query Transformation in Oracle","volume":"6","author":"Ahmed Rafi","year":"2006","unstructured":"Rafi Ahmed, Allison Lee, Andrew Witkowski, Dinesh Das, Hong Su, Mohamed Zait, and Thierry Cruanes. 2006. Cost-based Query Transformation in Oracle. In VLDB, Vol. 6. 1026--1036.","journal-title":"VLDB"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the Thirtieth international conference on Very large data bases","volume":"4","author":"Arasu Arvind","year":"2004","unstructured":"Arvind Arasu and Jennifer Widom. 2004. Resource Sharing in Continuous Sliding-window Aggregates. In Proceedings of the Thirtieth international conference on Very large data bases, Vol. 4. 336--347."},{"key":"e_1_2_1_3_1","volume-title":"DuckDB - Window Functions. Retrieved","author":"DB","year":"2023","unstructured":"DuckDB authors. 2023. DuckDB - Window Functions. Retrieved July 8, 2023. https:\/\/duckdb.org\/docs\/sql\/window_functions"},{"key":"e_1_2_1_4_1","volume-title":"Retrieved","author":"Impala","year":"2023","unstructured":"Impala authors. 2023. Subqueries in Impala SELECT Statements. Retrieved August 22, 2023. https:\/\/impala.apache.org\/docs\/build\/html\/topics\/impala_subqueries.html"},{"key":"e_1_2_1_5_1","volume-title":"MySQL - Window Functions. Retrieved","author":"SQL","year":"2023","unstructured":"MySQL authors. 2023. MySQL - Window Functions. Retrieved July 8, 2023. https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/window-functions.html"},{"key":"e_1_2_1_6_1","volume-title":"Retrieved","author":"SQL","year":"2023","unstructured":"PostgreSQL authors. 2023. PostgreSql 14.2. Retrieved August 20, 2022. https:\/\/www.postgresql.org\/docs\/14\/release-14.html"},{"key":"e_1_2_1_7_1","volume-title":"SQL Performance Tips 1. Retrieved","author":"Banhudo Guilherme","year":"2024","unstructured":"Guilherme Banhudo. 2020. SQL Performance Tips 1. Retrieved January 2, 2024. https:\/\/towardsdatascience.com\/sql-performance-tips-1-50eb318cd0e5"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"1366","DOI":"10.14778\/1687553.1687563","article-title":"Enhanced Subquery Optimizations in Oracle","volume":"2","author":"Bellamkonda Srikanth","year":"2009","unstructured":"Srikanth Bellamkonda, Rafi Ahmed, Andrew Witkowski, Angela Amor, Mohamed Zait, and Chun-Chieh Lin. 2009. Enhanced Subquery Optimizations in Oracle. Proceedings of the VLDB Endowment 2, 2 (2009), 1366--1377.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"2588","DOI":"10.1016\/j.tcs.2010.05.003","article-title":"Towards Optimal Range Medians","volume":"412","author":"Brodal Gerth St\u00f8lting","year":"2011","unstructured":"Gerth St\u00f8lting Brodal, Beat Gfeller, Allan Gr\u00f8nlund J\u00f8rgensen, and Peter Sanders. 2011. Towards Optimal Range Medians. Theoretical Computer Science 412, 24 (2011), 2588--2601.","journal-title":"Theoretical Computer Science"},{"key":"e_1_2_1_10_1","volume-title":"Data Structures for Range Median Queries. In International Symposium on Algorithms and Computation. Springer, 822--831","author":"Brodal Gerth St\u00f8lting","year":"2009","unstructured":"Gerth St\u00f8lting Brodal and Allan Gr\u00f8nlund J\u00f8rgensen. 2009. Data Structures for Range Median Queries. In International Symposium on Algorithms and Computation. Springer, 822--831."},{"key":"e_1_2_1_11_1","first-page":"11","volume-title":"Proceedings of the VLDB Endowment 5","author":"Cao Yu","year":"2012","unstructured":"Yu Cao, Chee-Yong Chan, Jie Li, and Kian-Lee Tan. 2012. Optimization of Analytic Window Functions. Proceedings of the VLDB Endowment 5, 11 (2012)."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. ACM, 34--43","author":"Chaudhuri Surajit","year":"1998","unstructured":"Surajit Chaudhuri. 1998. An Overview of Query Optimization in Relational Systems. In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. ACM, 34--43."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the VLDB Conference. VLDB Endowment, 197--208","author":"Dayal Umeshwar","year":"1987","unstructured":"Umeshwar Dayal. 1987. Of Nests aud Trees: A Untied Approach to Processing Queries That Contain Nested Subqueries, Aggregates, and Quantifiers. In Proceedings of the VLDB Conference. VLDB Endowment, 197--208."},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"1206","DOI":"10.14778\/3389133.3389138","article-title":"Quantifying TPC-H Choke Points and Their Optimizations","volume":"13","author":"Dreseler Markus","year":"2020","unstructured":"Markus Dreseler, Martin Boissier, Tilmann Rabl, and Matthias Uflacker. 2020. Quantifying TPC-H Choke Points and Their Optimizations. Proceedings of the VLDB Endowment 13, 8 (2020), 1206--1220.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_15_1","volume-title":"Range Medians. In European Symposium on Algorithms. Springer, 503--514","author":"Har-Peled Sariel","year":"2008","unstructured":"Sariel Har-Peled and S Muthukrishnan. 2008. Range Medians. In European Symposium on Algorithms. Springer, 503--514."},{"key":"e_1_2_1_16_1","unstructured":"2023 Hive authors. Retrieved August 22. 2023. Language Manual Sub-Queries. https:\/\/cwiki.apache.org\/confluence\/display\/Hive\/LanguageManual+SubQueries"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3589275","article-title":"T-Rex: Optimizing Pattern Search on Time Series","volume":"1","author":"Huang Silu","year":"2023","unstructured":"Silu Huang, Erkang Zhu, Surajit Chaudhuri, and Leonhard Spiegelberg. 2023. T-Rex: Optimizing Pattern Search on Time Series. Proceedings of the ACM on Management of Data 1, 2 (2023), 1--26.","journal-title":"Proceedings of the ACM on Management of Data"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1145\/234313.234367","article-title":"Query Optimization","volume":"28","author":"Ioannidis Yannis E","year":"1996","unstructured":"Yannis E Ioannidis. 1996. Query Optimization. ACM Computing Surveys (CSUR) 28, 1 (1996), 121--123.","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 805--813","author":"J\u00f8rgensen Allan Gr\u00f8nlund","year":"2011","unstructured":"Allan Gr\u00f8nlund J\u00f8rgensen and Kasper Green Larsen. 2011. Range Selection and Median: Tight Cell Probe Lower Bounds and Adaptive Data Structures. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 805--813."},{"key":"e_1_2_1_20_1","volume-title":"2011 IEEE 27th International Conference on Data Engineering. IEEE, 195--206","author":"Kemper Alfons","year":"2011","unstructured":"Alfons Kemper and Thomas Neumann. 2011. HyPer: A Hybrid OLTP&OLAP Main Memory Database System Based on Virtual Memory snapshots. In 2011 IEEE 27th International Conference on Data Engineering. IEEE, 195--206."},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00778-021-00676-3","article-title":"Data Dependencies for Query Optimization: a Survey","volume":"31","author":"Kossmann Jan","year":"2022","unstructured":"Jan Kossmann, Thorsten Papenbrock, and Felix Naumann. 2022. Data Dependencies for Query Optimization: a Survey. The VLDB Journal 31, 1 (2022), 1--22.","journal-title":"The VLDB Journal"},{"key":"e_1_2_1_22_1","volume-title":"Islands and Gaps in Sequential Numbers. Retrieved","author":"Kozak Alexander","year":"2024","unstructured":"Alexander Kozak. 2006. Islands and Gaps in Sequential Numbers. Retrieved January 2, 2024. https:\/\/learn.microsoft.com\/en-us\/previous-versions\/sql\/legacy\/aa175780(v=sql.80)"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","first-page":"1058","DOI":"10.14778\/2794367.2794375","article-title":"Efficient Processing of Window Functions in Analytical SQL Queries","volume":"8","author":"Leis Viktor","year":"2015","unstructured":"Viktor Leis, Kan Kundhikanjana, Alfons Kemper, and Thomas Neumann. 2015. Efficient Processing of Window Functions in Analytical SQL Queries. Proceedings of the VLDB Endowment 8, 10 (2015), 1058--1069.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_24_1","volume-title":"Retrieved","author":"Moerkotte Guido","year":"2023","unstructured":"Guido Moerkotte. 2023. Building Query Compilers. Retrieved October 10, 2023. http:\/\/pi3.informatik.uni-mannheim.de\/~moer\/querycompiler.pdf"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","first-page":"843","DOI":"10.14778\/3402707.3402723","article-title":"Accelerating Queries with Group-by and Join by GroupJoin","volume":"4","author":"Moerkotte Guido","year":"2011","unstructured":"Guido Moerkotte and Thomas Neumann. 2011. Accelerating Queries with Group-by and Join by GroupJoin. Proceedings of the VLDB Endowment 4, 11 (2011), 843--851.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 18th International Conference on Very Large Data Bases","volume":"92","author":"Muralikrishna M.","year":"1992","unstructured":"M. Muralikrishna. 1992. Improved Unnesting Algorithms for Join Aggregate SQL Queries. In Proceedings of the 18th International Conference on Very Large Data Bases, Vol. 92. Morgan Kaufmann Publishers Inc., 91--102."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 1996 ACM SIGMOD international conference on Management of data. ACM, 435--446","author":"Seshadri Praveen","year":"1996","unstructured":"Praveen Seshadri, Joseph M Hellerstein, Hamid Pirahesh, TY Cliff Leung, Raghu Ramakrishnan, Divesh Srivastava, Peter J Stuckey, and S Sudarshan. 1996. Cost-based optimization for magic: Algebra and implementation. In Proceedings of the 1996 ACM SIGMOD international conference on Management of data. ACM, 435--446."},{"key":"e_1_2_1_28_1","volume-title":"HammerDB: the Open Source Database Load Test Tool. Retrieved","author":"Shaw Steve","year":"2021","unstructured":"Steve Shaw. 2012. HammerDB: the Open Source Database Load Test Tool. Retrieved June 1, 2021. https:\/\/www.hammerdb.com\/"},{"key":"e_1_2_1_29_1","volume-title":"Revisited. In Proceedings of the 2018 International Conference on Management of Data. ACM, 307--322","author":"Tahboub Ruby Y","year":"2018","unstructured":"Ruby Y Tahboub, Gr\u00e9gory M Essertel, and Tiark Rompf. 2018. How to Architect a Query Compiler, Revisited. In Proceedings of the 2018 International Conference on Management of Data. ACM, 307--322."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 2022 International Conference on Management of Data. ACM, 1243--1256","author":"Vogelsgesang Adrian","year":"2022","unstructured":"Adrian Vogelsgesang, Thomas Neumann, Viktor Leis, and Alfons Kemper. 2022. Efficient Evaluation of Arbitrarily-framed Holistic SQL Aggregates and Window Functions. In Proceedings of the 2022 International Conference on Management of Data. ACM, 1243--1256."},{"key":"e_1_2_1_31_1","volume-title":"Windowing in DuckDB. Retrieved","author":"Wesley Richard","year":"2024","unstructured":"Richard Wesley. 2021. Windowing in DuckDB. Retrieved January 2, 2024. https:\/\/duckdb.org\/2021\/10\/13\/windowing.html"},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","first-page":"1221","DOI":"10.14778\/2994509.2994537","article-title":"Incremental Computation of Common Windowed Holistic Aggregates","volume":"9","author":"Wesley Richard","year":"2016","unstructured":"Richard Wesley and Fei Xu. 2016. Incremental Computation of Common Windowed Holistic Aggregates. Proceedings of the VLDB Endowment 9, 12 (2016), 1221--1232.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_33_1","volume-title":"Window Function vs. Self-join in SAP HANA. Retrieved","author":"Zhou Wenjun","year":"2024","unstructured":"Wenjun Zhou. 2014. Window Function vs. Self-join in SAP HANA. Retrieved January 2, 2024. https:\/\/blogs.sap.com\/2014\/10\/04\/window-function-vs-self-join-in-sap-hana\/"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 2003 ACM SIGMOD international conference on Management of data. 652--656","author":"Zuzarte Calisto","year":"2003","unstructured":"Calisto Zuzarte, Hamid Pirahesh, Wenbin Ma, Qi Cheng, Linqi Liu, and Kwai Wong. 2003. Winmagic: Subquery Elimination Using Window Aggregation. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data. 652--656."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3665844.3665848","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T22:30:17Z","timestamp":1722983417000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3665844.3665848"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5]]},"references-count":34,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,5]]}},"alternative-id":["10.14778\/3665844.3665848"],"URL":"https:\/\/doi.org\/10.14778\/3665844.3665848","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,5]]},"assertion":[{"value":"2024-08-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}