{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T02:40:57Z","timestamp":1755830457985,"version":"3.44.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,3,12]],"date-time":"2024-03-12T00:00:00Z","timestamp":1710201600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF","award":["2008815"],"award-info":[{"award-number":["2008815"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,3,12]]},"abstract":"<jats:p>Q-error -- the standard metric for quantifying the error of individual cardinality estimates -- has been widely adopted as a surrogate for query plan optimality in recent work on learning-based cardinality estimation. However, the only result connecting Q-error with plan optimality is an upper-bound on the cost of the worst possible query plan computed from a set of cardinality estimates---there is no connection between Q-error and the real plans generated by standard query optimizers. Therefore, in order to identify sub-optimal query plans, we propose a learning-based method having as its main feature a novel measure called L1-error. Similar to Q-error, L1-error requires complete knowledge of true cardinalities and estimates for all the sub-plans of a query plan. Unlike Q-error, which considers the estimates independently, L1-error is defined as a permutation distance between true cardinalities and estimates for all the sub-plans having the same number of joins. Moreover, L1-error takes into account errors relative to the magnitude of their cardinalities and gives larger weight to small multi-way joins. Our experimental results confirm that, when L1-error is integrated into a standard decision tree classifier, it leads to the accurate identification of sub-optimal plans across four different benchmarks. This accuracy can be further improved by combining L1-error with Q-error into a composite feature that can be computed without overhead from the same data.<\/jats:p>","DOI":"10.1145\/3639272","type":"journal-article","created":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T18:51:32Z","timestamp":1711479092000},"page":"1-24","source":"Crossref","is-referenced-by-count":0,"title":["Sub-optimal Join Order Identification with L1-error"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3484-2535","authenticated-orcid":false,"given":"Yesdaulet","family":"Izenov","sequence":"first","affiliation":[{"name":"University of California, Merced, Merced, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-7161-8444","authenticated-orcid":false,"given":"Asoke","family":"Datta","sequence":"additional","affiliation":[{"name":"University of California, Merced, Merced, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8407-7563","authenticated-orcid":false,"given":"Brian","family":"Tsan","sequence":"additional","affiliation":[{"name":"University of California, Merced, Merced, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7018-9043","authenticated-orcid":false,"given":"Florin","family":"Rusu","sequence":"additional","affiliation":[{"name":"University of California, Merced, Merced, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,3,26]]},"reference":[{"unstructured":"Peter Boncz. [n. d.]. The IMDB Dataset. http:\/\/homepages.cwi.nl\/~boncz\/job\/imdb.tgz.","key":"e_1_2_2_1_1"},{"key":"e_1_2_2_2_1","volume-title":"TPCTC","author":"Boncz Peter","year":"2017","unstructured":"Peter Boncz, Angelos-Christos Anatiotis, and Steffen Klabe. 2017. JCC-H: Adding Join Crossing Correlations with Skew to TPC-H. In TPCTC 2017. 103--119."},{"key":"e_1_2_2_3_1","volume-title":"Polynomial Heuristics for Query Optimization. In ICDE","author":"Bruno Nicolas","year":"2010","unstructured":"Nicolas Bruno, C\u00e9sar Galindo-Legaria, and Milind Joshi. 2010. Polynomial Heuristics for Query Optimization. In ICDE 2010. 589--600."},{"key":"e_1_2_2_4_1","first-page":"18","article-title":"Pessimistic Cardinality Estimation","volume":"2019","author":"Cai Walter","year":"2019","unstructured":"Walter Cai, Magdalena Balazinska, and Dan Suciu. 2019. Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities. In SIGMOD 2019. 18--35.","journal-title":"Tighter Upper Bounds for Intermediate Join Cardinalities. In SIGMOD"},{"key":"e_1_2_2_5_1","volume-title":"An Overview of Query Optimization in Relational Systems. In PODS","author":"Chaudhuri Surajit","year":"1998","unstructured":"Surajit Chaudhuri. 1998. An Overview of Query Optimization in Relational Systems. In PODS 1998. 34--43."},{"key":"e_1_2_2_6_1","volume-title":"Optimal Top-Down Join Enumeration. In SIGMOD","author":"DeHaan David","year":"2007","unstructured":"David DeHaan and Frank Tompa. 2007. Optimal Top-Down Join Enumeration. In SIGMOD 2007. 785--796."},{"doi-asserted-by":"publisher","key":"e_1_2_2_7_1","DOI":"10.14778\/3484224.3484234"},{"doi-asserted-by":"publisher","key":"e_1_2_2_8_1","DOI":"10.14778\/3329772.3329780"},{"doi-asserted-by":"publisher","key":"e_1_2_2_9_1","DOI":"10.1007\/BFb0054528"},{"key":"e_1_2_2_10_1","volume-title":"Efficiently Computing Join Orders with Heuristic Search. PACMMOD 1, 1","author":"Haffner Immanuel","year":"2023","unstructured":"Immanuel Haffner and Jens Dittrich. 2023. Efficiently Computing Join Orders with Heuristic Search. PACMMOD 1, 1 (2023)."},{"key":"e_1_2_2_11_1","first-page":"752","article-title":"Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation","volume":"15","author":"Han Yuxing","year":"2022","unstructured":"Yuxing Han, Ziniu Wu, Peizhi Wu, Rong Zhu, Jingyi Yang, Liang Wei Tan, Kai Zeng, Gao Cong, Yanzhao Qin, Andreas Pfadler, Zhengping Qian, Jingren Zhou, Jiangneng Li, and Bin Cui. 2022. Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation. PVLDB 15, 4 (2022), 752--765.","journal-title":"PVLDB"},{"doi-asserted-by":"publisher","key":"e_1_2_2_12_1","DOI":"10.14778\/3384345.3384349"},{"doi-asserted-by":"publisher","key":"e_1_2_2_13_1","DOI":"10.1145\/1270.1498"},{"doi-asserted-by":"publisher","key":"e_1_2_2_14_1","DOI":"10.1145\/119995.115835"},{"doi-asserted-by":"crossref","unstructured":"Yesdaulet Izenov. 2024. Sub-optimal Join Order Identification with L1-error. https:\/\/github.com\/yizenov\/l1-error.","key":"e_1_2_2_15_1","DOI":"10.1145\/3639272"},{"key":"e_1_2_2_16_1","first-page":"106","article-title":"COMPASS","volume":"2021","author":"Izenov Yesdaulet","year":"2021","unstructured":"Yesdaulet Izenov, Asoke Datta, Florin Rusu, and Jun Hyung Shin. 2021. COMPASS: Online Sketch-based Query Optimization for In-memory Databases. In SIGMOD 2021. 106--117.","journal-title":"Online Sketch-based Query Optimization for In-memory Databases. In SIGMOD"},{"key":"e_1_2_2_17_1","volume-title":"Online Sketch-based Query Optimization. CoRR arXiv:2102.02440v1","author":"Izenov Yesdaulet","year":"2021","unstructured":"Yesdaulet Izenov, Asoke Datta, Florin Rusu, and Jun Hyung Shin. 2021. Online Sketch-based Query Optimization. CoRR arXiv:2102.02440v1 (2021)."},{"doi-asserted-by":"publisher","key":"e_1_2_2_18_1","DOI":"10.1093\/biomet\/30.1-2.81"},{"doi-asserted-by":"publisher","key":"e_1_2_2_19_1","DOI":"10.14778\/3151106.3151112"},{"unstructured":"Andreas Kipf. [n. d.]. JOB-light Benchmark. https:\/\/github.com\/andreaskipf\/learnedcardinalities\/blob\/master\/workloads\/job-light.sql.","key":"e_1_2_2_20_1"},{"key":"e_1_2_2_21_1","volume-title":"Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR","author":"Kipf Andreas","year":"2019","unstructured":"Andreas Kipf, Thomas Kipf, Bernhard Radke, Victor Leis, Peter Boncz, and Alfons Kemper. 2019. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR 2019."},{"doi-asserted-by":"publisher","key":"e_1_2_2_22_1","DOI":"10.1145\/352958.352982"},{"key":"e_1_2_2_23_1","volume-title":"Generalized Distances Between Rankings. In WWW","author":"Kumar Ravi","year":"2010","unstructured":"Ravi Kumar and Sergei Vassilvitskii. 2010. Generalized Distances Between Rankings. In WWW 2010. 571--580."},{"doi-asserted-by":"publisher","key":"e_1_2_2_24_1","DOI":"10.1007\/s41019-020-00149-7"},{"doi-asserted-by":"publisher","key":"e_1_2_2_25_1","DOI":"10.14778\/2850583.2850594"},{"doi-asserted-by":"publisher","key":"e_1_2_2_26_1","DOI":"10.1007\/s00778-017-0480-7"},{"key":"e_1_2_2_27_1","volume-title":"Q-error Bounds of Random Uniform Sampling for Cardinality Estimation. CoRR arXiv:2108.02715v2","author":"Li Beibin","year":"2021","unstructured":"Beibin Li, Yao Lu, Chi Wang, and Srikanth Kandula. 2021. Q-error Bounds of Random Uniform Sampling for Cardinality Estimation. CoRR arXiv:2108.02715v2 (2021)."},{"doi-asserted-by":"publisher","key":"e_1_2_2_28_1","DOI":"10.1016\/j.cogsys.2017.03.001"},{"unstructured":"Guy Lohman. 2014. Is Query Optimization a Solved Problem? https:\/\/wp.sigmod.org\/?p=1075.","key":"e_1_2_2_29_1"},{"key":"e_1_2_2_30_1","first-page":"1822","article-title":"Counter Strike: Generic Top-Down Join Enumeration for Hypergraphs","volume":"6","author":"Moerkotte Guido","year":"2013","unstructured":"Guido Moerkotte and Pit Fender. 2013. Counter Strike: Generic Top-Down Join Enumeration for Hypergraphs. PVLDB 6, 14 (2013), 1822--1833.","journal-title":"PVLDB"},{"key":"e_1_2_2_31_1","volume-title":"Analysis of Two Existing and One New Dynamic Programming Algorithm for the Generation of Optimal Bushy Join Trees Without Cross Products. In VLDB","author":"Moerkotte Guido","year":"2006","unstructured":"Guido Moerkotte and Thomas Neumann. 2006. Analysis of Two Existing and One New Dynamic Programming Algorithm for the Generation of Optimal Bushy Join Trees Without Cross Products. In VLDB 2006. 930--941."},{"key":"e_1_2_2_32_1","volume-title":"Dynamic Programming Strikes Back. In SIGMOD","author":"Moerkotte Guido","year":"2008","unstructured":"Guido Moerkotte and Thomas Neumann. 2008. Dynamic Programming Strikes Back. In SIGMOD 2008. 539--552."},{"doi-asserted-by":"publisher","key":"e_1_2_2_33_1","DOI":"10.14778\/1687627.1687738"},{"doi-asserted-by":"publisher","key":"e_1_2_2_34_1","DOI":"10.14778\/3476249.3476259"},{"doi-asserted-by":"publisher","key":"e_1_2_2_35_1","DOI":"10.1109\/ICDEW49219.2020.00034"},{"key":"e_1_2_2_36_1","first-page":"403","article-title":"Query Simplification","volume":"2009","author":"Neumann Thomas","year":"2009","unstructured":"Thomas Neumann. 2009. Query Simplification: Graceful Degradation for Join-Order Optimization. In SIGMOD 2009. 403--414.","journal-title":"Graceful Degradation for Join-Order Optimization. In SIGMOD"},{"unstructured":"Open Source Relational Database [n. d.]. PostgreSQL. www.postgresql.org.","key":"e_1_2_2_37_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_38_1","DOI":"10.1109\/ICDE.2019.00191"},{"unstructured":"Greg Rahn. [n. d.]. Join Order Benchmark (JOB). https:\/\/github.com\/gregrahn\/join-order-benchmark.","key":"e_1_2_2_39_1"},{"key":"e_1_2_2_40_1","volume-title":"SIGMOD","author":"Selinger P. Griffiths","year":"1979","unstructured":"P. Griffiths Selinger, M. M. Astrahan, D. D. Chamberlain, R. A. Lorie, and T. G. Price. 1979. Access Path Selection in a Relational Database Management System. In SIGMOD 1979. 23--34."},{"doi-asserted-by":"publisher","key":"e_1_2_2_41_1","DOI":"10.2307\/1412159"},{"doi-asserted-by":"publisher","key":"e_1_2_2_42_1","DOI":"10.1007\/s007780050040"},{"key":"e_1_2_2_43_1","first-page":"367","article-title":"Optimization of Large Join Queries","volume":"1989","author":"Swami Arun","year":"1989","unstructured":"Arun Swami. 1989. Optimization of Large Join Queries: Combining Heuristics and Combinatorial Techniques. In SIGMOD 1989. 367--376.","journal-title":"Combining Heuristics and Combinatorial Techniques. In SIGMOD"},{"doi-asserted-by":"publisher","key":"e_1_2_2_44_1","DOI":"10.14778\/3461535.3461552"},{"doi-asserted-by":"publisher","key":"e_1_2_2_45_1","DOI":"10.14778\/3236187.3236191"},{"doi-asserted-by":"publisher","key":"e_1_2_2_46_1","DOI":"10.14778\/3421424.3421432"},{"doi-asserted-by":"publisher","key":"e_1_2_2_47_1","DOI":"10.14778\/3368289.3368294"},{"doi-asserted-by":"publisher","key":"e_1_2_2_48_1","DOI":"10.14778\/3461535.3461539"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639272","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3639272","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T15:13:48Z","timestamp":1755789228000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639272"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,12]]},"references-count":48,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3,12]]}},"alternative-id":["10.1145\/3639272"],"URL":"https:\/\/doi.org\/10.1145\/3639272","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2024,3,12]]}}}