{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T20:23:59Z","timestamp":1781382239917,"version":"3.54.1"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2017,9]]},"abstract":"<jats:p>Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer. The accuracy of said estimates has a strong and direct impact on the quality of the generated plans, and incorrect estimates can have a negative impact on query performance. One of the biggest challenges in this field is to predict the result size of join operations.<\/jats:p><jats:p>Kernel Density Estimation (KDE) is a statistical method to estimate multivariate probability distributions from a data sample. Previously, we introduced a modern, self-tuning selectivity estimator for range scans based on KDE that out-performs state-of-the-art multidimensional histograms and is efficient to evaluate on graphics cards. In this paper, we extend these bandwidth-optimized KDE models to estimate the result size of single and multiple joins. In particular, we propose two approaches: (1) Building a KDE model from a sample drawn from the join result. (2) Efficiently combining the information from base table KDE models.<\/jats:p><jats:p>We evaluated our KDE-based join estimators on a variety of synthetic and real-world datasets, demonstrating that they are superior to state-of-the art join estimators based on sketching or sampling.<\/jats:p>","DOI":"10.14778\/3151106.3151112","type":"journal-article","created":{"date-parts":[[2017,10,19]],"date-time":"2017-10-19T12:30:08Z","timestamp":1508416208000},"page":"2085-2096","source":"Crossref","is-referenced-by-count":52,"title":["Estimating join selectivities using bandwidth-optimized kernel density models"],"prefix":"10.14778","volume":"10","author":[{"given":"Martin","family":"Kiefer","sequence":"first","affiliation":[{"name":"Technische Universit\u00e4t Berlin"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Max","family":"Heimel","sequence":"additional","affiliation":[{"name":"Snowflake Computing"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sebastian","family":"Bre\u00df","sequence":"additional","affiliation":[{"name":"Technische Universit\u00e4t Berlin and German Research Center for Artificial Intelligence (DFKI)"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Volker","family":"Markl","sequence":"additional","affiliation":[{"name":"Technische Universit\u00e4t Berlin and German Research Center for Artificial Intelligence (DFKI)"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2017,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/303976.303978"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882936"},{"key":"e_1_2_1_3_1","volume-title":"Products and convolutions of gaussian probability density functions","author":"Bromiley P.","year":"2003","unstructured":"P. Bromiley . Products and convolutions of gaussian probability density functions . 2003 . http:\/\/www.tina-vision.net, Internal Report . P. Bromiley. Products and convolutions of gaussian probability density functions. 2003. http:\/\/www.tina-vision.net, Internal Report."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375686"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304206"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/329.318578"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.61"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/235968.233340"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375727"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-003-0090-4"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/182591.182594"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749438"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536360.2536370"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007641"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/119995.115835"},{"key":"e_1_2_1_16_1","unstructured":"S. G. Johnson. The NLopt nonlinear-optimization package. http:\/\/ab-initio.mit.edu\/nlopt. S. G. Johnson. The NLopt nonlinear-optimization package. http:\/\/ab-initio.mit.edu\/nlopt."},{"key":"e_1_2_1_17_1","first-page":"513","volume-title":"EDBT","author":"Kiefer M.","year":"2015","unstructured":"M. Kiefer , M. Heimel , and V. Markl . Demonstrating transfer-efficient sample maintenance on graphics cards . In EDBT , pages 513 -- 516 , 2015 . M. Kiefer, M. Heimel, and V. Markl. Demonstrating transfer-efficient sample maintenance on graphics cards. In EDBT, pages 513--516, 2015."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247502"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/1978665.1978666"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850594"},{"key":"e_1_2_1_21_1","volume-title":"CIDR","author":"Leis V.","year":"2017","unstructured":"V. Leis , B. Radke , A. Gubichev , A. Kemper , and T. Neumann . Cardinality estimation done right: Index-based join sampling . In CIDR , 2017 . V. Leis, B. Radke, A. Gubichev, A. Kemper, and T. Neumann. Cardinality estimation done right: Index-based join sampling. In CIDR, 2017."},{"key":"e_1_2_1_22_1","volume-title":"SIGMOD Blog","author":"Lohman G.","year":"2014","unstructured":"G. Lohman . Is query optimization a \"solved\" problem ? SIGMOD Blog , April 2014 . G. Lohman. Is query optimization a \"solved\" problem? SIGMOD Blog, April 2014."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-006-0030-1"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007642"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687738"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-015-8330-5_4"},{"key":"e_1_2_1_28_1","first-page":"1228","volume-title":"PVLDB","author":"Reddy N.","year":"2005","unstructured":"N. Reddy and J. Haritsa . Analyzing plan diagrams of database query optimizers . In PVLDB , pages 1228 -- 1239 , 2005 . N. Reddy and J. Haritsa. Analyzing plan diagrams of database query optimizers. In PVLDB, pages 1228--1239, 2005."},{"key":"e_1_2_1_29_1","volume-title":"University of Florida","author":"Rusu F.","year":"2009","unstructured":"F. Rusu . Sketches for aggregate estimations over data streams. PhD thesis , University of Florida , 2009 . F. Rusu. Sketches for aggregate estimations over data streams. PhD thesis, University of Florida, 2009."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1386118.1386121"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","DOI":"10.1002\/9781118575574","volume-title":"Multivariate Density Estimation - Theory, Practice, and Visualization","author":"Scott D.","year":"2015","unstructured":"D. Scott . Multivariate Density Estimation - Theory, Practice, and Visualization . John Wiley & Sons , 2 nd edition, 2015 . D. Scott. Multivariate Density Estimation - Theory, Practice, and Visualization. John Wiley & Sons, 2nd edition, 2015.","edition":"2"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/582095.582099"},{"key":"e_1_2_1_33_1","first-page":"19","volume-title":"PVLDB","author":"Stillger M.","year":"2001","unstructured":"M. Stillger , G. Lohman , V. Markl , and M. Kandil . LEO - db2's learning optimizer . In PVLDB , pages 19 -- 28 , 2001 . M. Stillger, G. Lohman, V. Markl, and M. Kandil. LEO - db2's learning optimizer. In PVLDB, pages 19--28, 2001."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/188573.188621"},{"issue":"11","key":"e_1_2_1_35_1","first-page":"852","article-title":"Lightweight graphical models for selectivity estimation without independence assumptions","volume":"4","author":"Tzoumas K.","year":"2011","unstructured":"K. Tzoumas , A. Deshpande , and C. Jensen . Lightweight graphical models for selectivity estimation without independence assumptions . VLDB , 4 ( 11 ): 852 -- 863 , 2011 . K. Tzoumas, A. Deshpande, and C. Jensen. Lightweight graphical models for selectivity estimation without independence assumptions. VLDB, 4(11):852--863, 2011.","journal-title":"VLDB"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824051"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3147.3165"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3151106.3151112","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,27]],"date-time":"2024-06-27T21:54:39Z","timestamp":1719525279000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3151106.3151112"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,9]]},"references-count":36,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2017,9]]}},"alternative-id":["10.14778\/3151106.3151112"],"URL":"https:\/\/doi.org\/10.14778\/3151106.3151112","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2017,9]]}}}