{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T04:41:10Z","timestamp":1773895270045,"version":"3.50.1"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2012,7]]},"abstract":"<jats:p>The ability to estimate resource consumption of SQL queries is crucial for a number of tasks in a database system such as admission control, query scheduling and costing during query optimization. Recent work has explored the use of statistical techniques for resource estimation in place of the manually constructed cost models used in query optimization. Such techniques, which require as training data examples of resource usage in queries, offer the promise of superior estimation accuracy since they can account for factors such as hardware characteristics of the system or bias in cardinality estimates. However, the proposed approaches lack robustness in that they do not generalize well to queries that are different from the training examples, resulting in significant estimation errors. Our approach aims to address this problem by combining knowledge of database query processing with statistical models. We model resource-usage at the level of individual operators, with different models and features for each operator type, and explicitly model the asymptotic behavior of each operator. This results in significantly better estimation accuracy and the ability to estimate resource usage of arbitrary plans, even when they are very different from the training instances. 
We validate our approach using various large scale real-life and benchmark workloads on Microsoft SQL Server.<\/jats:p>","DOI":"10.14778\/2350229.2350269","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1555-1566","source":"Crossref","is-referenced-by-count":85,"title":["Robust estimation of resource consumption for SQL queries using statistical techniques"],"prefix":"10.14778","volume":"5","author":[{"given":"Jiexing","family":"Li","sequence":"first","affiliation":[{"name":"University of Wisconsin - Madison, Madison, WI"}]},{"given":"Arnd Christian","family":"K\u00f6nig","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]},{"given":"Vivek","family":"Narasayya","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]},{"given":"Surajit","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]}],"member":"320","published-online":{"date-parts":[[2012,7]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Decision Tree Learning Bregman Divergences and the Histogram Trick. In submission."},{"key":"e_1_2_1_2_1","unstructured":"Program for TPC-H data generation with Skew. ftp:\/\/ftp.research.microsoft.com\/users\/viveknar\/TPCDSkew\/."},{"key":"e_1_2_1_3_1","unstructured":"TPC-H and TPC-DS Benchmarks. http:\/\/www.tpc.org."},{"key":"e_1_2_1_4_1","unstructured":"WEKA workbench. 
http:\/\/www.cs.waikato.ac.nz\/ml\/weka\/."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304198"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1951365.1951419"},{"key":"e_1_2_1_7_1","first-page":"167","volume-title":"CIDR","author":"Akdere M.","year":"2011","unstructured":"M. Akdere, U. Cetintemel, M. Riondato, E. Upfal, and S. Zdonik. The Case for Predictive Database Systems: Opportunities and Challenges. In CIDR, pages 167--174, 2011."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.64"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375686"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989357"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/645791.668131"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989359"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247598"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1013203451"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2009.130"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/152610.152611"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402707.3402746"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972764.4"},{"key":"e_1_2_1_19_1","first-page":"19","volume-title":"VLDB","author":"Stillger M.","year":"2001","unstructured":"M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's LEarning Optimizer. 
In VLDB, pages 19--28, 2001."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-3264-1","volume-title":"Statistical Learning Theory","author":"Vapnik V.","year":"2000","unstructured":"V. Vapnik. Statistical Learning Theory. Wiley, 2000."},{"key":"e_1_2_1_21_1","volume-title":"Microsoft Research","author":"Wu Q.","year":"2008","unstructured":"Q. Wu, C. J. Burges, K. M. Svore, and J. Gao. Ranking, Boosting, and Model Adaptation. Technical report, Microsoft Research, 2008."},{"key":"e_1_2_1_22_1","first-page":"289","volume-title":"VLDB","author":"Zhang N.","year":"2005","unstructured":"N. Zhang, P. J. Haas, V. Josifovski, G. M. Lohman, and C. Zhang. Statistical Learning Techniques for Costing XML Queries. 
In VLDB, pages 289--300, 2005."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2350229.2350269","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:30:21Z","timestamp":1672227021000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2350229.2350269"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,7]]},"references-count":22,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2012,7]]}},"alternative-id":["10.14778\/2350229.2350269"],"URL":"https:\/\/doi.org\/10.14778\/2350229.2350269","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2012,7]]}}}