{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T09:01:18Z","timestamp":1775638878012,"version":"3.50.1"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:p>Many database systems offer index tuning tools that help automatically select appropriate indexes for improving the performance of an input workload. Index tuning is a resource-intensive and time-consuming task requiring expensive optimizer calls for estimating the cost of queries over potential index configurations. In this work, we develop low-overhead techniques that can be leveraged by index tuning tools for reducing a large number of optimizer calls without making changes to the tuning algorithm or to the query optimizer. First, index tuning tools use rule-based techniques to generate a large number of syntactically-relevant indexes; however, a large proportion of such indexes are spurious and do not lead to a significant improvement in the performance of queries. We eliminate such indexes much earlier in the search by leveraging patterns in the workload, without making optimizer calls. Second, we learn cost models that exploit the similarity between query and index configuration pairs in the workload to efficiently estimate the cost of queries over a large number of index configurations using fewer optimizer calls. We perform an extensive evaluation over both real-world and synthetic benchmarks, and show that given the same set of input queries, indexes, and the search algorithm for exploration, our proposed techniques can lead to a median reduction in tuning time of 3X and a maximum of 12X compared to state-of-the-art tuning tools with similar quality of recommended indexes.<\/jats:p>","DOI":"10.14778\/3547305.3547309","type":"journal-article","created":{"date-parts":[[2022,9,7]],"date-time":"2022-09-07T16:09:53Z","timestamp":1662566993000},"page":"2019-2031","source":"Crossref","is-referenced-by-count":23,"title":["DISTILL"],"prefix":"10.14778","volume":"15","author":[{"given":"Tarique","family":"Siddiqui","sequence":"first","affiliation":[{"name":"Microsoft Research"}]},{"given":"Wentao","family":"Wu","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Vivek","family":"Narasayya","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Surajit","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]}],"member":"320","published-online":{"date-parts":[[2022,9,7]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Apr 01 2022. DSB Benchmark. https:\/\/github.com\/microsoft\/dspp-benchmark.  Apr 01 2022. DSB Benchmark. https:\/\/github.com\/microsoft\/dspp-benchmark."},{"key":"e_1_2_1_2_1","unstructured":"Apr 01 2022. DTA utility. https:\/\/docs.microsoft.com\/en-us\/sql\/tools\/dta\/dta-utility?view=sql-server-ver15.  Apr 01 2022. DTA utility. https:\/\/docs.microsoft.com\/en-us\/sql\/tools\/dta\/dta-utility?view=sql-server-ver15."},{"key":"e_1_2_1_3_1","unstructured":"Apr 01 2022. MLPRegressor. https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.neural_network.MLPRegressor.html.  Apr 01 2022. MLPRegressor. https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.neural_network.MLPRegressor.html."},{"key":"e_1_2_1_4_1","unstructured":"Apr 01 2022. Query Store. https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/performance\/monitoring-performance-by-using-the-query-store?view=sql-server-ver15.  Apr 01 2022. Query Store. https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/performance\/monitoring-performance-by-using-the-query-store?view=sql-server-ver15."},{"key":"e_1_2_1_5_1","unstructured":"Apr 01 2022. Statistics. https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/statistics\/statistics?view=sql-server-ver15.  Apr 01 2022. Statistics. https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/statistics\/statistics?view=sql-server-ver15."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066292"},{"key":"e_1_2_1_7_1","volume-title":"Zihong Yuan, Pierre Senellart, and St\u00e9phane Bressan.","author":"Basu Debabrota","year":"2015","unstructured":"Debabrota Basu , Qian Lin , Weidong Chen , Hoang Tam Vo , Zihong Yuan, Pierre Senellart, and St\u00e9phane Bressan. 2015 . Cost-Model Oblivious Database Tuning with Reinforcement Learning. In DEXA. 253--268. Debabrota Basu, Qian Lin, Weidong Chen, Hoang Tam Vo, Zihong Yuan, Pierre Senellart, and St\u00e9phane Bressan. 2015. Cost-Model Oblivious Database Tuning with Reinforcement Learning. In DEXA. 253--268."},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD. 227--238.  Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD. 227--238.","DOI":"10.1145\/1066157.1066184"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 32nd international conference on Very large data bases. Citeseer, 499--510","author":"Bruno Nicolas","year":"2006","unstructured":"Nicolas Bruno and Surajit Chaudhuri . 2006 . To tune or not to tune? A Lightweight Physical Design Alerter . In Proceedings of the 32nd international conference on Very large data bases. Citeseer, 499--510 . Nicolas Bruno and Surajit Chaudhuri. 2006. To tune or not to tune? A Lightweight Physical Design Alerter. In Proceedings of the 32nd international conference on Very large data bases. Citeseer, 499--510."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1292609.1292618"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2004.75"},{"key":"e_1_2_1_12_1","volume-title":"Ashish Kumar Gupta, and Vivek R. Narasayya","author":"Chaudhuri Surajit","year":"2002","unstructured":"Surajit Chaudhuri , Ashish Kumar Gupta, and Vivek R. Narasayya . 2002 . Compressing SQL workloads. In SIGMOD. 488--499. Surajit Chaudhuri, Ashish Kumar Gupta, and Vivek R. Narasayya. 2002. Compressing SQL workloads. In SIGMOD. 488--499."},{"key":"e_1_2_1_13_1","volume-title":"Narasayya","author":"Chaudhuri Surajit","year":"1997","unstructured":"Surajit Chaudhuri and Vivek R . Narasayya . 1997 . An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146--155. Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146--155."},{"key":"e_1_2_1_14_1","volume-title":"Narasayya","author":"Chaudhuri Surajit","year":"1998","unstructured":"Surajit Chaudhuri and Vivek R . Narasayya . 1998 . AutoAdmin 'What-if' Index Analysis Utility. In SIGMOD. 367--378. Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin 'What-if' Index Analysis Utility. In SIGMOD. 367--378."},{"key":"e_1_2_1_15_1","volume-title":"Narasayya","author":"Chaudhuri Surajit","year":"1999","unstructured":"Surajit Chaudhuri and Vivek R . Narasayya . 1999 . Index Merging. In ICDE. Surajit Chaudhuri and Vivek R. Narasayya. 1999. Index Merging. In ICDE."},{"key":"e_1_2_1_16_1","unstructured":"Sudipto Das Miroslav Grbic Igor Ilic Isidora Jovandic Andrija Jovanovic Vivek R. Narasayya Miodrag Radulovic Maja Stikic Gaoxiang Xu and Surajit Chaudhuri. 2019. Automatically Indexing Millions of Databases in Microsoft Azure SQL Database. In SIGMOD. 666--679.  Sudipto Das Miroslav Grbic Igor Ilic Isidora Jovandic Andrija Jovanovic Vivek R. Narasayya Miodrag Radulovic Maja Stikic Gaoxiang Xu and Surajit Chaudhuri. 2019. Automatically Indexing Millions of Databases in Microsoft Azure SQL Database. In SIGMOD. 666--679."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3430915.3430931"},{"key":"e_1_2_1_18_1","volume-title":"Narasayya","author":"Ding Bailu","year":"2019","unstructured":"Bailu Ding , Sudipto Das , Ryan Marcus , Wentao Wu , Surajit Chaudhuri , and Vivek R . Narasayya . 2019 . AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD. 1241--1258. Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD. 1241--1258."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407820"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/3329772.3329780"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/42201.42205"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-155860869-6\/50024-X"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 31st International Conference on Very Large Data Bases. VLDB Endowment. 1228--1239","author":"Haritsa Naveen Reddy","year":"2005","unstructured":"Naveen Reddy Jayant R Haritsa . 2005 . Analyzing plan diagrams of database query optimizers . In Proceedings of the 31st International Conference on Very Large Data Bases. VLDB Endowment. 1228--1239 . Naveen Reddy Jayant R Haritsa. 2005. Analyzing plan diagrams of database query optimizers. In Proceedings of the 31st International Conference on Very Large Data Bases. VLDB Endowment. 1228--1239."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-155860869-6\/50023-8"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357223.3362726"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407832"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Hai Lan Zhifeng Bao and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105--2108.  Hai Lan Zhifeng Bao and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105--2108.","DOI":"10.1145\/3340531.3412106"},{"key":"e_1_2_1_28_1","unstructured":"Stratos Papadomanolakis Debabrata Dash and Anastassia Ailamaki. 2007. Efficient Use of the Query Optimizer for Automated Database Design. ACM.  Stratos Papadomanolakis Debabrata Dash and Anastassia Ailamaki. 2007. Efficient Use of the Query Optimizer for Automated Database Design. ACM."},{"key":"e_1_2_1_29_1","volume-title":"DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. CoRR abs\/2010.09208","author":"Perera Malinga","year":"2020","unstructured":"Malinga Perera , Bastian Oetomo , Benjamin I. P. Rubinstein , and Renata Borovica-Gajic . 2020. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. CoRR abs\/2010.09208 ( 2020 ). Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica-Gajic. 2020. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. CoRR abs\/2010.09208 (2020)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW49219.2020.00035"},{"key":"e_1_2_1_31_1","volume-title":"Felix Martin Schuhknecht, and Jens Dittrich","author":"Sharma Ankur","year":"2018","unstructured":"Ankur Sharma , Felix Martin Schuhknecht, and Jens Dittrich . 2018 . The Case for Automatic Database Administration using Deep Reinforcement Learning. CoRR abs\/1801.05643 (2018). Ankur Sharma, Felix Martin Schuhknecht, and Jens Dittrich. 2018. The Case for Automatic Database Administration using Deep Reinforcement Learning. CoRR abs\/1801.05643 (2018)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3380584"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526152"},{"key":"e_1_2_1_34_1","volume-title":"DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning (MSR Technical Report. To be published at VLDB","author":"Siddiqui Tarique","year":"2022","unstructured":"Tarique Siddiqui , Wentao Wu , Vivek Narasayya , and Surajit Chaudhuri . 2022 . DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning (MSR Technical Report. To be published at VLDB 2022.). https:\/\/www.microsoft.com\/en-us\/research\/publication\/distill\/. Tarique Siddiqui, Wentao Wu, Vivek Narasayya, and Surajit Chaudhuri. 2022. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning (MSR Technical Report. To be published at VLDB 2022.). https:\/\/www.microsoft.com\/en-us\/research\/publication\/distill\/."},{"key":"e_1_2_1_35_1","unstructured":"Gary Valentin Michael Zuliani Daniel C. Zilio Guy M. Lohman and Alan Skelley. 2000. DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes. In ICDE. 101--110.  Gary Valentin Michael Zuliani Daniel C. Zilio Guy M. Lohman and Alan Skelley. 2000. DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes. In ICDE. 101--110."},{"key":"e_1_2_1_36_1","first-page":"434","article-title":"FLAML: a fast and lightweight AutoML Library","volume":"3","author":"Wang Chi","year":"2021","unstructured":"Chi Wang , Qingyun Wu , Markus Weimer , and Erkang Zhu . 2021 . FLAML: a fast and lightweight AutoML Library . Proceedings of Machine Learning and Systems 3 (2021), 434 -- 447 . Chi Wang, Qingyun Wu, Markus Weimer, and Erkang Zhu. 2021. FLAML: a fast and lightweight AutoML Library. Proceedings of Machine Learning and Systems 3 (2021), 434--447.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526128"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3547305.3547309","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:11:03Z","timestamp":1672225863000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3547305.3547309"}},"subtitle":["low-overhead data-driven techniques for filtering and costing indexes for scalable index tuning"],"short-title":[],"issued":{"date-parts":[[2022,6]]},"references-count":37,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["10.14778\/3547305.3547309"],"URL":"https:\/\/doi.org\/10.14778\/3547305.3547309","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2022,6]]}}}