{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T05:37:47Z","timestamp":1768109867711,"version":"3.49.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,3,12]],"date-time":"2024-03-12T00:00:00Z","timestamp":1710201600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,3,12]]},"abstract":"<jats:p>Modern database systems offer index-tuning advisors that automatically identify a set of indexes to improve workload performance. Advisors leverage the optimizer's what-if API to optimize a query for a hypothetical index configuration. Because what-if calls constitute a major bottleneck of index tuning, existing techniques, such as workload compression, help reduce the number of what-if calls to speed up tuning. Unfortunately, even with small workloads and few what-if calls, tuning can still take hours due to the complexity of the queries (e.g., the number of joins, filters, group-by and order-by clauses), which increases their optimization time. This paper introduces workload reduction, a new complementary technique aimed at expediting index tuning by decreasing individual what-if call time without significantly affecting the quality of index tuning. We present an efficient workload reduction algorithm, called Wred, which rewrites each query in the original workload to eliminate column and table expressions unlikely to benefit from indexes, thereby accelerating what-if calls. We study its complexity and ability to maintain high index quality. We perform an extensive evaluation over industry benchmarks and real-world customer workloads, which shows that Wred results in a 3x median speedup in tuning efficiency over an industrial-strength state-of-the-art index advisor, with only a 3.7% median loss in improvement---where improvement is the total workload cost as estimated by the query optimizer---and results in up to 24.7x speedup with 1.8% improvement loss. Furthermore, combining Wred and Isum (a state-of-the-art workload compression technique for index tuning) results in higher speedups than either of the two techniques alone, with 10.5x median speedup and 5% median improvement loss.<\/jats:p>","DOI":"10.1145\/3639305","type":"journal-article","created":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T18:51:32Z","timestamp":1711479092000},"page":"1-26","source":"Crossref","is-referenced-by-count":7,"title":["Wred: Workload Reduction for Scalable Index Tuning"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2730-6432","authenticated-orcid":false,"given":"Matteo","family":"Brucato","sequence":"first","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-0866-7275","authenticated-orcid":false,"given":"Tarique","family":"Siddiqui","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2454-7109","authenticated-orcid":false,"given":"Wentao","family":"Wu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7011-7886","authenticated-orcid":false,"given":"Vivek","family":"Narasayya","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8252-5270","authenticated-orcid":false,"given":"Surajit","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,3,26]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066292"},{"key":"e_1_2_1_2_1","first-page":"496","article-title":"Automated selection of materialized views and indexes in SQL databases","volume":"2000","author":"Agrawal Sanjay","year":"2000","unstructured":"Sanjay Agrawal, Surajit Chaudhuri, and Vivek R Narasayya. 2000. Automated selection of materialized views and indexes in SQL databases. In VLDB, Vol. 2000. 496--505.","journal-title":"VLDB"},{"key":"e_1_2_1_3_1","volume-title":"Wred: Workload Reduction for Scalable Index Tuning (extended version). https:\/\/matteo-brucato.github.io\/files\/wred_extended.","author":"Brucato Matteo","year":"2024","unstructured":"Matteo Brucato, Tarique Siddiqui, Wentao Wu, Vivek Narasayya, and Surajit Chaudhuri. 2024. Wred: Workload Reduction for Scalable Index Tuning (extended version). https:\/\/matteo-brucato.github.io\/files\/wred_extended."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD. 227--238.","DOI":"10.1145\/1066157.1066184"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 32nd international conference on Very large data bases. Citeseer, 499--510","author":"Bruno Nicolas","year":"2006","unstructured":"Nicolas Bruno and Surajit Chaudhuri. 2006. To tune or not to tune? A Lightweight Physical Design Alerter. In Proceedings of the 32nd international conference on Very large data bases. Citeseer, 499--510."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-45012-2_6"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564747"},{"key":"e_1_2_1_8_1","volume-title":"Ashish Kumar Gupta, and Vivek R. Narasayya","author":"Chaudhuri Surajit","year":"2002","unstructured":"Surajit Chaudhuri, Ashish Kumar Gupta, and Vivek R. Narasayya. 2002. Compressing SQL workloads. In SIGMOD. 488--499."},{"key":"e_1_2_1_9_1","volume-title":"Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server. (June","author":"Chaudhuri Surajit","year":"2020","unstructured":"Surajit Chaudhuri and Vivek Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server. (June 2020). https:\/\/www.microsoft.com\/en-us\/research\/publication\/anytime-algorithm-of-database-tuning-advisor-for-microsoft-sql-server\/"},{"key":"e_1_2_1_10_1","volume-title":"Narasayya","author":"Chaudhuri Surajit","year":"1997","unstructured":"Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB 1997. 146--155. http:\/\/www.vldb.org\/conf\/1997\/P146.PDF"},{"key":"e_1_2_1_11_1","volume-title":"Narasayya","author":"Chaudhuri Surajit","year":"1998","unstructured":"Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin 'What-if' Index Analysis Utility. In SIGMOD. 367--378."},{"key":"e_1_2_1_12_1","volume-title":"Comprehensive and efficient workload compression. arXiv preprint arXiv:2011.05549","author":"Deep Shaleen","year":"2020","unstructured":"Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey Naughton, and Stratis Viglas. 2020. Comprehensive and efficient workload compression. arXiv preprint arXiv:2011.05549 (2020)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/3430915.3430931"},{"key":"e_1_2_1_14_1","volume-title":"Narasayya","author":"Ding Bailu","year":"2019","unstructured":"Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD. 1241--1258."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3193548"},{"key":"e_1_2_1_16_1","volume-title":"Natural SQL: Making SQL easier to infer from natural language specifications. arXiv preprint arXiv:2109.05153","author":"Gan Yujian","year":"2021","unstructured":"Yujian Gan, Xinyun Chen, Jinxia Xie, Matthew Purver, John R Woodward, John Drake, and Qiaofu Zhang. 2021. Natural SQL: Making SQL easier to infer from natural language specifications. arXiv preprint arXiv:2109.05153 (2021)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-155860869-6\/50024-X"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/3503585.3503586"},{"key":"e_1_2_1_19_1","volume-title":"Query2Vec: An Evaluation of NLP Techniques for Generalized Workload Analytics. arXiv preprint arXiv:1801.05613","author":"Jain Shrainik","year":"2018","unstructured":"Shrainik Jain, Bill Howe, Jiaqi Yan, and Thierry Cruanes. 2018. Query2Vec: An Evaluation of NLP Techniques for Generalized Workload Analytics. arXiv preprint arXiv:1801.05613 (2018)."},{"key":"e_1_2_1_20_1","unstructured":"Andrew Kane. 2017. The automatic indexer for Postgres. https:\/\/github.com\/ankane\/dexter."},{"key":"e_1_2_1_21_1","volume-title":"the Automatic Indexer for Postgres. (June","author":"Kane Andrew","year":"2017","unstructured":"Andrew Kane. 2017. Introducing Dexter, the Automatic Indexer for Postgres. (June 2017). https:\/\/medium.com\/@ankane\/introducing-dexter-the-automatic-indexer-for-postgres-5f8fa8b28f27 last accessed November 2022."},{"key":"e_1_2_1_22_1","unstructured":"Andrew Kane. 2017. Introducing Dexter the Automatic Indexer for Postgres. https:\/\/medium.com\/@ankane\/introducing-dexter-the-automatic-indexer-for-postgres-5f8fa8b28f27."},{"key":"e_1_2_1_23_1","unstructured":"Richard M Karp. 1972. Reducibility among combinatorial problems Complexity of computer computations (RE Miller and JW Thatcher editors)."},{"key":"e_1_2_1_24_1","volume-title":"Reducibility among combinatorial problems","author":"Karp Richard M","unstructured":"Richard M Karp. 2010. Reducibility among combinatorial problems. Springer."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/3401960.3401970"},{"key":"e_1_2_1_26_1","first-page":"2382","article-title":"Magic Mirror in My Hand, Which is The Best in The Land? An Experimental Evaluation of Index Selection Algorithms","volume":"13","author":"Kossmann Jan","year":"2020","unstructured":"Jan Kossmann, Stefan Halfpap, Marcel Jankrift, and Rainer Schlosser. 2020. Magic Mirror in My Hand, Which is The Best in The Land? An Experimental Evaluation of Index Selection Algorithms. VLDB 13, 12 (2020), 2382--2395. http:\/\/www.vldb.org\/pvldb\/vol13\/p2382-kossmann.pdf","journal-title":"VLDB"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2872518.2888608"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831369"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350269"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 33rd international conference on Very large data bases. 1093--1104","author":"Papadomanolakis Stratos","year":"2007","unstructured":"Stratos Papadomanolakis, Debabrata Dash, and Anastasia Ailamaki. 2007. Efficient use of the query optimizer for automated physical design. In Proceedings of the 33rd international conference on Very large data bases. 1093--1104."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304222"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/3503585.3503600"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCSW.2011.20"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526152"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3547305.3547309"},{"key":"e_1_2_1_36_1","unstructured":"The TPC Benchmark?DS 2023. http:\/\/www.tpc.org\/tpcds\/."},{"key":"e_1_2_1_37_1","unstructured":"The TPC Benchmark?H 2023. http:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2000.839397"},{"key":"e_1_2_1_39_1","volume-title":"Stitcher: Learned Workload Synthesis from Historical Performance Footprints.","author":"Wan Chengcheng","year":"2023","unstructured":"Chengcheng Wan, Yiwen Zhu, Joyce Cahoon, Wenjing Wang, Katherine Lin, Sean Liu, Raymond Truong, Neetu Singh, Alexandra Ciortea, Konstantinos Karanasos, et al. 2023. Stitcher: Learned Workload Synthesis from Historical Performance Footprints. (2023)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3058738"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536206.2536219"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544899"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526128"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133887"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.14778\/3485450.3485456"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639305","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3639305","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T15:13:10Z","timestamp":1755789190000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639305"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,12]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3,12]]}},"alternative-id":["10.1145\/3639305"],"URL":"https:\/\/doi.org\/10.1145\/3639305","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,12]]}}}