{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T22:44:44Z","timestamp":1769208284114,"version":"3.49.0"},"reference-count":21,"publisher":"Association for Computing Machinery (ACM)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2016,9]]},"abstract":"<jats:p>Real-time analytics on massive datasets has become a very common need in many enterprises. These applications require not only rapid data ingest, but also quick answers to analytical queries operating on the latest data. MemSQL is a distributed SQL database designed to exploit memory-optimized, scale-out architecture to enable real-time transactional and analytical workloads which are fast, highly concurrent, and extremely scalable. Many analytical queries in MemSQL's customer workloads are complex queries involving joins, aggregations, sub-queries, etc. over star and snowflake schemas, often ad-hoc or produced interactively by business intelligence tools. These queries often require latencies of seconds or less, and therefore require the optimizer to not only produce a high quality distributed execution plan, but also produce it fast enough so that optimization time does not become a bottleneck.<\/jats:p>\n          <jats:p>In this paper, we describe the architecture of the MemSQL Query Optimizer and the design choices and innovations which enable it to quickly produce highly efficient execution plans for complex distributed queries. We discuss how query rewrite decisions oblivious of distribution cost can lead to poor distributed execution plans, and argue that to choose high-quality plans in a distributed database, the optimizer needs to be distribution-aware in choosing join plans, applying query rewrites, and costing plans. 
We discuss methods to make join enumeration faster and more effective, such as a rewrite-based approach to exploit bushy joins in queries involving multiple star schemas without sacrificing optimization time. We demonstrate the effectiveness of the MemSQL optimizer over queries from the TPC-H benchmark and a real customer workload.<\/jats:p>","DOI":"10.14778\/3007263.3007277","type":"journal-article","created":{"date-parts":[[2016,11,1]],"date-time":"2016-11-01T13:47:47Z","timestamp":1478008067000},"page":"1401-1412","source":"Crossref","is-referenced-by-count":45,"title":["The MemSQL query optimizer"],"prefix":"10.14778","volume":"9","author":[{"given":"Jack","family":"Chen","sequence":"first","affiliation":[{"name":"MemSQL Inc."}]},{"given":"Samir","family":"Jindel","sequence":"additional","affiliation":[{"name":"MemSQL Inc."}]},{"given":"Robert","family":"Walzer","sequence":"additional","affiliation":[{"name":"MemSQL Inc."}]},{"given":"Rajkumar","family":"Sen","sequence":"additional","affiliation":[{"name":"MemSQL Inc."}]},{"given":"Nika","family":"Jimsheleishvilli","sequence":"additional","affiliation":[{"name":"MemSQL Inc."}]},{"given":"Michael","family":"Andrews","sequence":"additional","affiliation":[{"name":"MemSQL Inc."}]}],"member":"320","published-online":{"date-parts":[[2016,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733004.2733017"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447916"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2094114.2094126"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/648311.754892"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453882"},{"key":"e_1_2_1_6_1","volume-title":"A first step towards GPU-assisted query optimization. ADMS@ VLDB","author":"Heimel M.","year":"2012","unstructured":"M. Heimel and V. Markl. A first step towards GPU-assisted query optimization. 
ADMS@ VLDB, 2012:33--44, 2012."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367518"},{"key":"e_1_2_1_8_1","volume-title":"Constructing optimal bushy processing trees for join queries is NP-hard. Technical reports, 96","author":"Moerkotte G.","year":"2004","unstructured":"G. Moerkotte and W. Scheufele. Constructing optimal bushy processing trees for join queries is NP-hard. Technical reports, 96, 2004."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1182635.1164217"},{"key":"e_1_2_1_10_1","first-page":"314","volume-title":"VLDB","author":"Ono K.","year":"1990","unstructured":"K. Ono and G. M. Lohman. Measuring the complexity of join enumeration in query optimization. 
In VLDB, pages 314--325, 1990."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/582095.582099"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2803140.2803148"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/645481.653275"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213953"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/2093889.2093965"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2016.7498332"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2595637"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/66926.66961"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559938"},{"key":"e_1_2_1_20_1","volume-title":"Oracle White Paper","author":"Weiss R.","year":"2012","unstructured":"R. Weiss. A technical overview of the Oracle Exadata database machine and exadata storage server. Oracle White Paper. 
Oracle Corporation, Redwood Shores, 2012."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.148"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3007263.3007277","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:38:01Z","timestamp":1672220281000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3007263.3007277"}},"subtitle":["a modern optimizer for real-time analytics in a distributed database"],"short-title":[],"issued":{"date-parts":[[2016,9]]},"references-count":21,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2016,9]]}},"alternative-id":["10.14778\/3007263.3007277"],"URL":"https:\/\/doi.org\/10.14778\/3007263.3007277","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2016,9]]}}}