{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T16:46:06Z","timestamp":1771519566441,"version":"3.50.1"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>As analytic (OLAP) applications move to the cloud, DBMSs have shifted from employing a pure shared-nothing design with locally attached storage to a hybrid design that combines the use of shared-storage (e.g., AWS S3) with the use of shared-nothing query execution mechanisms. This paper sheds light on the resulting tradeoffs, which have not been properly identified in previous work. To this end, it evaluates the TPC-H benchmark across a variety of DBMS offerings running in a cloud environment (AWS) on fast 10Gb+ networks, specifically database-as-a-service offerings (Redshift, Athena), query engines (Presto, Hive), and a traditional cloud agnostic OLAP database (Vertica). While these comparisons cannot be apples-to-apples in all cases due to cloud configuration restrictions, we nonetheless identify patterns and design choices that are advantageous. 
These include prioritizing low-cost object stores like S3 for data storage, using system agnostic yet still performant columnar formats like ORC that allow easy switching to other systems for different workloads, and making features that benefit subsequent runs like query precompilation and caching remote data to faster storage optional rather than required because they disadvantage ad hoc queries.<\/jats:p>","DOI":"10.14778\/3352063.3352133","type":"journal-article","created":{"date-parts":[[2019,9,18]],"date-time":"2019-09-18T18:36:11Z","timestamp":1568831771000},"page":"2170-2182","source":"Crossref","is-referenced-by-count":29,"title":["Choosing a cloud DBMS"],"prefix":"10.14778","volume":"12","author":[{"given":"Junjay","family":"Tan","sequence":"first","affiliation":[{"name":"Brown University"}]},{"given":"Thanaa","family":"Ghanem","sequence":"additional","affiliation":[{"name":"Metropolitan State University"}]},{"given":"Matthew","family":"Perron","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}]},{"given":"Xiangyao","family":"Yu","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}]},{"given":"Michael","family":"Stonebraker","sequence":"additional","affiliation":[{"name":"MIT CSAIL and Tamr, Inc."}]},{"given":"David","family":"DeWitt","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}]},{"given":"Marco","family":"Serafini","sequence":"additional","affiliation":[{"name":"University of Massachusetts"}]},{"given":"Ashraf","family":"Aboulnaga","sequence":"additional","affiliation":[{"name":"HBKU"}]},{"given":"Tim","family":"Kraska","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}]}],"member":"320","published-online":{"date-parts":[[2019,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2509413.2509416"},{"key":"e_1_2_1_2_1","volume-title":"Factors affecting query performance","author":"WS.","year":"2012","unstructured":"AWS. 
Redshift documentation: Factors affecting query performance, 2012. https:\/\/docs.aws.amazon.com\/redshift\/latest\/dg\/c-query-performance.html, Last accessed 2018-06-15."},{"key":"e_1_2_1_3_1","unstructured":"AWS. Cluster configuration guidelines and best practices 2019. https:\/\/docs.aws.amazon.com\/emr\/latest\/ManagementGuide\/emr-plan-instances-guidelines.html Last accessed 2019-02-01."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-69035-3_22"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1594156.1594168"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2903741"},{"key":"e_1_2_1_8_1","volume-title":"Benchmarking Big Data SQL Platforms in the Cloud: TPC-DS benchmarks demonstrate Databricks Runtime 3.0's superior performance","year":"2017","unstructured":"Databricks. Benchmarking Big Data SQL Platforms in the Cloud: TPC-DS benchmarks demonstrate Databricks Runtime 3.0's superior performance, 2017. 
https:\/\/databricks.com\/blog\/2017\/07\/12\/benchmarking-big-data-sql-platforms-in-the-cloud.html, Last accessed 2018-07-15."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10515-013-0138-7"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742795"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732977.2732980"},{"key":"e_1_2_1_12_1","volume-title":"Apache Tez: Overview","year":"2018","unstructured":"Hortonworks. Apache Tez: Overview, 2018. https:\/\/hortonworks.com\/apache\/tez\/, Last accessed 2018-08-01."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807231"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367518"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2011.80"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002938.2002940"},{"key":"e_1_2_1_17_1","volume-title":"Bloomberg","author":"Nix N.","year":"2018","unstructured":"N. Nix. CIA tech official calls Amazon cloud project 'transformational'. Bloomberg, June 2018. https:\/\/www.bloomberg.com\/news\/articles\/2018-06-20\/cia-tech-official-calls-amazon-cloud-project-transformational, Last accessed 2018-10-01."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920902"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00196"},{"key":"e_1_2_1_20_1","volume-title":"Why we chose Redshift","author":"Shiu A.","year":"2015","unstructured":"A. Shiu. Why we chose Redshift, 2015. 
https:\/\/amplitude.com\/blog\/2015\/03\/27\/why-we-chose-redshift, Last accessed 2018-11-05."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/IC2E.2014.97"},{"key":"e_1_2_1_22_1","volume-title":"Even faster: Data at the speed of Presto ORC","author":"Sundstrom D.","year":"2015","unstructured":"D. Sundstrom. Even faster: Data at the speed of Presto ORC, 2015. https:\/\/code.fb.com\/core-data\/even-faster-data-at-the-speed-of-presto-orc\/, Last accessed 2018-04-15."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196938"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CloudCom.2014.28"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/IC2E.2016.28"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056101"},{"key":"e_1_2_1_28_1","unstructured":"Vertica. Configuring storage (documentation). 
https:\/\/www.vertica.com\/docs\/9.1.x\/HTML\/index.htm#Authoring\/UsingVerticaOnAWS\/ConfiguringStorage.htm Last accessed 2019-01-13."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3352063.3352133","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:33:52Z","timestamp":1672223632000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3352063.3352133"}},"subtitle":["architectures and tradeoffs"],"short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":28,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2019,8]]}},"alternative-id":["10.14778\/3352063.3352133"],"URL":"https:\/\/doi.org\/10.14778\/3352063.3352133","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2019,8]]}}}