{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,18]],"date-time":"2025-09-18T10:09:16Z","timestamp":1758190156591,"version":"3.44.0"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,8]]},"abstract":"<jats:p>This paper introduces DIAMetrics: a novel framework for end-to-end benchmarking and performance monitoring of query engines. DIAMetrics consists of a number of components supporting tasks such as automated workload summarization, data anonymization, benchmark execution, monitoring, regression identification, and alerting. The architecture of DIAMetrics is highly modular and supports multiple systems by abstracting their implementation details and relying on common canonical formats and pluggable software drivers. The end result is a powerful unified framework that is capable of supporting every aspect of benchmarking production systems and workloads. DIAMetrics has been developed in Google and is being used to benchmark a number of internal query engines. In this paper, we give an overview of DIAMetrics and discuss its design and implementation. Furthermore, we provide details about its deployment and example use cases. Given the variety of supported systems and use cases within Google, we argue that its core concepts can be used more widely to enable comparative end-to-end benchmarking in other industrial environments.<\/jats:p>","DOI":"10.14778\/3415478.3415551","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T18:46:46Z","timestamp":1600109206000},"page":"3285-3298","source":"Crossref","is-referenced-by-count":12,"title":["DIAMetrics"],"prefix":"10.14778","volume":"13","author":[{"given":"Shaleen","family":"Deep","sequence":"first","affiliation":[{"name":"University of Wisconsin-Madison"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anja","family":"Gruenheid","sequence":"additional","affiliation":[{"name":"Google Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kruthi","family":"Nagaraj","sequence":"additional","affiliation":[{"name":"Google Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hiro","family":"Naito","sequence":"additional","affiliation":[{"name":"Google Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeff","family":"Naughton","sequence":"additional","affiliation":[{"name":"Google Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stratis","family":"Viglas","sequence":"additional","affiliation":[{"name":"Google Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056103"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/645911.673616"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/48751.48770"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-04936-6_5"},{"key":"e_1_2_1_5_1","volume-title":"TPCTC 2017","author":"Boncz P. A.","year":"2017","unstructured":"P. A. Boncz, A. Anatiotis, and S. Kl\u00e4be. JCC-H: adding join crossing correlations with skew to TPC-H. In Performance Evaluation and Benchmarking for the Analytics Era - 9th TPC Technology Conference, TPCTC 2017, Munich, Germany, August 28, 2017, Revised Selected Papers, pages 103--119, 2017."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/170035.170041"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/253260.253283"},{"key":"e_1_2_1_8_1","volume-title":"Data Works Summit","author":"Chattopadhyay B.","year":"2018","unstructured":"B. Chattopadhyay, P. Dutta, W. Liu, A. Mccormick, A. Mokashi, O. Tinn, N. McKay, S. Mittal, H. ching Lee, X. Zhao, N. Mikhaylin, P. Harvey, V. Lychagina, T. Xu, B. Elliott, H. Gonzalez, L. Perez, F. Shahmohammadi, D. Lomax, and A. Zheng. Procella: A fast versatile SQL query engine powering data at YouTube. Data Works Summit, 2018."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564747"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/645923.673646"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_12_1","volume-title":"OSDI","author":"Corbett J. C.","year":"2012","unstructured":"J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, D. Woodford, Y. Saito, C. Taylor, M. Szymaniak, and R. Wang. Spanner: Google's globally-distributed database. In OSDI, 2012."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32627-1_10"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"S. Deep A. Gruenheid J. Naughton S. Viglas and P. Koutris. Comprehensive and efficient workload compression. http:\/\/pages.cs.wisc.edu\/~shaleen\/drafts\/compression_fullversion.pdf 2020.","DOI":"10.14778\/3430915.3430931"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2014.6818330"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376732"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945450"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2445583.2445586"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732977.2732999"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 11th International Workshop on Quality in Databases, QDB 2016, at the VLDB 2016 conference","author":"Jain S.","year":"2016","unstructured":"S. Jain and B. Howe. Data cleaning in the wild: Reusable curation idioms from a multi-year sql workload. In Proceedings of the 11th International Workshop on Quality in Databases, QDB 2016, at the VLDB 2016 conference, New Delhi, India, September 5, 2016, 2016."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920886"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749454"},{"key":"e_1_2_1_23_1","volume-title":"Google Cloud Blog","author":"Pasumanskyl M.","year":"2016","unstructured":"M. Pasumanskyl. Inside capacitor, bigquery's next-generation columnar storage format. In Google Cloud Blog, 2016."},{"issue":"12","key":"e_1_2_1_24_1","first-page":"1835","article-title":"F1 query: Declarative querying at scale","volume":"11","author":"Samwel B.","year":"2018","unstructured":"B. Samwel, J. Cieslewicz, B. Handy, J. Govig, P. Venetis, C. Yang, K. Peters, J. Shute, D. Tenedorio, H. Apte, F. Weigel, D. Wilhite, J. Yang, J. Xu, J. Li, Z. Yuan, C. Chasseur, Q. Zeng, I. Rae, A. Biyani, A. Harn, Y. Xia, A. Gubichev, A. El-Helw, O. Erling, Z. Yan, M. Yang, Y. Wei, T. Do, C. Zheng, G. Graefe, S. Sardashti, A. M. Aly, D. Agrawal, A. Gupta, and S. Venkataraman. F1 query: Declarative querying at scale. PVLDB, 11(12):1835--1848, 2018.","journal-title":"PVLDB"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536222.2536232"},{"key":"e_1_2_1_26_1","unstructured":"Transaction Processing Performance Council. TPC Benchmark C 2010."},{"key":"e_1_2_1_27_1","unstructured":"Transaction Processing Performance Council. TPC Benchmark H (decision support) 2017."},{"key":"e_1_2_1_28_1","volume-title":"TPC Benchmark DS","author":"Transaction Processing Performance Council","year":"2018","unstructured":"Transaction Processing Performance Council. TPC Benchmark DS, 2018."},{"key":"e_1_2_1_29_1","volume-title":"Oracle's SQL Performance Analyzer","author":"Yagoub K.","year":"2008","unstructured":"K. Yagoub, P. Belknap, B. Dageville, K. Dias, S. Joshi, and H. Yu. Oracle's SQL Performance Analyzer, 2008."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209950.3209958"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.129222"},{"issue":"8","key":"e_1_2_1_32_1","first-page":"889","article-title":"Looking ahead makes query plans robust: Making the initial case with in-memory star schema data warehouse workloads","volume":"10","author":"Zhu J.","year":"2017","unstructured":"J. Zhu, N. Potti, S. Saurabh, and J. M. Patel. Looking ahead makes query plans robust: Making the initial case with in-memory star schema data warehouse workloads. PVLDB, 10(8):889--900, 2017.","journal-title":"PVLDB"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3415478.3415551","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T02:40:19Z","timestamp":1758076819000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3415478.3415551"}},"subtitle":["benchmarking query engines at scale"],"short-title":[],"issued":{"date-parts":[[2020,8]]},"references-count":32,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,8]]}},"alternative-id":["10.14778\/3415478.3415551"],"URL":"https:\/\/doi.org\/10.14778\/3415478.3415551","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2020,8]]}}}