{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:19:22Z","timestamp":1750220362613,"version":"3.41.0"},"reference-count":26,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,6,15]],"date-time":"2021-06-15T00:00:00Z","timestamp":1623715200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMOD Rec."],"published-print":{"date-parts":[[2021,6,15]]},"abstract":"<jats:p>This paper introduces DIAMetrics: a novel framework for end-to-end benchmarking and performance monitoring of query engines. DIAMetrics consists of a number of components supporting tasks such as automated workload summarization, data anonymization, benchmark execution, monitoring, regression identification, and alerting. The architecture of DIAMetrics is highly modular and supports multiple systems by abstracting their implementation details and relying on common canonical formats and pluggable software drivers. The end result is a powerful unified framework that is capable of supporting every aspect of benchmarking production systems and workloads. DIAMetrics has been developed in Google and is being used to benchmark various internal query engines. In this paper, we give an overview of DIAMetrics and discuss its design and implementation. Furthermore, we provide details about its deployment and example use cases. Given the variety of supported systems and use cases within Google, we argue that its core concepts can be used more widely to enable comparative end-to-end benchmarking in other industrial environments.<\/jats:p>","DOI":"10.1145\/3471485.3471492","type":"journal-article","created":{"date-parts":[[2021,6,18]],"date-time":"2021-06-18T05:22:06Z","timestamp":1623993726000},"page":"24-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["DIAMetrics"],"prefix":"10.1145","volume":"50","author":[{"given":"Shaleen","family":"Deep","sequence":"first","affiliation":[{"name":"University of Wisconsin-Madison"}]},{"given":"Anja","family":"Gruenheid","sequence":"additional","affiliation":[{"name":"Google Inc."}]},{"given":"Kruthi","family":"Nagaraj","sequence":"additional","affiliation":[{"name":"Google Inc."}]},{"given":"Hiro","family":"Naito","sequence":"additional","affiliation":[{"name":"Google Inc."}]},{"given":"Jeff","family":"Naughton","sequence":"additional","affiliation":[{"name":"Google Inc."}]},{"given":"Stratis","family":"Viglas","sequence":"additional","affiliation":[{"name":"Google Inc."}]}],"member":"320","published-online":{"date-parts":[[2021,6,17]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056103"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/645911.673616"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-04936-6_5"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/170036.170041"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/253262.253283"},{"key":"e_1_2_1_6_1","volume-title":"Data Works Summit","author":"Chattopadhyay B.","year":"2018","unstructured":"B. Chattopadhyay , P. Dutta , W. Liu , A. Mccormick , A. Mokashi , O. Tinn , N. McKay , S. Mittal , H. ching Lee , X. Zhao , N. Mikhaylin , P. Harvey , V. Lychagina , T. Xu , B. Elliott , H. Gonzalez , L. Perez , F. Shahmohammadi , D. Lomax , and A. Zheng . Procella: A fast versatile SQL query engine powering data at YouTube . Data Works Summit , 2018 . B. Chattopadhyay, P. Dutta, W. Liu, A. Mccormick, A. Mokashi, O. Tinn, N. McKay, S. Mittal, H. ching Lee, X. Zhao, N. Mikhaylin, P. Harvey, V. Lychagina, T. Xu, B. Elliott, H. Gonzalez, L. Perez, F. Shahmohammadi, D. Lomax, and A. Zheng. Procella: A fast versatile SQL query engine powering data at YouTube. Data Works Summit, 2018."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564747"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/645923.673646"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32627-1_10"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/3430915.3442439"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2014.6818330"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376732"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2445583.2445586"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732977.2732999"},{"key":"e_1_2_1_16_1","volume-title":"QDB","author":"Jain S.","year":"2016","unstructured":"S. Jain and B. Howe . Data cleaning in the wild: Reusable curation idioms from a multi-year sql workload . In QDB , 2016 . S. Jain and B. Howe. Data cleaning in the wild: Reusable curation idioms from a multi-year sql workload. In QDB, 2016."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920886"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749454"},{"key":"e_1_2_1_19_1","volume-title":"Google Cloud Blog","author":"Pasumanskyl M.","year":"2016","unstructured":"M. Pasumanskyl . Inside capacitor, bigquery's next-generation columnar storage format . In Google Cloud Blog , 2016 . M. Pasumanskyl. Inside capacitor, bigquery's next-generation columnar storage format. In Google Cloud Blog, 2016."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3229871"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536222.2536232"},{"key":"e_1_2_1_22_1","unstructured":"Transaction Processing Performance Council. TPC Benchmark H (decision support) 2017.  Transaction Processing Performance Council. TPC Benchmark H (decision support) 2017."},{"key":"e_1_2_1_23_1","volume-title":"Oracle's SQL Performance Analyzer","author":"Yagoub K.","year":"2008","unstructured":"K. Yagoub , P. Belknap , B. Dageville , K. Dias , S. Joshi , and H. Yu . Oracle's SQL Performance Analyzer , 2008 . K. Yagoub, P. Belknap, B. Dageville, K. Dias, S. Joshi, and H. Yu. Oracle's SQL Performance Analyzer, 2008."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209950.3209958"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.129222"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/3090163.3090167"}],"container-title":["ACM SIGMOD Record"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3471485.3471492","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3471485.3471492","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:26Z","timestamp":1750191446000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3471485.3471492"}},"subtitle":["Benchmarking Query Engines at Scale"],"short-title":[],"issued":{"date-parts":[[2021,6,15]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,6,15]]}},"alternative-id":["10.1145\/3471485.3471492"],"URL":"https:\/\/doi.org\/10.1145\/3471485.3471492","relation":{},"ISSN":["0163-5808"],"issn-type":[{"type":"print","value":"0163-5808"}],"subject":[],"published":{"date-parts":[[2021,6,15]]},"assertion":[{"value":"2021-06-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}