{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,15]],"date-time":"2023-01-15T00:48:53Z","timestamp":1673743733350},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2016,9]]},"abstract":"<jats:p>Turn's online advertising campaigns produce petabytes of data. This data is composed of trillions of events, e.g. impressions, clicks, etc., spanning multiple years. In addition to a timestamp, each event includes hundreds of fields describing the user's attributes, campaign's attributes, attributes of where the ad was served, etc.<\/jats:p>\n          <jats:p>Advertisers need advanced analytics to monitor their running campaigns' performance, as well as to optimize future campaigns. This involves slicing and dicing the data over tens of dimensions over arbitrary time ranges. Many of these queries need to power the web portal to provide reports and dashboards. For an interactive response time, they have to have tens of milliseconds latency. At Turn's scale of operations, no existing system was able to deliver this performance in a cost effective manner.<\/jats:p>\n          <jats:p>Kodiak, a distributed analytical data platform for web-scale high-dimensional data, was built to serve this need. It relies on pre-computations to materialize thousands of views to serve these advanced queries. These views are partitioned and replicated across Kodiak's storage nodes for scalability and reliability. They are system maintained as new events arrive. At query time, the system auto-selects the most suitable view to serve each query.<\/jats:p>\n          <jats:p>Kodiak has been used in production for over a year. It hosts 2490 views for over three petabytes of raw data serving over 200K queries daily. It has median and 99% query latencies of 8 ms and 252 ms respectively. Our experiments show that its query latency is 3 orders of magnitude faster than leading big data platforms on head-to-head comparisons using Turn's query workload. Moreover, Kodiak uses 4 orders of magnitude less resources to run the same workload.<\/jats:p>","DOI":"10.14778\/3007263.3007266","type":"journal-article","created":{"date-parts":[[2016,11,1]],"date-time":"2016-11-01T13:47:47Z","timestamp":1478008067000},"page":"1269-1280","source":"Crossref","is-referenced-by-count":6,"title":["Kodiak"],"prefix":"10.14778","volume":"9","author":[{"given":"Shaosu","family":"Liu","sequence":"first","affiliation":[{"name":"Turn, Inc"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bin","family":"Song","sequence":"additional","affiliation":[{"name":"Turn, Inc"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sriharsha","family":"Gangam","sequence":"additional","affiliation":[{"name":"Turn, Inc"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lawrence","family":"Lo","sequence":"additional","affiliation":[{"name":"Turn, Inc"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Khaled","family":"Elmeleegy","sequence":"additional","affiliation":[{"name":"Turn, Inc"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559866"},{"key":"e_1_2_1_2_1","unstructured":"HBase: the Hadoop database. http:\/\/\/hbase.apache.org\/.  HBase: the Hadoop database. http:\/\/\/hbase.apache.org\/."},{"key":"e_1_2_1_3_1","unstructured":"The Apache Cassandra database. http:\/\/\/cassandra.apache.org\/.  The Apache Cassandra database. http:\/\/\/cassandra.apache.org\/."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_2_1_5_1","volume-title":"CIDR","author":"Behm A.","year":"2015","unstructured":"A. Behm , V. Bittorf , T. Bobrovytsky , C. Ching , A. Choi , J. Erickson , M. Grund , D. Hecht , M. Jacobs , I. Joshi , L. Kuff , D. Kumar , A. Leblang , N. Li , I. Pandis , H. Robinson , D. Rorke , S. Rus , J. Russell , D. Tsirogiannis , S. Wanderman-Milne , and M. Yoder . Impala: A modern, open-source sql engine for hadoop . In CIDR , 2015 . A. Behm, V. Bittorf, T. Bobrovytsky, C. Ching, A. Choi, J. Erickson, M. Grund, D. Hecht, M. Jacobs, I. Joshi, L. Kuff, D. Kumar, A. Leblang, N. Li, I. Pandis, H. Robinson, D. Rorke, S. Rus, J. Russell, D. Tsirogiannis, S. Wanderman-Milne, and M. Yoder. Impala: A modern, open-source sql engine for hadoop. In CIDR, 2015."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1365815.1365816"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1921020"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233364"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454167"},{"key":"e_1_2_1_10_1","first-page":"251","volume-title":"Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12","author":"Corbett J. C.","year":"2012","unstructured":"J. C. Corbett , J. Dean , M. Epstein , A. Fikes , C. Frost , J. J. Furman , S. Ghemawat , A. Gubarev , C. Heiser , P. Hochschild , W. Hsieh , S. Kanthak , E. Kogan , H. Li , A. Lloyd , S. Melnik , D. Mwaura , D. Nagle , S. Quinlan , R. Rao , L. Rolig , Y. Saito , M. Szymaniak , C. Taylor , R. Wang , and D. Woodford . Spanner: Google's globally-distributed database . In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12 , pages 251 -- 264 , Berkeley, CA, USA , 2012 . USENIX Association. J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12, pages 251--264, Berkeley, CA, USA, 2012. USENIX Association."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1294261.1294281"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536222.2536238"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536222.2536225"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"145","DOI":"10.7551\/mitpress\/4472.001.0001","volume-title":"Materialized views","author":"Gupta A.","year":"1999","unstructured":"A. Gupta and I. S. Mumick . Materialized views . chapter Maintenance of Materialized Views: Problems, Techniques, and Applications, pages 145 -- 157 . MIT Press , Cambridge, MA, USA, 1999 . A. Gupta and I. S. Mumick. Materialized views. chapter Maintenance of Materialized Views: Problems, Techniques, and Applications, pages 145--157. MIT Press, Cambridge, MA, USA, 1999."},{"key":"e_1_2_1_15_1","first-page":"11","volume-title":"USENIXATC'10: Proceedings of the 2010 USENIX conference on USENIX annual technical conference","author":"Hunt P.","year":"2010","unstructured":"P. Hunt , M. Konar , F. P. Junqueira , and B. Reed . Zookeeper: wait-free coordination for internet-scale systems . In USENIXATC'10: Proceedings of the 2010 USENIX conference on USENIX annual technical conference , pages 11 -- 11 , 2010 . P. Hunt, M. Konar, F. P. Junqueira, and B. Reed. Zookeeper: wait-free coordination for internet-scale systems. In USENIXATC'10: Proceedings of the 2010 USENIX conference on USENIX annual technical conference, pages 11--11, 2010."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/1182635.1164198"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452378"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/253260.253277"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376726"},{"key":"e_1_2_1_20_1","unstructured":"Oracle. Automatic Storage Management. http:\/\/docs.oracle.com\/cd\/E11882_01\/server.112\/e18951\/asmcon.htm.  Oracle. Automatic Storage Management. http:\/\/docs.oracle.com\/cd\/E11882_01\/server.112\/e18951\/asmcon.htm."},{"key":"e_1_2_1_21_1","unstructured":"Oracle Clusterware. http:\/\/www.oracle.com\/technetwork\/database\/database-technologies\/clusterware\/overview\/index.html.  Oracle Clusterware. http:\/\/www.oracle.com\/technetwork\/database\/database-technologies\/clusterware\/overview\/index.html."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465298"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233361"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335393"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536222.2536232"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigDataService.2015.32"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"e_1_2_1_28_1","unstructured":"M. Traverso. Presto: Interacting with petabytes of data at Facebook. https:\/\/www.facebook.com\/notes\/facebook-engineering\/presto-interacting-with-petabytes-of-data-at-facebook\/10151786197628920.  M. Traverso. Presto: Interacting with petabytes of data at Facebook. https:\/\/www.facebook.com\/notes\/facebook-engineering\/presto-interacting-with-petabytes-of-data-at-facebook\/10151786197628920."},{"key":"e_1_2_1_29_1","unstructured":"Big Data Benchmark - AMPLab. https:\/\/amplab.cs.berkeley.edu\/benchmark.  Big Data Benchmark - AMPLab. https:\/\/amplab.cs.berkeley.edu\/benchmark."},{"key":"e_1_2_1_30_1","volume-title":"VMware","author":"Inc.","year":"2015","unstructured":"VMware, Inc. Tungsten Replicator 3.0 Manual. Technical report , VMware , Inc , 2015 . VMware, Inc. Tungsten Replicator 3.0 Manual. Technical report, VMware, Inc, 2015."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465288"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2595631"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3007263.3007266","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:36:07Z","timestamp":1672220167000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3007263.3007266"}},"subtitle":["leveraging materialized views for very low-latency analytics over high-dimensional web-scale data"],"short-title":[],"issued":{"date-parts":[[2016,9]]},"references-count":32,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2016,9]]}},"alternative-id":["10.14778\/3007263.3007266"],"URL":"https:\/\/doi.org\/10.14778\/3007263.3007266","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2016,9]]}}}