{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:00:44Z","timestamp":1758268844056},"reference-count":7,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>Interactive data analytics is often inundated with common computations across multiple queries. These redundancies result in poor query performance and higher overall cost for the interactive query sessions. Obviously, reusing these common computations could lead to cost savings. However, it is difficult for the users to manually detect and reuse the common computations in their fast-moving interactive sessions. In this paper, we propose to demonstrate SparkCruise, a computation reuse system that automatically selects the most useful common computations to materialize based on the past query workload. SparkCruise materializes these computations as part of query processing, so the users can continue with their query processing just as before and computation reuse is automatically applied in the background --- all without any modifications to the Spark code. We will invite the audience to play with several scenarios, such as workload redundancy insights and pay-as-you-go materialization, highlighting the utility of SparkCruise.<\/jats:p>","DOI":"10.14778\/3352063.3352082","type":"journal-article","created":{"date-parts":[[2019,9,18]],"date-time":"2019-09-18T18:36:11Z","timestamp":1568831771000},"page":"1850-1853","source":"Crossref","is-referenced-by-count":7,"title":["SparkCruise"],"prefix":"10.14778","volume":"12","author":[{"given":"Abhishek","family":"Roy","sequence":"first","affiliation":[{"name":"Microsoft"}]},{"given":"Alekh","family":"Jindal","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Hiren","family":"Patel","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Ashit","family":"Gosalia","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Subru","family":"Krishnan","sequence":"additional","affiliation":[{"name":"Microsoft"}]},{"given":"Carlo","family":"Curino","sequence":"additional","affiliation":[{"name":"Microsoft"}]}],"member":"320","published-online":{"date-parts":[[2019,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Azure HDInsight. https:\/\/azure.microsoft.com\/en-us\/services\/hdinsight\/. Accessed: 2019-03-06."},{"key":"e_1_2_1_2_1","unstructured":"Spark SQL performance tests. https:\/\/github.com\/databricks\/spark-sql-perf. Accessed: 2019-03-06."},{"key":"e_1_2_1_3_1","unstructured":"S. Agarwal. {SPARK-18127} Add hooks and extension points to Spark. https:\/\/issues.apache.org\/jira\/browse\/SPARK-18127, April 2017. Accessed: 2019-03-06."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/2480856"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192971"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3190656"},{"key":"e_1_2_1_7_1","volume-title":"DARLI-AP Workshop","author":"Michiardi P.","year":"2019"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3352063.3352082","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:43:20Z","timestamp":1672224200000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3352063.3352082"}},"subtitle":["handsfree computation reuse in Spark"],"short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":7,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2019,8]]}},"alternative-id":["10.14778\/3352063.3352082"],"URL":"https:\/\/doi.org\/10.14778\/3352063.3352082","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2019,8]]}}}