{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T02:36:55Z","timestamp":1774579015077,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":27,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T00:00:00Z","timestamp":1597881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,8,23]]},"DOI":"10.1145\/3394486.3403299","type":"proceedings-article","created":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T23:17:27Z","timestamp":1597965447000},"page":"2494-2504","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":39,"title":["To Tune or Not to Tune?"],"prefix":"10.1145","author":[{"given":"Ayat","family":"Fekry","sequence":"first","affiliation":[{"name":"University of Cambridge, Cambridge, United Kingdom"}]},{"given":"Lucian","family":"Carata","sequence":"additional","affiliation":[{"name":"University of Cambridge, Cambridge, United Kingdom"}]},{"given":"Thomas","family":"Pasquier","sequence":"additional","affiliation":[{"name":"University of Bristol, Bristol, United Kingdom"}]},{"given":"Andrew","family":"Rice","sequence":"additional","affiliation":[{"name":"University of Cambridge, Cambridge, United Kingdom"}]},{"given":"Andy","family":"Hopper","sequence":"additional","affiliation":[{"name":"University of Cambridge, Cambridge, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2020,8,20]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"TPC-H SQL benchmark 2014. 
http:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_3_2_1_2_1","volume-title":"fast and general engine for large-scale data processing","author":"Spark Apache","year":"2015","unstructured":"Apache Spark: fast and general engine for large-scale data processing, 2015. https:\/\/spark.apache.org\/."},{"key":"e_1_3_2_1_3_1","unstructured":"Amazon EC2 instance Pricing 2018. https:\/\/aws.amazon.com\/ec2\/pricing\/on-demand\/."},{"key":"e_1_3_2_1_4_1","unstructured":"Hadoop distributed file system 2018. https:\/\/hadoop.apache.org\/docs\/r1.2.1\/hdfs_design.html."},{"key":"e_1_3_2_1_5_1","volume-title":"Experiment data repository","author":"Tuneful","year":"2020","unstructured":"Tuneful: Experiment data repository, 2020. https:\/\/github.com\/ayat-khairy\/tuneful-data.git."},{"key":"e_1_3_2_1_6_1","volume-title":"project repository","author":"Tuneful","year":"2020","unstructured":"Tuneful: project repository, 2020. https:\/\/github.com\/ayat-khairy\/tuneful-code.git."},{"key":"e_1_3_2_1_7_1","first-page":"4","volume-title":"NSDI","volume":"2","author":"Alipourfard Omid","year":"2017","unstructured":"Omid Alipourfard, Hongqiang Harry Liu, Jianshu Chen, Shivaram Venkataraman, Minlan Yu, and Ming Zhang. 
Cherrypick: Adaptively unearthing the best cloud configurations for big data analytics. In NSDI, volume 2, pages 4--2, 2017."},{"key":"e_1_3_2_1_8_1","first-page":"303","volume-title":"Parallel Architecture and Compilation Techniques (PACT), 2014 23rd International Conference on","author":"Ansel Jason","year":"2014","unstructured":"Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. Opentuner: An extensible framework for program autotuning. In Parallel Architecture and Compilation Techniques (PACT), 2014 23rd International Conference on, pages 303--315. IEEE, 2014."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2015.06.032"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.218"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687767"},{"key":"e_1_3_2_1_12_1","volume-title":"Tuneful: An online significance-aware configuration tuner for big data analytics","author":"Fekry Ayat","year":"2020","unstructured":"Ayat Fekry, Lucian Carata, Thomas Pasquier, Andrew Rice, and Andy Hopper. Tuneful: An online significance-aware configuration tuner for big data analytics, 2020. https:\/\/arxiv.org\/pdf\/2001.08002.pdf."},{"key":"e_1_3_2_1_13_1","volume-title":"Gene selection for cancer classification using support vector machines. 
Machine learning, 46(1--3):389--422","author":"Guyon Isabelle","year":"2002","unstructured":"Isabelle Guyon, Jason Weston, Stephen Barnhill, and Vladimir Vapnik. Gene selection for cancer classification using support vector machines. Machine learning, 46(1--3):389--422, 2002."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2010.5452747"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4842-1004-8_4"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2371536.2371547"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40047-6_42"},{"issue":"3","key":"e_1_3_2_1_18_1","first-page":"18","article-title":"Classification and regression by randomforest","volume":"2","author":"Liaw Andy","year":"2002","unstructured":"Andy Liaw, Matthew Wiener, et al. Classification and regression by randomforest. R news, 2(3):18--22, 2002.","journal-title":"R news"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3135974.3135991"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2015.2494218"},{"key":"e_1_3_2_1_21_1","volume-title":"On quasi-monte carlo integrations. Mathematics and computers in simulation, 47(2--5):103--112","author":"Sobol Ilya M","year":"1998","unstructured":"Ilya M Sobol. On quasi-monte carlo integrations. 
Mathematics and computers in simulation, 47(2--5):103--112, 1998."},{"key":"e_1_3_2_1_22_1","first-page":"2004","volume-title":"Advances in neural information processing systems","author":"Swersky Kevin","year":"2013","unstructured":"Kevin Swersky, Jasper Snoek, and Ryan P Adams. Multi-task bayesian optimization. In Advances in neural information processing systems, pages 2004--2012, 2013."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064029"},{"key":"e_1_3_2_1_24_1","first-page":"586","volume-title":"High Performance Computing and Communications","author":"Wang Guolu","year":"2016","unstructured":"Guolu Wang, Jungang Xu, and Ben He. A novel method for tuning configuration parameters of Spark based on machine learning. In High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC\/SmartCity\/DSS), 2016 IEEE 18th International Conference on, pages 586--593. 
IEEE, 2016."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/0165-4896(81)90018-4"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173162.3173187"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3127479.3128605"}],"event":{"name":"KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Virtual Event CA USA","acronym":"KDD '20","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"]},"container-title":["Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3403299","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394486.3403299","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:48Z","timestamp":1750197708000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3403299"}},"subtitle":["In Search of Optimal Configurations for Data Analytics"],"short-title":[],"issued":{"date-parts":[[2020,8,20]]},"references-count":27,"alternative-id":["10.1145\/3394486.3403299","10.1145\/3394486"],"URL":"https:\/\/doi.org\/10.1145\/3394486.3403299","relation":{},"subject":[],"published":{"date-parts":[[2020,8,20]]},"assertion":[{"value":"2020-08-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}