{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:19:00Z","timestamp":1759133940509,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":31,"publisher":"ACM","license":[{"start":{"date-parts":[[2016,5,31]],"date-time":"2016-05-31T00:00:00Z","timestamp":1464652800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100008972","name":"U.S. Department of Energy","doi-asserted-by":"publisher","award":["DE-AC02-05CH11231"],"award-info":[{"award-number":["DE-AC02-05CH11231"]}],"id":[{"id":"10.13039\/100008972","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2016,5,31]]},"DOI":"10.1145\/2907294.2907310","type":"proceedings-article","created":{"date-parts":[[2016,6,2]],"date-time":"2016-06-02T19:23:42Z","timestamp":1464895422000},"page":"97-110","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":38,"title":["Scaling Spark on HPC Systems"],"prefix":"10.1145","author":[{"given":"Nicholas","family":"Chaimov","sequence":"first","affiliation":[{"name":"University of Oregon, Eugene, OR, USA"}]},{"given":"Allen","family":"Malony","sequence":"additional","affiliation":[{"name":"University of Oregon, Eugene, OR, USA"}]},{"given":"Shane","family":"Canon","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, USA"}]},{"given":"Costin","family":"Iancu","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, USA"}]},{"given":"Khaled Z.","family":"Ibrahim","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, 
USA"}]},{"given":"Jay","family":"Srinivasan","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, Berkeley, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2016,5,31]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Cori Phase 1. https:\/\/www.nersc.gov\/users\/computational-systems\/cori\/."},{"key":"e_1_3_2_1_2_1","unstructured":"National Energy Research Scientific Computing Center. https:\/\/www.nersc.gov."},{"key":"e_1_3_2_1_3_1","unstructured":"spark-perf benchmark. https:\/\/github.com\/databricks\/spark-perf."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2159352.2159356"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_3_2_1_6_1","volume-title":"Spark Summit","author":"Babu S.","year":"2015","unstructured":"S. Babu and L. Co Ting Keh. Better visibility into spark execution for faster application development. In Spark Summit, 2015."},{"key":"e_1_3_2_1_7_1","unstructured":"P. J. Braam and others. The Lustre storage architecture."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161029"},{"key":"e_1_3_2_1_9_1","volume-title":"HPC computation on Hadoop storage with PLFS","author":"Cranor C.","year":"2012","unstructured":"C. Cranor, M. Polte, and G. Gibson. HPC computation on Hadoop storage with PLFS. 
Technical Report Carnegie Mellon University-PDL-12-115, Carnegie Mellon University, 2012."},{"key":"e_1_3_2_1_10_1","volume-title":"Scalability testing of dne2 in lustre 2.7","author":"Crowe T.","year":"2015","unstructured":"T. Crowe, N. Lavender, and S. Simms. Scalability testing of dne2 in lustre 2.7. In Lustre Users Group, 2015."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_3_2_1_13_1","volume-title":"Lustre User Group Meeting.","author":"Gallegos J. M.","year":"2015","unstructured":"J. M. Gallegos, Z. Tao, and Q. Ta-Dell. Deploying hadoop on lustre storage: Lessons learned and best practices. Lustre User Group Meeting., 2015."},{"key":"e_1_3_2_1_14_1","first-page":"599","volume-title":"Proceedings of OSDI","author":"Gonzalez J. E.","unstructured":"J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. Graphx: Graph processing in a distributed dataflow framework. In Proceedings of OSDI, pages 599--613."},{"key":"e_1_3_2_1_15_1","unstructured":"A. Hadoop. Pluggable Shuffle and Pluggable Sort. 
https:\/\/hadoop.apache.org\/docs\/current\/hadoop-mapreduce-client\/hadoop-mapreduce-client-core\/PluggableShuffleAndPluggableSort.html."},{"key":"e_1_3_2_1_16_1","first-page":"295","volume-title":"Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI'11","author":"Hindman B.","year":"2011","unstructured":"B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI'11, pages 295--308, Berkeley, CA, USA, 2011. USENIX Association."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389044"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the Cray User Group","author":"Jacobsen D. M.","year":"2015","unstructured":"D. M. Jacobsen and R. S. Canon. Contain this, unleashing docker for hpc. Proceedings of the Cray User Group, 2015."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2670979.2670985"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTI.2014.15"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749246.2749269"},{"key":"e_1_3_2_1_22_1","unstructured":"X. 
Meng J. Bradley B. Yavuz E. Sparks S. Venkataraman D. Liu J. Freeman D. B. Tsai M. Amde S. Owen D. Xin R. Xin M. J. Franklin R. Zadeh M. Zaharia and A. Talwalkar. MLlib: Machine learning in apache spark."},{"volume-title":"Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI)","author":"Ousterhout K.","key":"e_1_3_2_1_23_1","unstructured":"K. Ousterhout, R. Rasti, S. Ratnasamy, S. Shenker, B.-G. Chun, and V. ICSI. Making sense of performance in data analytics frameworks. In Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI)(Oakland, CA, pages 293--307."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564759"},{"key":"e_1_3_2_1_25_1","volume-title":"6th International Workshop on Big Data Analytics: Challenges and Opportunities (BDAC-15)","author":"Roman R.-I.","year":"2015","unstructured":"R.-I. Roman, B. Nicolae, A. Costan, and G. Antoniu. Understanding spark performance in hybrid and multi-site clouds. In 6th International Workshop on Big Data Analytics: Challenges and Opportunities (BDAC-15), 2015."},{"volume-title":"The cray framework for hadoop for the cray XC30","author":"Sparks J.","key":"e_1_3_2_1_26_1","unstructured":"J. Sparks, H. Pritchard, and M. Dumler. 
The cray framework for hadoop for the cray XC30."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063474"},{"key":"e_1_3_2_1_28_1","unstructured":"UC Berkeley AmpLab. Big data benchmark."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063461"},{"key":"e_1_3_2_1_30_1","volume-title":"Hadoop: The Definitive Guide","author":"White T.","year":"2009","unstructured":"T. White. Hadoop: The Definitive Guide. O'Reilly Media, Inc., 1st edition, 2009."},{"key":"e_1_3_2_1_31_1","first-page":"2","volume-title":"Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI'12","author":"Zaharia M.","unstructured":"M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI'12, pages 2--2. USENIX Association."},{"key":"e_1_3_2_1_32_1","first-page":"10","volume-title":"Proceedings of the 2nd USENIX conference on Hot topics in cloud computing","volume":"10","author":"Zaharia M.","unstructured":"M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. 
Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, volume 10, page 10."}],"event":{"name":"HPDC'16: The 25th International Symposium on High-Performance Parallel and Distributed Computing","sponsor":["University of Arizona","SIGARCH ACM Special Interest Group on Computer Architecture","SIGHPC ACM Special Interest Group on High Performance Computing"],"location":"Kyoto Japan","acronym":"HPDC'16"},"container-title":["Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2907294.2907310","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2907294.2907310","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:54:26Z","timestamp":1750222466000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2907294.2907310"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,5,31]]},"references-count":31,"alternative-id":["10.1145\/2907294.2907310","10.1145\/2907294"],"URL":"https:\/\/doi.org\/10.1145\/2907294.2907310","relation":{},"subject":[],"published":{"date-parts":[[2016,5,31]]},"assertion":[{"value":"2016-05-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}