{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:23:52Z","timestamp":1759134232814,"version":"3.41.0"},"reference-count":20,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2012,10,8]],"date-time":"2012-10-08T00:00:00Z","timestamp":1349654400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMETRICS Perform. Eval. Rev."],"published-print":{"date-parts":[[2012,10,8]]},"abstract":"<jats:p>The Gemini interconnect on the Cray XE6 platform provides for lightweight remote direct memory access (RDMA) between nodes, which is useful for implementing partitioned global address space (PGAS) languages like UPC and Co-Array Fortran. In this paper, we perform a study of Gemini performance using a set of communication microbenchmarks and compare the performance of one-sided communication in PGAS languages with two-sided MPI. Our results demonstrate the performance benefits of the PGAS model on Gemini hardware, showing in what circumstances and by how much one-sided communication outperforms two-sided in terms of messaging rate, aggregate bandwidth, and computation and communication overlap capability. For example, for 8-byte and 2KB messages the one-sided messaging rate is 5 and 10 times greater, respectively, than the two-sided one. 
The study also reveals important information about how to optimize one-sided Gemini communication.<\/jats:p>","DOI":"10.1145\/2381056.2381077","type":"journal-article","created":{"date-parts":[[2012,10,11]],"date-time":"2012-10-11T14:55:16Z","timestamp":1349967316000},"page":"92-98","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["A preliminary evaluation of the hardware acceleration of the Cray Gemini interconnect for PGAS languages and comparison with MPI"],"prefix":"10.1145","volume":"40","author":[{"given":"Hongzhang","family":"Shan","sequence":"first","affiliation":[{"name":"CRD and NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA"}]},{"given":"Nicholas J.","family":"Wright","sequence":"additional","affiliation":[{"name":"CRD and NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA"}]},{"given":"John","family":"Shalf","sequence":"additional","affiliation":[{"name":"CRD and NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA"}]},{"given":"Katherine","family":"Yelick","sequence":"additional","affiliation":[{"name":"CRD and NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA"}]},{"given":"Marcus","family":"Wagner","sequence":"additional","affiliation":[{"name":"Cray Inc. 380 Jackson Street, Paul, MN"}]},{"given":"Nathan","family":"Wichmann","sequence":"additional","affiliation":[{"name":"Cray Inc. 380 Jackson Street, Paul, MN"}]}],"member":"320","published-online":{"date-parts":[[2012,10,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"CUG 2010","author":"Alam S.","year":"2010","unstructured":"S. Alam , W. Sawyer , T. Stitt , N. Stringfellow , and A. Tineo . Evaluation of productivity and performance characteristics of CCE CAF and UPC compilers . In CUG 2010 , Edinburgh, Scotland , May 2010 . S. Alam, W. Sawyer, T. Stitt, N. Stringfellow, and A. Tineo. Evaluation of productivity and performance characteristics of CCE CAF and UPC compilers. 
In CUG 2010, Edinburgh, Scotland, May 2010."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1155\/2001\/829792"},{"key":"e_1_2_1_3_1","volume-title":"The 48th Cray User Group meeting","author":"Barrett R.","year":"2006","unstructured":"R. Barrett . Co-array fortran experiences with finite differencing methods . In The 48th Cray User Group meeting , Lugano, Italy , May 2006 . R. Barrett. Co-array fortran experiences with finite differencing methods. In The 48th Cray User Group meeting, Lugano, Italy, May 2006."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/1898953.1899016"},{"key":"e_1_2_1_6_1","volume-title":"In Proc. of the 16th Intl. Workshop on Languages and Compilers for Parallel Computing","author":"Coarfa C.","year":"2003","unstructured":"C. Coarfa , Y. Dotsenko , J. Eckhardt , and J. M. Crummey . Co-Array fortran performance and potential: An NPB experimental study . In In Proc. of the 16th Intl. Workshop on Languages and Compilers for Parallel Computing , 2003 . C. Coarfa, Y. Dotsenko, J. Eckhardt, and J. M. Crummey. Co-Array fortran performance and potential: An NPB experimental study. In In Proc. of the 16th Intl. Workshop on Languages and Compilers for Parallel Computing, 2003."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-006-7952-7"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/762761.762821"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-75416-9_2"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1809961.1809973"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1809961.1809969"},{"key":"e_1_2_1_12_1","unstructured":"Osu micro-benchmark. http:\/\/mvapich.cse.ohio-state.edu\/benchmarks\/.  Osu micro-benchmark. 
http:\/\/mvapich.cse.ohio-state.edu\/benchmarks\/."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161076"},{"key":"e_1_2_1_14_1","unstructured":"NAS Parallel Benchmarks. http:\/\/www.nas.nasa.gov\/Resources\/Software\/npb.html.  NAS Parallel Benchmarks. http:\/\/www.nas.nasa.gov\/Resources\/Software\/npb.html."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2005.03.003"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/289918.289920"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2071033"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/645783.666998"},{"key":"e_1_2_1_19_1","volume-title":"Scientific Programming-Exploring Languages for Expressing Medium to Massive On-Chip Parallelism","author":"Shan H.","year":"2010","unstructured":"H. Shan , F. Blagojevic , S. J. Min , P. Hargrove , H. Jin , K. Fuerlinger , A. Koniges , and N. J. Wright . A programming model performance study using the nas parallel benchmarks . In Scientific Programming-Exploring Languages for Expressing Medium to Massive On-Chip Parallelism , Vol. 18 , Issue 3-4, August 2010 . H. Shan, F. Blagojevic, S. J. Min, P. Hargrove, H. Jin, K. Fuerlinger, A. Koniges, and N. J. Wright. A programming model performance study using the nas parallel benchmarks. In Scientific Programming-Exploring Languages for Expressing Medium to Massive On-Chip Parallelism, Vol. 18, Issue 3-4, August 2010."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362671"},{"key":"e_1_2_1_21_1","unstructured":"Challenges for the message passing interface in the petaflops era. www.cs.uiuc.edu\/homes\/wgropp\/bib\/talks\/tdata\/2007\/mpifuture-uiuc.pdf.  Challenges for the message passing interface in the petaflops era. 
www.cs.uiuc.edu\/homes\/wgropp\/bib\/talks\/tdata\/2007\/mpifuture-uiuc.pdf."}],"container-title":["ACM SIGMETRICS Performance Evaluation Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2381056.2381077","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2381056.2381077","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:00:40Z","timestamp":1750276840000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2381056.2381077"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,10,8]]},"references-count":20,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,10,8]]}},"alternative-id":["10.1145\/2381056.2381077"],"URL":"https:\/\/doi.org\/10.1145\/2381056.2381077","relation":{},"ISSN":["0163-5999"],"issn-type":[{"type":"print","value":"0163-5999"}],"subject":[],"published":{"date-parts":[[2012,10,8]]},"assertion":[{"value":"2012-10-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}