{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T05:35:55Z","timestamp":1741066555088,"version":"3.38.0"},"reference-count":34,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2017,3,26]],"date-time":"2017-03-26T00:00:00Z","timestamp":1490486400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2019,1]]},"abstract":"<jats:p> To improve productivity for developing parallel applications on high performance computing systems, the XcalableMP PGAS language has been proposed. XcalableMP supports both a typical parallelization under the \u201cglobal-view memory model\u201d which uses directives and a flexible parallelization under the \u201clocal-view memory model\u201d which uses coarray features. The goal of the present paper is to clarify XcalableMP\u2019s productivity and performance. To do so, we implement and evaluate the high performance computing challenge benchmark, namely, EP STREAM Triad, High Performance Linpack, Global fast Fourier transform, and RandomAccess on the K computer using up to 16,384 compute nodes and a generic cluster system using up to 128 compute nodes. We found that we could more easily implement the benchmarks using XcalableMP rather than using MPI. Moreover, most of the performance results using XcalableMP were almost the same as those using MPI. 
<\/jats:p>","DOI":"10.1177\/1094342017698214","type":"journal-article","created":{"date-parts":[[2017,3,27]],"date-time":"2017-03-27T06:10:05Z","timestamp":1490595005000},"page":"110-123","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":4,"title":["Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language"],"prefix":"10.1177","volume":"33","author":[{"given":"Masahiro","family":"Nakao","sequence":"first","affiliation":[{"name":"RIKEN Advanced Institute for Computational Science, Japan"}]},{"given":"Hitoshi","family":"Murai","sequence":"additional","affiliation":[{"name":"RIKEN Advanced Institute for Computational Science, Japan"}]},{"given":"Hidetoshi","family":"Iwashita","sequence":"additional","affiliation":[{"name":"RIKEN Advanced Institute for Computational Science, Japan"}]},{"given":"Taisuke","family":"Boku","sequence":"additional","affiliation":[{"name":"Center for Computational Sciences, University of Tsukuba, Japan"}]},{"given":"Mitsuhisa","family":"Sato","sequence":"additional","affiliation":[{"name":"RIKEN Advanced Institute for Computational Science, Japan"}]}],"member":"179","published-online":{"date-parts":[[2017,3,26]]},"reference":[{"key":"bibr1-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2009.370"},{"key":"bibr2-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1007\/BF00162341"},{"key":"bibr3-1094342017698214","unstructured":"Basic Linear Algebra Subprograms (2016) BLAS (Basic Linear Algebra Subprograms). Available at: http:\/\/www.netlib.org\/blas\/ (accessed 26 August 2016)."},{"key":"bibr4-1094342017698214","unstructured":"Bonachea D (2002) GASNet specification. Report CSD-02-1207, University of California, USA."},{"key":"bibr5-1094342017698214","unstructured":"Chamberlain B, Choi SE, Dumler M, (2012) Chapel HPC challenge entry. 
Available at: http:\/\/www.hpcchallenge.org\/presentations\/sc2012\/ChapelHPCC2012.pdf (accessed 26 August 2016)."},{"key":"bibr6-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1177\/1094342007078442"},{"key":"bibr7-1094342017698214","doi-asserted-by":"publisher","DOI":"10.21236\/ADA439315"},{"key":"bibr8-1094342017698214","unstructured":"Dongarra J, Luszczek P (2005) HPC challenge awards class 2 specification. Available at: http:\/\/www.hpcchallenge.org\/class2specs.pdf (accessed 26 August 2016)."},{"key":"bibr9-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2002.10034"},{"key":"bibr10-1094342017698214","unstructured":"HPC Challenge Benchmarks (2014) HPC Challenge. Available at: http:\/\/icl.cs.utk.edu\/hpcc\/ (accessed 26 August 2016)."},{"key":"bibr11-1094342017698214","unstructured":"HPL Algorithm Panel Broadcast (2016) HPL Algorithm. Available at: http:\/\/www.netlib.org\/benchmark\/hpl\/algorithm.html (accessed 26 August 2016)."},{"key":"bibr12-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.104"},{"key":"bibr13-1094342017698214","first-page":"1","volume-title":"Proceedings of the 3rd ACM SIGPLAN conference on history of programming languages","author":"Ken K","year":"2007"},{"key":"bibr14-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1063\/1.4823319"},{"volume-title":"MPI: The Complete Reference, Volume 1. 
The MPI Core","year":"1998","author":"Marc S","key":"bibr15-1094342017698214"},{"key":"bibr16-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid.2012.118"},{"key":"bibr17-1094342017698214","first-page":"157","volume-title":"7th international conference on PGAS programming model","author":"Nakao M","year":"2013"},{"key":"bibr18-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/WACCPD.2014.6"},{"key":"bibr19-1094342017698214","unstructured":"Nakao M, Murai H, Iwashita H, (2014b) XcalableMP and XcalableACC for productivity and performance in HPC challenge award competition. Available at: http:\/\/www.hpcchallenge.org\/presentations\/sc2014\/hpcc2014-presentation.pdf (accessed 26 August 2016)."},{"key":"bibr20-1094342017698214","unstructured":"Numwich R, Reid J (1998) Co-Array Fortran for parallel programming. Technical Report RAL-TR-1998-060, Rutherford Appleton Laboratory, UK."},{"key":"bibr21-1094342017698214","first-page":"53","volume-title":"Proceedings of the 19th ACM SIGPLAN symposium on principles and practice of parallel programming","author":"Olivier T","year":"2014"},{"key":"bibr22-1094342017698214","unstructured":"Omni Compiler Project (2016) Omni Compiler Available at: http:\/\/omni-compiler.org (accessed 26 August 2016)."},{"key":"bibr23-1094342017698214","unstructured":"OpenACC-standard.org (2016) Available at: http:\/\/www.openacc.org (accessed 26 August 2016)."},{"key":"bibr24-1094342017698214","unstructured":"OpenMP.org (2016) Home - OpenMP Available at: http:\/\/openmp.org (accessed 26 August 2016)."},{"key":"bibr25-1094342017698214","unstructured":"Pccluster.org (2016) PC Cluster Consortium Available at: http:\/\/www.pccluster.org (accessed 26 August 
2016)."},{"key":"bibr26-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1145\/1103845.1094852"},{"key":"bibr27-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/FMPC.1992.234899"},{"issue":"3","key":"bibr28-1094342017698214","first-page":"324","volume":"48","author":"Shida N","year":"2012","journal-title":"FUJITSU Scientific & Technical Journal"},{"key":"bibr29-1094342017698214","unstructured":"Takahashi D (2014) A fast fourier transform package. Available at: http:\/\/www.ffte.jp (accessed 26 August 2016)."},{"key":"bibr30-1094342017698214","doi-asserted-by":"crossref","unstructured":"Tardieu O, Grove D, Bloom B, (2012) X10 for productivity and performance at scale. Available at: http:\/\/www.hpcchallenge.org\/presentations\/sc2012\/x10-hpcc.pdf (accessed 26 August 2016).","DOI":"10.1145\/2481268.2481276"},{"key":"bibr31-1094342017698214","first-page":"57","volume-title":"The 6th AICS international symposium","author":"Tsugane K.","year":"2016"},{"key":"bibr32-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611970999"},{"key":"bibr33-1094342017698214","unstructured":"XcalableMP Specification Working Group (2014) XcalableMP Language Specification. 
Available at: http:\/\/xcalablemp.org\/download\/spec\/xmp-spec-1.2.1.pdf (accessed 26 August 2016)."},{"key":"bibr34-1094342017698214","doi-asserted-by":"publisher","DOI":"10.1109\/ISLPED.2011.5993668"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017698214","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342017698214","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017698214","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T11:08:20Z","timestamp":1741000100000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342017698214"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,3,26]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1]]}},"alternative-id":["10.1177\/1094342017698214"],"URL":"https:\/\/doi.org\/10.1177\/1094342017698214","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2017,3,26]]}}}