{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T04:08:10Z","timestamp":1774325290295,"version":"3.50.1"},"reference-count":7,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2015,3,30]],"date-time":"2015-03-30T00:00:00Z","timestamp":1427673600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2015,8]]},"abstract":"<jats:p> We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems, and indicated possible kernel optimizations. To port NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The na\u00efve implementation using OpenACC leads to little performance improvement: on a single node, from 16 Gflops obtained with the version without OpenACC, we reached 20 Gflops with the na\u00efve OpenACC implementation. An optimized NekBone version leads to a 43 Gflop performance on a single node. In addition, we ported and optimized NekBone to parallel GPU systems, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory. <\/jats:p>","DOI":"10.1177\/1094342015576846","type":"journal-article","created":{"date-parts":[[2015,4,1]],"date-time":"2015-04-01T15:17:05Z","timestamp":1427901425000},"page":"311-319","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":30,"title":["OpenACC acceleration of the Nek5000 spectral element code"],"prefix":"10.1177","volume":"29","author":[{"given":"Stefano","family":"Markidis","sequence":"first","affiliation":[{"name":"HPCViz Department, KTH Royal Institute of Technology, Sweden"}]},{"given":"Jing","family":"Gong","sequence":"additional","affiliation":[{"name":"HPCViz Department, KTH Royal Institute of Technology, Sweden"}]},{"given":"Michael","family":"Schliephake","sequence":"additional","affiliation":[{"name":"HPCViz Department, KTH Royal Institute of Technology, Sweden"}]},{"given":"Erwin","family":"Laure","sequence":"additional","affiliation":[{"name":"HPCViz Department, KTH Royal Institute of Technology, Sweden"}]},{"given":"Alistair","family":"Hart","sequence":"additional","affiliation":[{"name":"Cray Exascale Research Initiative Europe, UK"}]},{"given":"David","family":"Henty","sequence":"additional","affiliation":[{"name":"Edinburgh Parallel Computing Centre, Edinburgh University, UK"}]},{"given":"Katherine","family":"Heisey","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, USA"}]},{"given":"Paul","family":"Fischer","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, USA"}]}],"member":"179","published-online":{"date-parts":[[2015,3,30]]},"reference":[{"key":"bibr1-1094342015576846","first-page":"557","volume-title":"Applications, Tools and Techniques on the Road to Exascale Computing","volume":"22","author":"Ansaloni R","year":"2011"},{"key":"bibr2-1094342015576846","doi-asserted-by":"publisher","DOI":"10.1088\/1749-4699\/2\/1\/015001"},{"key":"bibr3-1094342015576846","doi-asserted-by":"publisher","DOI":"10.1177\/1094342009347714"},{"key":"bibr4-1094342015576846","first-page":"167","volume-title":"Applications, Tools and Techniques on the Road to Exascale Computing","volume":"22","author":"Gray A","year":"2011"},{"key":"bibr5-1094342015576846","unstructured":"Kogge P, Bergman K, Borkar S, Campbell D, Carson W, Dally W, (2008) Exascale computing study: Technology challenges in achieving exascale systems. Report, Defense Advanced Research Projects Agency Information Processing Techniques Office."},{"key":"bibr6-1094342015576846","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9991(84)90128-1"},{"key":"bibr7-1094342015576846","doi-asserted-by":"publisher","DOI":"10.1145\/331532.331599"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342015576846","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342015576846","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342015576846","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T01:15:13Z","timestamp":1740964513000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342015576846"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,30]]},"references-count":7,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,8]]}},"alternative-id":["10.1177\/1094342015576846"],"URL":"https:\/\/doi.org\/10.1177\/1094342015576846","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,3,30]]}}}