{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T10:44:35Z","timestamp":1779360275240,"version":"3.51.4"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"14","license":[{"start":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T00:00:00Z","timestamp":1717632000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T00:00:00Z","timestamp":1717632000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/100031060","name":"European High Performance Computing Joint Undertaking","doi-asserted-by":"publisher","award":["101092621"],"award-info":[{"award-number":["101092621"]}],"id":[{"id":"10.13039\/100031060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2024,9]]},"DOI":"10.1007\/s11227-024-06254-y","type":"journal-article","created":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T08:02:21Z","timestamp":1717660941000},"page":"21094-21127","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["OpenMP offload toward the exascale using Intel\u00ae GPU Max 1550: evaluation of STREAmS compressible 
solver"],"prefix":"10.1007","volume":"80","author":[{"given":"Francesco","family":"Salvadore","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Giacomo","family":"Rossi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Srikanth","family":"Sathyanarayana","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Matteo","family":"Bernardini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,6]]},"reference":[{"key":"6254_CR1","unstructured":"TOP500 (2023). https:\/\/www.top500.org\/lists\/top500\/2023\/11\/. Accessed 5 March 2024"},{"key":"6254_CR2","unstructured":"EUROHPC JU (2024). https:\/\/eurohpc-ju.europa.eu\/about\/our-supercomputers_en. Accessed 5 March 2024"},{"key":"6254_CR3","unstructured":"LUMI (2024). https:\/\/lumi-supercomputer.eu\/. Accessed 5 March 2024"},{"key":"6254_CR4","unstructured":"LEONARDO (2024). https:\/\/leonardo-supercomputer.cineca.eu\/. Accessed 5 March 2024"},{"key":"6254_CR5","unstructured":"CUDA, 2023 (2024). https:\/\/docs.nvidia.com\/cuda\/cuda-c-programming-guide\/. Accessed 25 Feb 2023"},{"key":"6254_CR6","doi-asserted-by":"publisher","unstructured":"Jacobsen D, Thibault J, Senocak I (2010) An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters. In: 48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition. American Institute of Aeronautics and Astronautics, Orlando, Florida. 
https:\/\/doi.org\/10.2514\/6.2010-522","DOI":"10.2514\/6.2010-522"},{"key":"6254_CR7","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1016\/j.cpc.2018.03.026","volume":"229","author":"X Zhu","year":"2018","unstructured":"Zhu X, Phillips E, Spandan V, Donners J, Ruetsch G, Romero J, Ostilla-M\u00f3nico R, Yang Y, Lohse D, Verzicco R, Fatica M, Stevens RJAM (2018) AFiD-GPU: A versatile Navier\u2013Stokes solver for wall-bounded turbulent flows on GPU clusters. Comput Phys Commun 229:199\u2013210. https:\/\/doi.org\/10.1016\/j.cpc.2018.03.026","journal-title":"Comput Phys Commun"},{"issue":"9","key":"6254_CR8","doi-asserted-by":"publisher","first-page":"9604","DOI":"10.1007\/s11227-022-05020-2","volume":"79","author":"J Wei","year":"2023","unstructured":"Wei J, Jiang J, Liu H, Zhang F, Lin P, Wang P, Yu Y, Chi X, Zhao L, Ding M, Li Y, Yu Z, Zheng W, Wang Y (2023) LICOM3-CUDA: a GPU version of LASG\/IAP climate system ocean model version 3 based on CUDA. J Supercomput 79(9):9604\u20139634. https:\/\/doi.org\/10.1007\/s11227-022-05020-2","journal-title":"J Supercomput"},{"key":"6254_CR9","unstructured":"kokkos (2024). https:\/\/github.com\/kokkos\/kokkos. Accessed 5 March 2024"},{"key":"6254_CR10","unstructured":"RAJA (2024). https:\/\/computing.llnl.gov\/projects\/raja-managing-application-portability-next-generation-platforms. Accessed 5 March 2024"},{"key":"6254_CR11","unstructured":"alpaka (2024). https:\/\/github.com\/alpaka-group\/alpaka. Accessed 5 March 2024"},{"key":"6254_CR12","unstructured":"OpenMP, 2024 (2024). https:\/\/www.openmp.org\/wp-content\/uploads\/OpenMP-API-Specification-5-2.pdf\/. Accessed 25 Feb 2024"},{"key":"6254_CR13","unstructured":"OpenACC, 2024 (2024). https:\/\/docs.nvidia.com\/hpc-sdk\/compilers\/openacc-gs\/. Accessed 25 Feb 2024"},{"key":"6254_CR14","unstructured":"SYCL 2020 Specification (revision 8) (2024). https:\/\/registry.khronos.org\/SYCL\/specs\/sycl-2020\/pdf\/sycl-2020.pdf. 
Accessed 24 March 2024"},{"key":"6254_CR15","unstructured":"ISO\/IEC: Programming Languages\u2014Technical Specification for C++ Extensions for Parallelism. Technical report (2015)"},{"key":"6254_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-024-05907-2","author":"M Costanzo","year":"2024","unstructured":"Costanzo M, Rucci E, Garc\u00eda-Sanchez C, Naiouf M, Prieto-Mat\u00edas M (2024) Assessing opportunities of SYCL for biological sequence alignment on GPU-based systems. J Supercomput. https:\/\/doi.org\/10.1007\/s11227-024-05907-2","journal-title":"J Supercomput"},{"key":"6254_CR17","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-024-06011-1","author":"G Malenza","year":"2024","unstructured":"Malenza G, Cesare V, Aldinucci M, Becciani U, Vecchiato A (2024) Toward HPC application portability via C++ PSTL: the Gaia AVU-GSR code assessment. J Supercomput. https:\/\/doi.org\/10.1007\/s11227-024-06011-1","journal-title":"J Supercomput"},{"key":"6254_CR18","unstructured":"HIP: C++ Heterogeneous-Compute Interface for Portability, 2023 (2024). https:\/\/github.com\/ROCm-Developer-Tools\/HIP\/. Accessed 25 Feb 2023"},{"key":"6254_CR19","unstructured":"Jansson N, Karp M, Podobas A, Markidis S, Schlatter P (2021) Neko: A modern, portable, and scalable framework for high-fidelity computational fluid dynamics. arXiv preprint arXiv:2107.01243"},{"issue":"6","key":"6254_CR20","doi-asserted-by":"publisher","DOI":"10.1063\/5.0046327","volume":"28","author":"K Germaschewski","year":"2021","unstructured":"Germaschewski K, Allen B, Dannert T, Hrywniak M, Donaghy J, Merlo G, Ethier S, D\u2019Azevedo E, Jenko F, Bhattacharjee A (2021) Toward exascale whole-device modeling of fusion devices: porting the GENE gyrokinetic microturbulence code to GPU. 
Phys Plasmas 28(6):062501","journal-title":"Phys Plasmas"},{"issue":"20","key":"6254_CR21","doi-asserted-by":"publisher","first-page":"6992","DOI":"10.1021\/acs.jctc.3c00249","volume":"19","author":"I Carnimeo","year":"2023","unstructured":"Carnimeo I, Affinito F, Baroni S, Baseggio O, Bellentani L, Bertossa R, Delugas PD, Ruffino FF, Orlandini S, Spiga F, Giannozzi P (2023) Quantum ESPRESSO: one further step toward the Exascale. J Chem Theory Comput 19(20):6992\u20137006","journal-title":"J Chem Theory Comput"},{"issue":"6","key":"6254_CR22","doi-asserted-by":"publisher","DOI":"10.1088\/1361-651X\/acdf06","volume":"31","author":"V Gavini","year":"2023","unstructured":"Gavini V, Baroni S, Blum V, Bowler DR, Buccheri A, Chelikowsky JR, Das S, Dawson W, Delugas P, Dogan M et al (2023) Roadmap on electronic structure codes in the exascale era. Modell Simul Mater Sci Eng 31(6):063301","journal-title":"Modell Simul Mater Sci Eng"},{"key":"6254_CR23","doi-asserted-by":"publisher","first-page":"502","DOI":"10.1016\/j.camwa.2020.01.002","volume":"81","author":"P Costa","year":"2021","unstructured":"Costa P, Phillips E, Brandt L, Fatica M (2021) GPU acceleration of CaNS for massively-parallel direct numerical simulations of canonical fluid flows. Comput Math Appl 81:502\u2013511. https:\/\/doi.org\/10.1016\/j.camwa.2020.01.002","journal-title":"Comput Math Appl"},{"key":"6254_CR24","doi-asserted-by":"crossref","unstructured":"Zubair M, Walden A, Nastac G, Nielsen E, Bauinger C, Zhu X (2023) Optimization of ported CFD kernels on Intel Data Center GPU Max 1550 using oneAPI ESIMD. In: Proceedings of the SC\u201923 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis. SC-W\u201923. 
Association for Computing Machinery, New York, pp 1705\u20131712","DOI":"10.1145\/3624062.3624251"},{"key":"6254_CR25","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-024-05989-y","author":"H Owen","year":"2024","unstructured":"Owen H, Lehmkuhl O, D\u2019Ambra P, Durastante F, Filippone S (2024) Alya toward exascale: algorithmic scalability using PSCToolkit. J Supercomput. https:\/\/doi.org\/10.1007\/s11227-024-05989-y","journal-title":"J Supercomput"},{"key":"6254_CR26","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2021.107906","volume":"263","author":"M Bernardini","year":"2021","unstructured":"Bernardini M, Modesti D, Salvadore F, Pirozzoli S (2021) STREAmS: a high-fidelity accelerated solver for direct numerical simulation of compressible turbulent flows. Comput Phys Commun 263:107906. https:\/\/doi.org\/10.1016\/j.cpc.2021.107906","journal-title":"Comput Phys Commun"},{"key":"6254_CR27","doi-asserted-by":"publisher","unstructured":"Bernardini M, Modesti D, Salvadore F, Sathyanarayana S, Della\u00a0Posta G, Pirozzoli S (2023) STREAmS-2.0: Supersonic turbulent accelerated Navier\u2013Stokes solver version 2.0. Comput Phys Commun 108644. https:\/\/doi.org\/10.1016\/j.cpc.2022.108644","DOI":"10.1016\/j.cpc.2022.108644"},{"key":"6254_CR28","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1017\/jfm.2022.393","volume":"942","author":"D Modesti","year":"2022","unstructured":"Modesti D, Sathyanarayana S, Salvadore F, Bernardini M (2022) Direct numerical simulation of supersonic turbulent flows over rough surfaces. J Fluid Mech 942:44. https:\/\/doi.org\/10.1017\/jfm.2022.393","journal-title":"J Fluid Mech"},{"key":"6254_CR29","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1017\/jfm.2022.1038","volume":"954","author":"M Bernardini","year":"2023","unstructured":"Bernardini M, Della Posta G, Salvadore F, Martelli E (2023) Unsteadiness characterisation of shock wave\/turbulent boundary-layer interaction at moderate Reynolds number. 
J Fluid Mech 954:43. https:\/\/doi.org\/10.1017\/jfm.2022.1038","journal-title":"J Fluid Mech"},{"key":"6254_CR30","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevFluids.8.110508","volume":"8","author":"F Salvadore","year":"2023","unstructured":"Salvadore F, Memmolo A, Modesti D, Della Posta G, Bernardini M (2023) Direct numerical simulation of a microramp in a high-Reynolds number supersonic turbulent boundary layer. Phys Rev Fluids 8:110508. https:\/\/doi.org\/10.1103\/PhysRevFluids.8.110508","journal-title":"Phys Rev Fluids"},{"key":"6254_CR31","unstructured":"Sathyanarayana S, Bernardini M, Modesti D, Pirozzoli S, Salvadore F (2023) High-speed turbulent flows towards the exascale: STREAmS-2 porting and performance. Preprint at https:\/\/arxiv.org\/abs\/2304.05494"},{"key":"6254_CR32","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1017\/S0022112010001710","volume":"657","author":"S Pirozzoli","year":"2010","unstructured":"Pirozzoli S, Bernardini M, Grasso F (2010) Direct numerical simulation of transonic shock\/boundary layer interaction under conditions of incipient separation. J Fluid Mech 657:361\u2013393. https:\/\/doi.org\/10.1017\/S0022112010001710","journal-title":"J Fluid Mech"},{"key":"6254_CR33","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2022.111494","volume":"468","author":"Y Tamaki","year":"2022","unstructured":"Tamaki Y, Kuya Y, Kawai S (2022) Comprehensive analysis of entropy conservation property of non-dissipative schemes for compressible flows: KEEP scheme redefined. J Comput Phys 468:111494. https:\/\/doi.org\/10.1016\/j.jcp.2022.111494","journal-title":"J Comput Phys"},{"key":"6254_CR34","unstructured":"OpenMP, 2013 (2013) https:\/\/www.openmp.org\/wp-content\/uploads\/OpenMP4.0.0.pdf\/. 
Accessed 25 Feb 2024"},{"key":"6254_CR35","doi-asserted-by":"publisher","unstructured":"Bercea G-T, Bertolli C, Antao SF, Jacob AC, Eichenberger AE, Chen T, Sura Z, Sung H, Rokos G, Appelhans D, O\u2019Brien K (2015) Performance analysis of OpenMP on a GPU using a CORAL proxy application. In: Proceedings of the 6th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computing Systems. PMBS\u201915. Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/2832087.2832089","DOI":"10.1145\/2832087.2832089"},{"key":"6254_CR36","unstructured":"Larrea VV, Joubert W, Lopez MG, Hernandez O (2016) Early experiences writing performance portable OpenMP 4 codes. In: Proc. Cray User Group Meeting, London, England"},{"key":"6254_CR37","doi-asserted-by":"publisher","unstructured":"Martineau M, McIntosh-Smith S, Gaudin W (2016) Evaluating OpenMP 4.0\u2019s effectiveness as a heterogeneous parallel programming model. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp 338\u2013347. https:\/\/doi.org\/10.1109\/IPDPSW.2016.70","DOI":"10.1109\/IPDPSW.2016.70"},{"key":"6254_CR38","doi-asserted-by":"publisher","unstructured":"\u00d6zen G, Atzeni S, Wolfe M, Southwell A, Klimowicz G (2018) OpenMP GPU Offload in Flang and LLVM. In: 2018 IEEE\/ACM 5th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), pp 1\u20139. https:\/\/doi.org\/10.1109\/LLVM-HPC.2018.8639434","DOI":"10.1109\/LLVM-HPC.2018.8639434"},{"key":"6254_CR39","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1007\/978-3-031-40843-4_28","volume-title":"High Performance Computing","author":"Y Fridman","year":"2023","unstructured":"Fridman Y, Tamir G, Oren G (2023) Portability and scalability of OpenMP offloading on state-of-the-art accelerators. In: Bienz A, Weiland M, Baboulin M, Kruse C (eds) High Performance Computing. 
Springer, Cham, pp 378\u2013390"},{"key":"6254_CR40","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2021.102856","volume":"109","author":"S Bak","year":"2022","unstructured":"Bak S, Bertoni C, Boehm S, Budiardja R, Chapman BM, Doerfert J, Eisenbach M, Finkel H, Hernandez O, Huber J, Iwasaki S, Kale V, Kent PRC, Kwack J, Lin M, Luszczek P, Luo Y, Pham B, Pophale S, Ravikumar K, Sarkar V, Scogland T, Tian S, Yeung PK (2022) OpenMP application experiences: porting to accelerated nodes. Parallel Comput 109:102856. https:\/\/doi.org\/10.1016\/j.parco.2021.102856","journal-title":"Parallel Comput"},{"issue":"2","key":"6254_CR41","doi-asserted-by":"publisher","first-page":"2381","DOI":"10.1007\/s11227-023-05422-w","volume":"80","author":"H Guo","year":"2023","unstructured":"Guo H, Zhang L, Zhang Y, Li J, Xu X, Liu L, Cai K, Wu D, Yang S, Kong L, Gao X (2023) OpenMP offloading data transfer optimization for DCUs. J Supercomput 80(2):2381\u20132402. https:\/\/doi.org\/10.1007\/s11227-023-05422-w","journal-title":"J Supercomput"},{"key":"6254_CR42","doi-asserted-by":"publisher","unstructured":"Tian S, Scogland T, Chapman B, Doerfert J (2023) OpenMP kernel language extensions for performance portable GPU codes. In: Proceedings of the SC\u201923 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis. SC-W\u201923. Association for Computing Machinery, New York, pp 876\u2013883. https:\/\/doi.org\/10.1145\/3624062.3624164","DOI":"10.1145\/3624062.3624164"},{"key":"6254_CR43","unstructured":"GPUFORT, 2021 (2021). https:\/\/github.com\/ROCmSoftwarePlatform\/gpufort\/. Accessed 25 Feb 2023"},{"issue":"1","key":"6254_CR44","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1109\/l-ca.2013.6","volume":"13","author":"A Ilic","year":"2014","unstructured":"Ilic A, Pratas F, Sousa L (2014) Cache-aware roofline model: upgrading the loft. IEEE Comput Archit Lett 13(1):21\u201324. 
https:\/\/doi.org\/10.1109\/l-ca.2013.6","journal-title":"IEEE Comput Archit Lett"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06254-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-024-06254-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06254-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,2]],"date-time":"2024-08-02T14:01:32Z","timestamp":1722607292000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-024-06254-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,6]]},"references-count":44,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["6254"],"URL":"https:\/\/doi.org\/10.1007\/s11227-024-06254-y","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"value":"0920-8542","type":"print"},{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,6]]},"assertion":[{"value":"19 May 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}