{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T15:02:33Z","timestamp":1770994953475,"version":"3.50.1"},"reference-count":20,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2017,4,21]],"date-time":"2017-04-21T00:00:00Z","timestamp":1492732800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2017,9]]},"abstract":"<jats:p> Heterogeneity among the computational resources within a single machine has significantly increased in high performance computing to exploit the tremendous potential of graphics processing units (GPUs). Portability in terms of code development and performance has been a challenge due to major differences between GPU programming and memory models on one side and those of conventional central processing units (CPUs) on the other. Performance characteristics of compilers and processors also vary between machines. Emerging high-level directive-based programming models such as OpenACC have been proposed to address this challenge. In this work, we develop OpenACC implementations for both seismic modelling and reverse time migration algorithms that solve the isotropic, acoustic, and elastic wave equations. We employ OpenACC to take advantage of the computational power of two Nvidia GPU cards: (1) M2090 and (2) K40, residing in IBM and CRAY XC30 clusters respectively. We also explore the main aspects of hybridizing seismic modelling and reverse time migration by implementing a Message Passing Interface (MPI)+OpenACC approach. 
We expose various mapping techniques to develop a portable code that maximizes performance regardless of compiler or platform. Depending on the intensity of the computations, different propagators exhibited different speedup behaviours against a full socket CPU MPI implementation. A performance enhancement of ~10\u00d7 was obtained when the acoustic model was ported to a single GPU, compared with a 1.7\u00d7 speedup obtained using the isotropic model. Our MPI+OpenACC implementation of reverse time migration and seismic modelling shows promising scaling when multiple GPUs are used. <\/jats:p>","DOI":"10.1177\/1094342016675678","type":"journal-article","created":{"date-parts":[[2017,4,21]],"date-time":"2017-04-21T11:21:46Z","timestamp":1492773706000},"page":"422-440","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":16,"title":["Performance portability in reverse time migration and seismic modelling via OpenACC"],"prefix":"10.1177","volume":"31","author":[{"given":"Ahmad","family":"Qawasmeh","sequence":"first","affiliation":[{"name":"Department of Computer Science, The Hashemite University, Zarqa, Jordan"}]},{"given":"Maxime R","family":"Hugues","sequence":"additional","affiliation":[{"name":"Advanced Computing Department, TOTAL E&P R&T, Houston, USA"}]},{"given":"Henri","family":"Calandra","sequence":"additional","affiliation":[{"name":"Advanced Computing Department, TOTAL E&P R&T, Houston, USA"}]},{"given":"Barbara M","family":"Chapman","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Houston, Houston, USA"}]}],"member":"179","published-online":{"date-parts":[[2017,4,21]]},"reference":[{"key":"bibr1-1094342016675678","doi-asserted-by":"crossref","unstructured":"Abdelkhalek R, Calandra H, Coulaud O, et al. (2009) Fast seismic modeling and reverse time migration on a GPU cluster. 
In: International conference on high performance computing and simulation, Leipzig, Germany, June 2009, pp.36\u201343. USA: IEEE. Available at: http:\/\/ieeexplore.ieee.org","DOI":"10.1109\/HPCSIM.2009.5192786"},{"key":"bibr2-1094342016675678","unstructured":"Advisory Review Board (ARB) O (2014) The OpenMP API specification for parallel programming. Available at: http:\/\/openmp.org\/wp\/."},{"key":"bibr3-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1190\/1.1441434"},{"key":"bibr4-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1006\/jcph.1994.1159"},{"key":"bibr5-1094342016675678","author":"Bland AS","year":"2012","journal-title":"Proceedings of cray user group conference (CUG 2012)"},{"key":"bibr6-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1190\/1.1500393"},{"key":"bibr7-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1190\/1.1442041"},{"key":"bibr8-1094342016675678","unstructured":"CUDA N (2007) Compute Unified Device Architecture Programming Guide. Available at: http:\/\/developer.download.nvidia.com\/compute\/cuda\/1.0\/NVIDIA_CUDA_Programming_Guide_1.0.pdf"},{"key":"bibr9-1094342016675678","unstructured":"CAPS Enterprise (2010) HMPP directives. Available at: https:\/\/www.olcf.ornl.gov\/wp-content\/uploads\/2012\/02\/HMPPWorkbench-3.0_HMPP_Directives_ReferenceManual.pdf."},{"key":"bibr10-1094342016675678","doi-asserted-by":"crossref","unstructured":"Feki S, Al\u2013Jarro A, Bagci H (2013) Multi-GPU-based acceleration of the explicit time domain volume integral equation solver using MPI-OpenACC. In: Radio science meeting (joint with AP-S symposium), Egypt, 2013, pp.90\u201390. USA: IEEE. Available at: http:\/\/ieeexplore.ieee.org","DOI":"10.1109\/USNC-URSI.2013.6715396"},{"key":"bibr11-1094342016675678","doi-asserted-by":"crossref","unstructured":"Ghosh S, Liao T, Calandra H, et al. (2012) Experiences with OpenMP, PGI, HMPP and OpenACC directives on ISO\/TTI kernels. 
In: High Performance Computing, Networking, Storage and Analysis. USA: IEEE, pp.691\u2013700. Available at: http:\/\/ieeexplore.ieee.org","DOI":"10.1109\/SC.Companion.2012.95"},{"key":"bibr12-1094342016675678","doi-asserted-by":"crossref","unstructured":"Herdman J, Gaudin W, McIntosh\u2013Smith S, et al. (2012) Accelerating hydrocodes with OpenACC, OpenCL and CUDA. In: High Performance Computing, Networking, Storage and Analysis. USA: IEEE, pp.465\u2013471. Available at: http:\/\/ieeexplore.ieee.org","DOI":"10.1109\/SC.Companion.2012.66"},{"key":"bibr13-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.69"},{"key":"bibr14-1094342016675678","unstructured":"Munshi A, Gaster B, Mattson TG, et al. (2011) OpenCL Programming Guide. Pearson Education. Available at: http:\/\/ptgmedia.pearsoncmg.com\/images\/9780321749642\/samplepages\/0321749642.pdf"},{"key":"bibr15-1094342016675678","unstructured":"OpenACC-Standardorg (2013) The OpenACC application programming interface. Available at: http:\/\/www.openacc.org\/sites\/default\/files\/OpenACC.2.0a_1.pdf."},{"key":"bibr16-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1145\/2712386.2712401"},{"key":"bibr17-1094342016675678","first-page":"511","volume-title":"Supercomputing","author":"Siddiqui S","year":"2014"},{"key":"bibr18-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1190\/1.3627917"},{"key":"bibr19-1094342016675678","unstructured":"The Portland Group (2010) PGI accelerator programming model for Fortran & C. 
Available at: http:\/\/www.pgroup.com\/lit\/whitepapers\/pgi_accel_prog_model_1.3.pdf."},{"key":"bibr20-1094342016675678","doi-asserted-by":"publisher","DOI":"10.1145\/1735688.1735697"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016675678","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342016675678","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016675678","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T13:23:53Z","timestamp":1740835433000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342016675678"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,4,21]]},"references-count":20,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2017,9]]}},"alternative-id":["10.1177\/1094342016675678"],"URL":"https:\/\/doi.org\/10.1177\/1094342016675678","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,4,21]]}}}