{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,14]],"date-time":"2026-05-14T08:20:53Z","timestamp":1778746853001,"version":"3.51.4"},"reference-count":40,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2020,5,19]],"date-time":"2020-05-19T00:00:00Z","timestamp":1589846400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:p>The aim of this study is to design and implement an asynchronous computational scheme for solving the acoustic wave propagation equation with absorbing boundary conditions (ABCs) in the context of seismic imaging applications. While the convolutional perfectly matched layer (CPML) is typically used for ABCs in the oil and gas industry, its formulation further stresses memory accesses and decreases the arithmetic intensity at the physical domain boundaries. The challenges with CPML are twofold: (1) the strong, inherent data dependencies imposed on the explicit time-stepping scheme render asynchronous time integration cumbersome and (2) the idle time is further exacerbated by the load imbalance introduced among processing units. In fact, the CPML formulation of the ABCs requires expensive synchronization points, which may hinder the parallel performance of the overall asynchronous time integration. In particular, when deployed in conjunction with the multicore-optimized wavefront diamond temporal blocking (MWD-TB) approach for the inner domain points, it results in a major performance slow down. To relax CPML\u2019s synchrony and mitigate the resulting load imbalance, we embed CPML\u2019s calculation into MWD-TB\u2019s inner loop and carry on the time integration with fine-grained computations in an asynchronous, holistic way. This comes at the price of storing transient results to alleviate dependencies from critical data hazards while maintaining the numerical accuracy of the original scheme. Performance and scalability results on various x86 architectures demonstrate the superiority of MWD-TB with CPML support against the standard spatial blocking on various grid sizes. To our knowledge, this is the first practical study that highlights the consolidation of CPML ABCs with asynchronous temporal blocking stencil computations.<\/jats:p>","DOI":"10.1177\/1094342020923027","type":"journal-article","created":{"date-parts":[[2020,5,19]],"date-time":"2020-05-19T06:58:50Z","timestamp":1589871530000},"page":"377-393","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":10,"title":["Asynchronous computations for solving the acoustic wave propagation equation"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1057-1590","authenticated-orcid":false,"given":"Kadir","family":"Akbudak","sequence":"first","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences and Engineering Division, Extreme Computing Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6897-1095","authenticated-orcid":false,"given":"Hatem","family":"Ltaief","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences and Engineering Division, Extreme Computing Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vincent","family":"Etienne","sequence":"additional","affiliation":[{"name":"Exploration and Petroleum Engineering Center\u2013Advanced Research Center, Saudi Aramco, Dhahran, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rached","family":"Abdelkhalak","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences and Engineering Division, Extreme Computing Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thierry","family":"Tonellot","sequence":"additional","affiliation":[{"name":"Exploration and Petroleum Engineering Center\u2013Advanced Research Center, Saudi Aramco, Dhahran, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Keyes","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences and Engineering Division, Extreme Computing Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,5,19]]},"reference":[{"key":"bibr1-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.107"},{"key":"bibr2-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.1441434"},{"key":"bibr3-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/1379022.1375595"},{"key":"bibr4-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1111\/j.1365-2478.1990.tb01872.x"},{"key":"bibr5-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.70"},{"key":"bibr6-1094342020923027","unstructured":"Datta K (2009) Auto-tuning stencil codes for cache-based multicore platforms. PhD Thesis, EECS Department, University of California, Berkeley."},{"key":"bibr7-1094342020923027","doi-asserted-by":"publisher","DOI":"10.3997\/2214-4609.201702324"},{"key":"bibr8-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/segam2014-0176.1"},{"key":"bibr9-1094342020923027","first-page":"699","volume":"51","author":"Fornberg B","year":"1988","journal-title":"Geophysics"},{"key":"bibr10-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088197"},{"key":"bibr11-1094342020923027","first-page":"66","volume-title":"IEEE\/ACM international symposium on code generation and optimization","author":"Grosser T","year":"2014"},{"key":"bibr12-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626414410023"},{"key":"bibr13-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2467268"},{"key":"bibr14-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304619"},{"key":"bibr15-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.3627855"},{"key":"bibr16-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.2757586"},{"key":"bibr17-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.1442422"},{"key":"bibr18-1094342020923027","unstructured":"Malas T (2015) Girih stencil optimization framework. Available at: https:\/\/github.com\/ecrc\/girih (accessed 3 May 2020)."},{"key":"bibr19-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1137\/140991133"},{"key":"bibr20-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/3155290"},{"key":"bibr21-1094342020923027","first-page":"19","volume":"2","author":"McCalpin JD","year":"1995","journal-title":"IEEE Computer Society Technical Committee on Computer Architecture Newsletter"},{"key":"bibr22-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.2"},{"key":"bibr23-1094342020923027","volume-title":"Diamond Tiling: A Tiling Framework for Time-Iterated Scientific Applications","author":"Orozco D","year":"2009"},{"key":"bibr24-1094342020923027","first-page":"77","volume-title":"International workshop on languages and compilers for parallel computing","author":"Orozco D","year":"2010"},{"key":"bibr25-1094342020923027","first-page":"2925","volume-title":"SEG technical program expanded abstracts","author":"Pasalic D","year":"2010"},{"key":"bibr26-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/eScience.2011.62"},{"key":"bibr27-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2011.47"},{"key":"bibr28-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/1989493.1989508"},{"key":"bibr29-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2017.02.022"},{"key":"bibr30-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/ICPPW.2010.38"},{"key":"bibr31-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.1442147"},{"key":"bibr32-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1190\/1.3238367"},{"key":"bibr33-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC.2009.82"},{"key":"bibr34-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"bibr35-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2000.845979"},{"key":"bibr36-1094342020923027","first-page":"3","volume-title":"3rd International workshop on polyhedral compilation techniques","author":"Wonnacott DG","year":"2013"},{"key":"bibr37-1094342020923027","unstructured":"Yang C (2018) LIKWID at NERSC. Exascale computing project (ECP) 2nd annual meeting. Available at: https:\/\/crd.lbl.gov\/assets\/Uploads\/ECP18-Roofline-3-LIKWID.pdf (accessed 1 March 2020)."},{"key":"bibr38-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1016\/j.cageo.2014.04.004"},{"key":"bibr39-1094342020923027","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126920"},{"key":"bibr40-1094342020923027","unstructured":"Zhou X (2013) Tiling optimizations for stencil computations. PhD Thesis, University of Illinois at Urbana-Champaign, Illinois."}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020923027","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342020923027","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020923027","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:15:58Z","timestamp":1777450558000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342020923027"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,19]]},"references-count":40,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["10.1177\/1094342020923027"],"URL":"https:\/\/doi.org\/10.1177\/1094342020923027","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,19]]}}}