{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:45:43Z","timestamp":1777675543432,"version":"3.51.4"},"reference-count":19,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2019,4,15]],"date-time":"2019-04-15T00:00:00Z","timestamp":1555286400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"name":"EU H2020","award":["671698"],"award-info":[{"award-number":["671698"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2019,9]]},"abstract":"<jats:p>We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific focus on memory characteristics and energy needs. ExaHyPE combines dynamically adaptive mesh refinement (AMR) with ADER-DG. It is parallelized using tasks, and it is cache efficient. AMR plus ADER-DG yields a task graph which is highly dynamic in nature and comprises both arithmetically expensive tasks and tasks which challenge the memory\u2019s latency. The expensive tasks and thus the whole code benefit from AVX vectorization, although we suffer from memory access bursts. A frequency reduction of the chip improves the code\u2019s energy-to-solution. Yet, it does not mitigate burst effects. The bursts\u2019 latency penalty becomes worse once we add Intel Optane technology, increase the core count significantly or make individual, computationally heavy tasks fall out of close caches. Thread overbooking to hide away these latency penalties becomes contra-productive with noninclusive caches as it destroys the cache and vectorization character. In cases where memory-intense and computationally expensive tasks overlap, ExaHyPE\u2019s cache-oblivious implementation nevertheless can exploit deep, noninclusive, heterogeneous memory effectively, as main memory misses arise infrequently and slow down only few cores. We thus propose that upcoming supercomputing simulation codes with dynamic, inhomogeneous task graphs are actively supported by thread runtimes in intermixing tasks of different compute character, and we propose that future hardware actively allows codes to downclock the cores running particular task types.<\/jats:p>","DOI":"10.1177\/1094342019842645","type":"journal-article","created":{"date-parts":[[2019,4,16]],"date-time":"2019-04-16T00:08:16Z","timestamp":1555373296000},"page":"973-986","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":12,"title":["Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver"],"prefix":"10.1177","volume":"33","author":[{"given":"Dominic E","family":"Charrier","sequence":"first","affiliation":[{"name":"Department of Computer Science, Durham University, Durham, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Benjamin","family":"Hazelwood","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Durham University, Durham, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ekaterina","family":"Tutlyaeva","sequence":"additional","affiliation":[{"name":"RSC Group, Moscow, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Bader","sequence":"additional","affiliation":[{"name":"Department of Informatics, Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Dumbser","sequence":"additional","affiliation":[{"name":"Dipartimento di Ingegneria Civile Ambientale e Meccanica, Universita degli Studi di Trento, Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrey","family":"Kudryavtsev","sequence":"additional","affiliation":[{"name":"Intel, Folsom, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Moskovsky","sequence":"additional","affiliation":[{"name":"RSC Group, Moscow, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6208-1841","authenticated-orcid":false,"given":"Tobias","family":"Weinzierl","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Durham University, Durham, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2019,4,15]]},"reference":[{"key":"bibr1-1094342019842645","unstructured":"Bader M, Dumbser M, Rezzolla L, et al. (2014\u20132019) ExaHyPE\u2014an exascale hyperbolic PDE solver engine. Available at: http:\/\/www.exahype.eu (accessed 2 April 2019)."},{"key":"bibr2-1094342019842645","unstructured":"Boyandin K (2018) Guest post: Intel Optane and in-memory databases. Available at: https:\/\/blog.selectel.com\/guest-post-intel-optane-and-in-memory-databases (accessed 29 August 2018)."},{"key":"bibr3-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-20119-1_25"},{"key":"bibr4-1094342019842645","unstructured":"Charrier D, Hazelwood B, Weinzierl T (2018) Enclave tasking for discontinuous Galerkin methods on dynamically adaptive meshes (arXiv:1806.07984)."},{"key":"bibr5-1094342019842645","unstructured":"Charrier D, Weinzierl T (2018) Stop talking to me\u2014a communication-avoiding ADER-DG realisation (arXiv:1801.08682)."},{"key":"bibr6-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1785\/0120000103"},{"key":"bibr7-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.07.001"},{"key":"bibr8-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1111\/j.1365-246X.2006.03120.x"},{"key":"bibr9-1094342019842645","unstructured":"Glesser D (2016) Road to exascale: improving scheduling performances and reducing energy consumption with the help of end-users. PhD Thesis, Grenoble Alpes."},{"key":"bibr10-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-41321-1_23"},{"key":"bibr11-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-36574-5_10"},{"key":"bibr12-1094342019842645","unstructured":"Kudryavtsev A (2018) Optane and Intel memory drive technology, big surprise. Available at: https:\/\/itpeernetwork.intel.com\/optane-intel-memory-drive-technology (accessed 10 December 2018)."},{"key":"bibr13-1094342019842645","first-page":"19","volume-title":"IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter","author":"McCalpin J","year":"1995"},{"key":"bibr14-1094342019842645","unstructured":"Microway (2018) Detailed specifications of the Intel Xeon E5-2600v4 Broadwell-EP processors. Available at: https:\/\/www.microway.com\/knowledge-center-articles\/detailed-specifications-of-the-intel-xeon-e5-2600v4-broadwell-ep-processors (accessed 14 December 2018)."},{"key":"bibr15-1094342019842645","unstructured":"Reinders J (2007) Intel threading building blocks. O\u2019Reilly. The SPICE Code Validation (2006) Problem wp2_loh1. Available at: http:\/\/www.sismowine.org\/model\/WP2_LOH1.pdf (accessed 2 April 2019)."},{"key":"bibr16-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1109\/ICPPW.2010.38"},{"key":"bibr17-1094342019842645","author":"Weinzierl T","year":"2018","journal-title":"ACM Transactions on Mathematical Software"},{"key":"bibr18-1094342019842645","doi-asserted-by":"publisher","DOI":"10.1137\/100799071"},{"key":"bibr19-1094342019842645","unstructured":"WikiChip (2018) Intel Xeon Gold 6150. Available at: https:\/\/en.wikichip.org\/wiki\/intel\/xeon_gold\/6150 (accessed 10 December 2018)."}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342019842645","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342019842645","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342019842645","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:15:48Z","timestamp":1777450548000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342019842645"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,15]]},"references-count":19,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2019,9]]}},"alternative-id":["10.1177\/1094342019842645"],"URL":"https:\/\/doi.org\/10.1177\/1094342019842645","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,4,15]]}}}