{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,20]],"date-time":"2025-02-20T05:17:01Z","timestamp":1740028621459,"version":"3.37.3"},"reference-count":0,"publisher":"IOS Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016]]},"abstract":"<jats:p>Accelerators such as NVIDIA GPUs and Intel MICs are currently provided as co-processor devices, usable only through a CPU host. For Intel MICs it is planned that this constraint will be lifted in the near future: CPU and accelerator(s) will then form a single, many-core, processor capable of peak performance of several Teraflops with high energy efficiency. In order to exploit the available computational power, the user will be compelled to write a code more &amp;ldquo;hardware-aware&amp;rdquo;, in contrast to the common philosophy of hiding hardware details as much as possible. The simple two-sided communication approach often used in message-passing applications introduces synchronization costs that may limit the performance on the next generation machines. PGAS languages, like coarray Fortran and UPC, propose a one-sided approach where a process accesses directly the remote memory of another process without interrupting its execution. In this paper, we propose a CUDA-aware coarray implementation, capable of merging the expressive syntax of coarrays with the computational power of GPUs. We propose a new keyword for the Fortran language, which allows the user to map with a high-level syntax some hardware features of the many-core machines. Our hybrid coarray implementation is based on OpenCoarrays, the coarray transport layer currently adopted by the GNU Fortran compiler.<\/jats:p>","DOI":"10.3233\/978-1-61499-621-7-175","type":"book-chapter","created":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T15:30:51Z","timestamp":1739979051000},"source":"Crossref","is-referenced-by-count":0,"title":["Hybrid Coarrays: a PGAS Feature for Many-Core Architectures"],"prefix":"10.3233","author":[{"family":"Cardellini Valeria","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Fanfarillo Alessandro","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Filippone Salvatore","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"family":"Rouson Damian","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"7437","container-title":["Advances in Parallel Computing","Parallel Computing: On the Road to Exascale"],"original-title":[],"deposited":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T15:42:31Z","timestamp":1739979751000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.medra.org\/servlet\/aliasResolver?alias=iospressISBN&isbn=978-1-61499-620-0&spage=175&doi=10.3233\/978-1-61499-621-7-175"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016]]},"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/978-1-61499-621-7-175","relation":{},"ISSN":["0927-5452"],"issn-type":[{"value":"0927-5452","type":"print"}],"subject":[],"published":{"date-parts":[[2016]]}}}