{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:46:50Z","timestamp":1777675610025,"version":"3.51.4"},"reference-count":29,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2020,7,10]],"date-time":"2020-07-10T00:00:00Z","timestamp":1594339200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,11]]},"abstract":"<jats:p>With the acquisition and widespread use of more resources that rely on accelerator\/wide vector\u2013based computing, there has been a strong demand for science and engineering applications to take advantage of these latest assets. This, however, has been extremely challenging due to the diversity of systems to support their extreme concurrency, complex memory hierarchies, costly data movement, and heterogeneous node architectures. To address these challenges, we design a programming model and describe its ease of use in the development of a new MAGMA Templates library that delivers high-performance scalable linear algebra portable on current and emerging architectures. MAGMA Templates derives its performance and portability by (1) building on existing state-of-the-art linear algebra libraries, like MAGMA, SLATE, Trilinos, and vendor-optimized math libraries, and (2) providing access (seamlessly to the users) to the latest algorithms and architecture-specific optimizations through a single, easy-to-use C++-based API.<\/jats:p>","DOI":"10.1177\/1094342020938421","type":"journal-article","created":{"date-parts":[[2020,7,10]],"date-time":"2020-07-10T07:31:56Z","timestamp":1594366316000},"page":"645-658","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":12,"title":["MAGMA templates for scalable linear algebra on emerging architectures"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4988-4674","authenticated-orcid":false,"given":"Mohammed","family":"Al Farhan","sequence":"first","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ahmad","family":"Abdelfattah","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stanimire","family":"Tomov","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mark","family":"Gates","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dalal","family":"Sukkari","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Azzam","family":"Haidar","sequence":"additional","affiliation":[{"name":"Nvidia Corporation, Santa Clara, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"Rosenberg","sequence":"additional","affiliation":[{"name":"Naval Research Laboratory, Washington, DC, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jack","family":"Dongarra","sequence":"additional","affiliation":[{"name":"The University of Tennessee, Knoxville, TN, USA"},{"name":"Oak Ridge National Laboratory, Oak Ridge, TN, USA"},{"name":"University of Manchester, Manchester, England, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,7,10]]},"reference":[{"issue":"4","key":"bibr1-1094342020938421","first-page":"67","volume":"2","author":"Abalenkovs M","year":"2015","journal-title":"Supercomputing Frontiers and Innovations"},{"key":"bibr2-1094342020938421","unstructured":"Abdelfattah A, Anzt H, Bouteiller A, et al. (2017) Roadmap for the development of a linear algebra library for exascale computing: SLATE: software for linear algebra targeting Exascale. SLATE Working Notes 1, ICL-UT-17-02."},{"key":"bibr3-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1137\/18M1173599"},{"key":"bibr4-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-64203-1_40"},{"key":"bibr5-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/180\/1\/012037"},{"key":"bibr6-1094342020938421","unstructured":"Al Farhan MA (2019) Unstructured computations on emerging architectures. DOI:10.25781\/KAUST. Available at: http:\/\/hdl.handle.net\/10754\/644902 (accessed 2 September 2019)."},{"key":"bibr7-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2018.2826533"},{"key":"bibr8-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2016.06.001"},{"key":"bibr9-1094342020938421","unstructured":"Anzt H, Boman E, Dongarra J, et al. (2017) Magma-Sparse Interface Design Whitepaper. Technical Report ICL-UT-17-05."},{"key":"bibr10-1094342020938421","unstructured":"Anzt H, Tomov S, Dongarra J (2014) Implementing a Sparse Matrix Vector Product for the Sell-C\/Sell-C-\u03c3 Formats on NVIDIA GPUs. Technical Report UT-EECS-14-727."},{"key":"bibr12-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"bibr13-1094342020938421","doi-asserted-by":"crossref","unstructured":"Edwards HC, Trott CR, Sunderland D (2014) Kokkos: enabling manycore performance portability through polymorphic memory access patterns. Journal of Parallel and Distributed Computing 74(12): 3202\u20133216. Available at: http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0743731514001257 (accessed 2 September 2019). Domain-Specific Languages and High-Level Frameworks for High-Performance Computing.","DOI":"10.1016\/j.jpdc.2014.07.003"},{"key":"bibr14-1094342020938421","unstructured":"EIGEN (2018) Available at: https:\/\/eigen.tuxfamily.org\/dox-devel\/GettingStarted.html (accessed 2 September 2019)."},{"key":"bibr15-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356223"},{"key":"bibr16-1094342020938421","volume-title":"C++ API for BLAS and LAPACK","author":"Gates M","year":"2017"},{"key":"bibr17-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1177\/1094342015593156"},{"key":"bibr18-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00050"},{"key":"bibr19-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/3148226.3148237"},{"key":"bibr20-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/1089014.1089021"},{"key":"bibr21-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/3330345.3330356"},{"key":"bibr22-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/3337821.3337908"},{"key":"bibr23-1094342020938421","unstructured":"Kurzak J, Wu P, Gates M, et al. (2017) Designing SLATE: software for linear algebra targeting Exascale. SLATE Working Notes 3, ICL-UT-17-06."},{"key":"bibr25-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-012-0825-3"},{"key":"bibr26-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-01970-8_89"},{"key":"bibr27-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751209"},{"key":"bibr28-1094342020938421","unstructured":"Medina DS, St-Cyr A, Warburton T (2014) OCCA: a unified approach to multi-threading languages. arXiv preprint arXiv:1403.0968."},{"key":"bibr29-1094342020938421","unstructured":"RAJA (2016) Available at: https:\/\/raja.readthedocs.io\/en\/master\/getting_started.html (accessed 2 September 2019)."},{"key":"bibr31-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2009.12.005"},{"key":"bibr32-1094342020938421","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(00)00087-9"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020938421","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342020938421","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020938421","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:16:00Z","timestamp":1777450560000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342020938421"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,10]]},"references-count":29,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,11]]}},"alternative-id":["10.1177\/1094342020938421"],"URL":"https:\/\/doi.org\/10.1177\/1094342020938421","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,10]]}}}