{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T07:27:42Z","timestamp":1768030062216,"version":"3.49.0"},"reference-count":28,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2013,8,30]],"date-time":"2013-08-30T00:00:00Z","timestamp":1377820800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2014,5]]},"abstract":"<jats:p> The adoption of hybrid CPU\u2013GPU nodes in traditional supercomputing platforms such as the Cray-XK6 opens acceleration opportunities for electronic structure calculations in materials science and chemistry applications, where medium-sized generalized eigenvalue problems must be solved many times. These eigenvalue problems are too small to effectively solve on distributed systems, but can benefit from the massive computing power concentrated on a single-node, hybrid CPU\u2013GPU system. However, hybrid systems call for the development of new algorithms that efficiently exploit heterogeneity and massive parallelism of not just GPUs, but of multicore\/manycore CPUs as well. Addressing these demands, we developed a generalized eigensolver featuring novel algorithms of increased computational intensity (compared with the standard algorithms), decomposition of the computation into fine-grained memory aware tasks, and their hybrid execution. The resulting eigensolvers are state-of-the-art in high-performance computing, significantly outperforming existing libraries. We describe the algorithm and analyze its performance impact on applications of interest when different fractions of eigenvectors are needed by the host electronic structure code. <\/jats:p>","DOI":"10.1177\/1094342013502097","type":"journal-article","created":{"date-parts":[[2013,8,31]],"date-time":"2013-08-31T01:26:44Z","timestamp":1377912404000},"page":"196-209","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":23,"title":["A novel hybrid CPU\u2013GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks"],"prefix":"10.1177","volume":"28","author":[{"given":"Azzam","family":"Haidar","sequence":"first","affiliation":[{"name":"Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA"}]},{"given":"Stanimire","family":"Tomov","sequence":"additional","affiliation":[{"name":"Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA"}]},{"given":"Jack","family":"Dongarra","sequence":"additional","affiliation":[{"name":"Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA"},{"name":"Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA"},{"name":"School of Mathematics and School of Computer Science, University of Manchester, Manchester, UK"}]},{"given":"Raffaele","family":"Solc\u00e0","sequence":"additional","affiliation":[{"name":"Institut for Theoretical Physics, ETH Zurich, Switzerland"}]},{"given":"Thomas","family":"Schulthess","sequence":"additional","affiliation":[{"name":"Swiss National Supercomputer Center, Lugano, Switzerland"}]}],"member":"179","published-online":{"date-parts":[[2013,8,30]]},"reference":[{"key":"bibr1-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1007\/BF01931804"},{"key":"bibr2-1094342013502097","unstructured":"Anderson E, Bai Z, Bischof C, (1992) LAPACK Users\u2019 Guide. Philadelphia, PA: Society for Industrial and Applied Mathematics. Available at: http:\/\/www.netlib.org\/lapack\/lug\/."},{"key":"bibr3-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719604"},{"key":"bibr4-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2011.05.002"},{"key":"bibr5-1094342013502097","volume-title":"Communication-optimal parallel and sequential eigenvalue and singular value algorithms","author":"Ballard G","year":"2011"},{"key":"bibr6-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-14390-8_40"},{"key":"bibr7-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1145\/365723.365736"},{"key":"bibr8-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1137\/0908009"},{"key":"bibr9-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719642"},{"key":"bibr10-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611971446"},{"key":"bibr11-1094342013502097","volume-title":"Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide","author":"Demmel J","year":"2000"},{"key":"bibr12-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1016\/0377-0427(89)90367-1"},{"key":"bibr13-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1007\/10703040_3"},{"key":"bibr14-1094342013502097","volume-title":"Matrix Computations","author":"Golub GH","year":"1989","edition":"2"},{"key":"bibr15-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1145\/44128.44130"},{"key":"bibr16-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063394"},{"key":"bibr17-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1007\/s10543-008-0180-1"},{"key":"bibr18-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2011.05.001"},{"key":"bibr19-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/125\/1\/012058"},{"key":"bibr20-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(99)00021-6"},{"key":"bibr21-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2009.79"},{"issue":"3","key":"bibr22-1094342013502097","first-page":"16","volume":"39","author":"Ltaief H","year":"2011","journal-title":"ACM Transactions on Mathematical Software"},{"key":"bibr23-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.91"},{"key":"bibr24-1094342013502097","volume-title":"The Symmetric Eigenvalue Problem","author":"Parlett BN","year":"1980"},{"key":"bibr25-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2312-0"},{"key":"bibr26-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2010.06.001"},{"key":"bibr27-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1145\/79173.79181"},{"key":"bibr28-1094342013502097","doi-asserted-by":"publisher","DOI":"10.1137\/100806783"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342013502097","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342013502097","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342013502097","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T15:52:59Z","timestamp":1741103579000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342013502097"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8,30]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2014,5]]}},"alternative-id":["10.1177\/1094342013502097"],"URL":"https:\/\/doi.org\/10.1177\/1094342013502097","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,8,30]]}}}