{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,13]],"date-time":"2025-09-13T16:08:51Z","timestamp":1757779731433,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T00:00:00Z","timestamp":1628467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,9]]},"DOI":"10.1145\/3472456.3472484","type":"proceedings-article","created":{"date-parts":[[2021,10,5]],"date-time":"2021-10-05T18:39:57Z","timestamp":1633459197000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Tridiagonal GPU Solver with Scaled Partial Pivoting at Maximum Bandwidth"],"prefix":"10.1145","author":[{"given":"Christoph","family":"Klein","sequence":"first","affiliation":[{"name":"University of Heidelberg, ZITI, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"Strzodka","sequence":"additional","affiliation":[{"name":"University of Heidelberg, ZITI, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,5]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/SCCC.2011.29"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2017.10.003"},{"key":"e_1_3_2_1_3_1","volume-title":"Optimizing Krylov Subspace Solvers on Graphics Processing Units. In Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS","author":"Anzt Hartwig","year":"2014","unstructured":"Hartwig Anzt , William Sawyer , Stanimire Tomov , Piotr Luszczek , Ichitaro Yamazaki , and Jack Dongarra . 
2014. Optimizing Krylov Subspace Solvers on Graphics Processing Units. In Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014. IEEE, IEEE, Phoenix, AZ."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Daniele Bertaccini and Fabio Durastante. 2018. Iterative Methods and Preconditioning for Large and Sparse Linear Systems with Applications. Vol.\u00a0369. arxiv:arXiv:1011.1669v3","DOI":"10.1201\/9781315153575"},{"key":"e_1_3_2_1_6_1","volume-title":"IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings(2011)","author":"Chang Li\u00a0Wen","year":"2011","unstructured":"Li\u00a0Wen Chang, Men\u00a0Tzung Lo, Nasser Anssari, Ke\u00a0Hsin Hsu, Norden\u00a0E. Huang, and Wen Mei\u00a0W. Hwu. 2011. Parallel implementation of multi-dimensional ensemble empirical mode decomposition. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings(2011), 1621\u20131624. 
https:\/\/doi.org\/10.1109\/ICASSP.2011.5946808"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/2477183.2478156"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.92"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"e_1_3_2_1_10_1","volume-title":"Proceedings - 22nd IEEE International Conference on High Performance Computing, HiPC 2015 (2016","author":"Dieguez Adrian\u00a0Perez","year":"2016","unstructured":"Adrian\u00a0Perez Dieguez, Margarita Amor, and Ramon Doallo. 2016. New Tridiagonal Systems Solvers on GPU Architectures. Proceedings - 22nd IEEE International Conference on High Performance Computing, HiPC 2015 (2016), 85\u201393. https:\/\/doi.org\/10.1109\/HiPC.2015.17"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-018-2676-z"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3328731"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2017.2723879"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/PDP2018.2018.00123"},{"key":"e_1_3_2_1_15_1","first-page":"1","article-title":"Generalized diagonal pivoting methods for tridiagonal systems without interchanges","volume":"40","author":"Erway B.","year":"2010","unstructured":"Jennifer\u00a0B. Erway, Roummel\u00a0F. Marcia, and Joseph\u00a0A. Tyson. 2010. 
Generalized diagonal pivoting methods for tridiagonal systems without interchanges. IAENG International Journal of Applied Mathematics 40, 4(2010), 1\u20137.","journal-title":"IAENG International Journal of Applied Mathematics"},{"key":"e_1_3_2_1_16_1","volume-title":"Networking, Storage and Analysis","author":"Giles Mike","year":"2014","unstructured":"Mike Giles, Endre Laszlo, Istvan Reguly, Jeremy Appleyard, and Julien Demouth. 2014. GPU Implementation of Finite Difference Solvers. Proceedings of WHPCF 2014: 7th Workshop on High Performance Computational Finance - Held in conjunction with SC 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis (2014), 1\u20138. https:\/\/doi.org\/10.1109\/WHPCF.2014.10"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2010.61"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Anne Greenbaum. 1997. Iterative methods for solving linear systems.","DOI":"10.1137\/1.9781611970937"},{"key":"e_1_3_2_1_19_1","unstructured":"Ga\u00ebl Guennebaud Beno\u00eet Jacob and Others. 2010. Eigen v3. http:\/\/eigen.tuxfamily.org."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ocemod.2003.10.002"},{"key":"e_1_3_2_1_21_1","volume-title":"A Fast Direct Solution of Poisson\u2019s Equation Using Fourier Analysis. 
12, 1","author":"Hockney R.W.","year":"1965","unstructured":"R.W. Hockney. 1965. A Fast Direct Solution of Poisson\u2019s Equation Using Fourier Analysis. 12, 1 (1965), 95\u2013113."},{"volume-title":"Parallel computers : architecture, programming and algorithms","author":"Hockney R.W.","key":"e_1_3_2_1_22_1","unstructured":"R.W. Hockney and C.\u00a0R. Jesshope. 1981. Parallel computers : architecture, programming and algorithms. Bristol."},{"key":"e_1_3_2_1_23_1","first-page":"1","article-title":"Interactive Depth of Field Using Simulated Diffusion on a GPU","volume":"2006","author":"Kass Michael","year":"2006","unstructured":"Michael Kass, Aaron Lefohn, and John Owens. 2006. Interactive Depth of Field Using Simulated Diffusion on a GPU. ComputingJanuary 2006(2006), 1\u20138.","journal-title":"ComputingJanuary"},{"key":"e_1_3_2_1_24_1","volume-title":"stable fluid dynamics for computer graphics. 24, 4","author":"Kass Michael","year":"1990","unstructured":"Michael Kass and Gavin Miller. 1990. Rapid, stable fluid dynamics for computer graphics. 24, 4 (1990), 49\u201357. 
https:\/\/doi.org\/10.1145\/97879.97884"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2011.41"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830568"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2017.07.014"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.2000.1666"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Yousef Saad. 2003. Iterative Methods for Sparse Linear Systems. Iterative Methods for Sparse Linear Systems(2003). https:\/\/doi.org\/10.1137\/1.9780898718003","DOI":"10.1137\/1.9780898718003"},{"key":"e_1_3_2_1_30_1","volume-title":"Parallel Thomas approach development for solving tridiagonal systems in GPU programming-Steady and unsteady flow simulation. Mechanics and Industry 21, 3","author":"Souri Milad","year":"2020","unstructured":"Milad Souri, Pooria Akbarzadeh, and Hossein Mahmoodi Darian. 2020. Parallel Thomas approach development for solving tridiagonal systems in GPU programming-Steady and unsteady flow simulation. Mechanics and Industry 21, 3 (2020). https:\/\/doi.org\/10.1051\/meca\/2020013"},{"key":"e_1_3_2_1_31_1","volume-title":"Elliptic problems in linear difference equations over a network. Watson Sci. Comput. Lab. Rept","author":"Thomas Llewellyn\u00a0Hilleth","year":"1949","unstructured":"
Llewellyn\u00a0Hilleth Thomas. 1949. Elliptic problems in linear difference equations over a network. Watson Sci. Comput. Lab. Rept., Columbia University, New York 1 (1949)."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2015.03.008"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/355945.355947"},{"key":"e_1_3_2_1_34_1","volume-title":"Proceedings of the 3rd IEEE Symposium on Parallel and Distributed Processing 1991 (1991","author":"Wang Xiaojing","year":"1991","unstructured":"Xiaojing Wang and Z.\u00a0G. Mou. 1991. A divide-and-conquer method of solving tridiagonal systems on hypercube massively parallel computers. Proceedings of the 3rd IEEE Symposium on Parallel and Distributed Processing 1991 (1991), 810\u2013817. https:\/\/doi.org\/10.1109\/SPDP.1991.218237"},{"key":"e_1_3_2_1_35_1","volume-title":"Networking, Storage and Analysis","author":"Wang Xinliang","year":"2014","unstructured":"Xinliang Wang, Yangtong Xu, and Wei Xue. 2014. A Hierarchical Tridiagonal System Solver for Heterogenous Supercomputers. 
Proceedings of ScalA 2014: 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - held in conjunction with SC 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis (2014), 69\u201376. https:\/\/doi.org\/10.1109\/ScalA.2014.12"}],"event":{"name":"ICPP 2021: 50th International Conference on Parallel Processing","acronym":"ICPP 2021","location":"Lemont IL USA"},"container-title":["50th International Conference on Parallel Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472484","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472456.3472484","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:11Z","timestamp":1750193291000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472484"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,9]]},"references-count":34,"alternative-id":["10.1145\/3472456.3472484","10.1145\/3472456"],"URL":"https:\/\/doi.org\/10.1145\/3472456.3472484","relation":{},"subject":[],"published":{"date-parts":[[2021,8,9]]},"assertion":[{"value":"2021-10-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}