{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T14:48:13Z","timestamp":1774622893273,"version":"3.50.1"},"reference-count":14,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2019,11,29]],"date-time":"2019-11-29T00:00:00Z","timestamp":1574985600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/100010661","name":"Horizon 2020 Framework Programme","doi-asserted-by":"publisher","award":["671633"],"award-info":[{"award-number":["671633"]}],"id":[{"id":"10.13039\/100010661","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,5]]},"abstract":"<jats:p> We describe the parallelization of the solve phase in the sparse Cholesky solver SpLLT when using a sequential task flow model. In the context of direct methods, the solution of a sparse linear system is achieved through three main phases: the analyse, the factorization and the solve phases. In the last two phases, which involve numerical computation, the factorization corresponds to the most computationally costly phase, and it is therefore crucial to parallelize this phase in order to reduce the time-to-solution on modern architectures. As a consequence, the solve phase is often not as optimized as the factorization in state-of-the-art solvers, and opportunities for parallelism are often not exploited in this phase. However, in some applications, the time spent in the solve phase is comparable to or even greater than the time for the factorization, and the user could dramatically benefit from a faster solve routine. This is the case, for example, for a conjugate gradient (CG) solver using a block Jacobi preconditioner. The diagonal blocks are factorized once only, but their factors are used to solve subsystems at each CG iteration. In this study, we design and implement a parallel version of a task-based solve routine for an OpenMP version of the SpLLT solver. We show that we can obtain good scalability on a multicore architecture enabling a dramatic reduction of the overall time-to-solution in some applications. <\/jats:p>","DOI":"10.1177\/1094342019888567","type":"journal-article","created":{"date-parts":[[2019,11,29]],"date-time":"2019-11-29T10:49:41Z","timestamp":1575024581000},"page":"340-356","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["Parallelization of the solve phase in a task-based Cholesky solver using a sequential task flow model"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3740-8985","authenticated-orcid":false,"given":"S\u00e9bastien","family":"Cayrols","sequence":"first","affiliation":[]},{"given":"Iain S","family":"Duff","sequence":"additional","affiliation":[]},{"given":"Florent","family":"Lopez","sequence":"additional","affiliation":[{"name":"Scientific Computing Department, STFC Rutherford Appleton Laboratory, Harwell Campus, Oxfordshire, UK"}]}],"member":"179","published-online":{"date-parts":[[2019,11,29]]},"reference":[{"key":"bibr1-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1137\/S0895479894278952"},{"key":"bibr2-1094342019888567","doi-asserted-by":"crossref","unstructured":"Augonnet C, Thibault S, Namyst R, et al. (2011) StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009 23: 187\u2013198. DOI:10.1002\/cpe.1631. Available at: http:\/\/hal.inria.fr\/inria-00550877 (accessed 12 November 2019).","DOI":"10.1002\/cpe.1631"},{"key":"bibr3-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1145\/567806.567807"},{"key":"bibr4-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2013.98"},{"key":"bibr5-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28145-7_23"},{"key":"bibr6-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"bibr7-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-78024-5_18"},{"key":"bibr8-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780198508380.001.0001"},{"key":"bibr9-1094342019888567","first-page":"235","volume":"8","author":"Duff IS","year":"2018","journal-title":"Numerical Algebra, Control and Optimization"},{"key":"bibr10-1094342019888567","unstructured":"Grigori L, Tissot O (2017) Reducing the Communication and Computational Costs of Enlarged Krylov Subspaces Conjugate Gradient: Research Report RR-9023. Paris: Inria. Available at: https:\/\/hal.inria.fr\/hal-01451199v2\/document (accessed 12 November 2019)."},{"key":"bibr11-1094342019888567","doi-asserted-by":"crossref","unstructured":"H\u00e9non P, Ramet P, Roman J (2002) PaStiX: a high-performance parallel direct solver for sparse symmetric definite systems. Parallel Computing 28(2): 301\u2013321. Available at: https:\/\/hal.inria.fr\/inria-00346017 (accessed 12 November 2019).","DOI":"10.1016\/S0167-8191(01)00141-7"},{"key":"bibr12-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1137\/090757216"},{"key":"bibr13-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1997.1404"},{"key":"bibr14-1094342019888567","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(01)00135-1"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342019888567","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342019888567","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342019888567","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,28]],"date-time":"2025-02-28T14:34:18Z","timestamp":1740753258000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342019888567"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,29]]},"references-count":14,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,5]]}},"alternative-id":["10.1177\/1094342019888567"],"URL":"https:\/\/doi.org\/10.1177\/1094342019888567","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,11,29]]}}}