{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,2]],"date-time":"2026-07-02T23:40:59Z","timestamp":1783035659964,"version":"3.54.6"},"reference-count":39,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2021,6,11]],"date-time":"2021-06-11T00:00:00Z","timestamp":1623369600000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.sagepub.com\/licence-information-for-chorus"}],"funder":[{"DOI":"10.13039\/100006168","name":"National Nuclear Security Administration","doi-asserted-by":"publisher","award":["DE-NA0002374"],"award-info":[{"award-number":["DE-NA0002374"]}],"id":[{"id":"10.13039\/100006168","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:p>Algebraic multigrid (AMG) is often viewed as a scalable [Formula: see text] solver for sparse linear systems. Yet, AMG lacks parallel scalability due to increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI), particularly in the MPI-everywhere approach, to arrange inter-process communication, so that communication is transported regardless of the location of the send and receive processes. Performance tests show notable differences in the cost of intra- and internode communication, motivating a restructuring of communication. In this case, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of internode messages. 
Node-centric communication extends to the range of components in both the setup and solve phase of AMG, yielding an increase in the weak and strong scaling of the entire method.<\/jats:p>","DOI":"10.1177\/1094342020925535","type":"journal-article","created":{"date-parts":[[2020,6,11]],"date-time":"2020-06-11T04:18:59Z","timestamp":1591849139000},"page":"547-561","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":24,"title":["Reducing communication in algebraic multigrid with multi-step node aware communication"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8891-934X","authenticated-orcid":false,"given":"Amanda","family":"Bienz","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"William D","family":"Gropp","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Luke N","family":"Olson","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana\u2013Champaign, Urbana, Illinois, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"179","published-online":{"date-parts":[[2020,6,11]]},"reference":[{"key":"bibr1-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2004.62"},{"key":"bibr2-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/3093172.3093230"},{"key":"bibr3-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.35"},{"key":"bibr4-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-19328-6_12"},{"key":"bibr5-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2008.4536348"},{"key":"bibr6-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.75"},{"key":"bibr7-1094342020925535","unstructured":"Bienz A, Olson LN (2017) RAPtor: parallel algebraic multigrid v0.1. 
Available at: https:\/\/github.com\/lukeolson\/raptor. Release 0.1 (accessed February 2019)."},{"key":"bibr8-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/15M1026341"},{"key":"bibr9-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/3236367.3236368"},{"key":"bibr10-1094342020925535","doi-asserted-by":"crossref","unstructured":"Bienz A, Gropp WD, Olson LN (2019) Node aware sparse matrix-vector multiplication. Journal of Parallel and Distributed Computing 130: 166\u2013178. DOI:10.1016\/j.jpdc.2019.03.016.","DOI":"10.1016\/j.jpdc.2019.03.016"},{"key":"bibr11-1094342020925535","doi-asserted-by":"crossref","unstructured":"Bode B, Butler M, Dunning T, et al. (2013) The Blue Waters super-system for super-science. In: Vetter JS (ed) Contemporary High Performance Computing: From Petascale Toward Exascale, CRC Computational Science Series, Vol. 1. 1st ed. Boca Raton: Taylor and Francis, pp. 339\u2013366. ISBN 9781466568341.","DOI":"10.1201\/9781351104005-13"},{"key":"bibr12-1094342020925535","first-page":"257","volume-title":"Sparsity and Its Applications","author":"Brandt A","year":"1984"},{"key":"bibr13-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/71.780863"},{"key":"bibr14-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/080737770"},{"key":"bibr15-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/130931539"},{"key":"bibr16-1094342020925535","doi-asserted-by":"crossref","unstructured":"Gropp W, Lusk E, Doss N, et al. (1996) A high-performance, portable implementation of the MPI message passing interface standard. Parallel Computing 22(6): 789\u2013828. 
DOI:10.1016\/0167-8191(96)00024-5.","DOI":"10.1016\/0167-8191(96)00024-5"},{"key":"bibr17-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/2966884.2966919"},{"key":"bibr18-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(00)00048-X"},{"key":"bibr19-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1016\/S0168-9274(01)00115-5"},{"key":"bibr20-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5160935"},{"key":"bibr21-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2000.846009"},{"key":"bibr22-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/329366.301116"},{"key":"bibr23-1094342020925535","unstructured":"Lawrence Livermore National Laboratory (LLNL) (2008) HYPRE: high performance preconditioners. Available at: http:\/\/www.llnl.gov\/CASC\/hypre\/ (accessed February 2019)."},{"key":"bibr24-1094342020925535","unstructured":"Lawrence Livermore National Laboratory (LLNL) (2010) MFEM: modular finite element methods. Available at: mfem.org (accessed February 2019)."},{"key":"bibr25-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/0719067"},{"key":"bibr26-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/HiPC.2017.00047"},{"key":"bibr27-1094342020925535","unstructured":"National Center for Supercomputing Applications (2012) Blue Waters. 
Available at: https:\/\/bluewaters.ncsa.illinois.edu\/."},{"key":"bibr28-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2004.05.003"},{"key":"bibr29-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611971057.ch4"},{"key":"bibr30-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/2145816.2145823"},{"key":"bibr31-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063487"},{"key":"bibr32-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1002\/nla.559"},{"key":"bibr33-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/040615729"},{"key":"bibr34-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/140952570"},{"key":"bibr35-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2000.10008"},{"key":"bibr36-1094342020925535","doi-asserted-by":"crossref","unstructured":"Vassilevski PS, Yang UM (2014) Reducing communication in algebraic multigrid using additive variants. Numerical Linear Algebra with Applications 21(2): 275\u2013296. 
DOI: 10.1002\/nla.1928.","DOI":"10.1002\/nla.1928"},{"key":"bibr37-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1137\/S0036144502409019"},{"key":"bibr38-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2014.30"},{"key":"bibr39-1094342020925535","doi-asserted-by":"publisher","DOI":"10.1002\/nla.689"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020925535","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342020925535","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020925535","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020925535","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:15:58Z","timestamp":1777450558000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342020925535"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,11]]},"references-count":39,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["10.1177\/1094342020925535"],"URL":"https:\/\/doi.org\/10.1177\/1094342020925535","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,11]]}}}