{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T16:55:44Z","timestamp":1770742544283,"version":"3.49.0"},"reference-count":102,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2023,3,29]],"date-time":"2023-03-29T00:00:00Z","timestamp":1680048000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Bayerisches Kompetenznetzwerk f\u00fcr Technisch-Wissenschaftliches Hoch- und H\u00f6chstleistungsrechnen"},{"name":"Performance tuning of high-order discontinuous Galerkin solvers for SuperMUC-NG"},{"name":"National Science Foundation","award":["DMS-2028346, OAC-2015848, EAR-1925575"],"award-info":[{"award-number":["DMS-2028346, OAC-2015848, EAR-1925575"]}]},{"name":"Computational Infrastructure in Geodynamics initiative","award":["EAR-0949446 and EAR-1550901"],"award-info":[{"award-number":["EAR-0949446 and EAR-1550901"]}]},{"name":"The University of California \u2013 Davis, and by Technical Data Analysis, Inc.","award":["N68335-18-C-0011"],"award-info":[{"award-number":["N68335-18-C-0011"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2023,3,31]]},"abstract":"<jats:p>\n            This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening (both\n            <jats:italic>h<\/jats:italic>\n            -multigrid), and polynomial global coarsening (a variant of\n            <jats:italic>p<\/jats:italic>\n            -multigrid). We have integrated the algorithms into the same framework\u2014the open source finite-element library\n            <jats:monospace>deal.II<\/jats:monospace>\n            \u2014, which allows us to make fair comparisons regarding their implementation complexity, computational efficiency, and parallel scalability as well as to compare the measurements with theoretically derived performance metrics. Serial simulations and parallel weak and strong scaling on up to 147,456 CPU cores on 3,072 compute nodes are presented. The results obtained indicate that global-coarsening algorithms show a better parallel behavior for comparable smoothers due to the better load balance, particularly on the expensive fine levels. In the serial case, the costs of applying hanging-node constraints might be significant, leading to advantages of local smoothing, even though the number of solver iterations needed is slightly higher. When using\n            <jats:italic>p<\/jats:italic>\n            - and\n            <jats:italic>h<\/jats:italic>\n            -multigrid in sequence (\n            <jats:italic>hp<\/jats:italic>\n            -multigrid), the results indicate that it makes sense to decrease the degree of the elements first from a performance point of view due to the cheaper transfer.\n          <\/jats:p>","DOI":"10.1145\/3580314","type":"journal-article","created":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T12:03:32Z","timestamp":1674561812000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Efficient Distributed Matrix-free Multigrid Methods on Locally Refined Meshes for FEM Computations"],"prefix":"10.1145","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2368-8533","authenticated-orcid":false,"given":"Peter","family":"Munch","sequence":"first","affiliation":[{"name":"Institute of Mathematics, University of Augsburg, Germany and Institute of Material Systems Modeling, Helmholtz-Zentrum Hereon, Germany and Institute for Computational Mechanics, Technical University of Munich, M\u00fcnchen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8137-3903","authenticated-orcid":false,"given":"Timo","family":"Heister","sequence":"additional","affiliation":[{"name":"Mathematical and Statistical Sciences, Clemson University, Clemson, South Carolina, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9086-2058","authenticated-orcid":false,"given":"Laura","family":"Prieto Saavedra","sequence":"additional","affiliation":[{"name":"Department of Chemical Engineering, Polytechnique Montr\u00e9al, Montreal, QC, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8406-835X","authenticated-orcid":false,"given":"Martin","family":"Kronbichler","sequence":"additional","affiliation":[{"name":"Institute of Mathematics, University of Augsburg and Department of Information Technology, Uppsala University, Uppsala, Sweden"}]}],"member":"320","published-online":{"date-parts":[[2023,3,29]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1002\/nme.506"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0021-9991(03)00194-3"},{"key":"e_1_3_4_4_2","article-title":"Chombo Software Package for AMR Applications Design Document","author":"Adams Mark","year":"2014","unstructured":"Mark Adams, Phillip Colella, Daniel T. Graves, Jeff N. Johnson, Noel D. Keen, Terry J. Ligocki, Daniel F. Martin, Peter W. McCorquodale, David Modiano, Peter O. Schwartz, T.D. Sternberg, and Brian Van Straalen. 2014. Chombo Software Package for AMR Applications Design Document. Lawrence Berkeley National Laboratory Technical Report LBNL-6616E (2014).","journal-title":"Lawrence Berkeley National Laboratory Technical Report LBNL-6616E"},{"key":"e_1_3_4_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2020.06.009"},{"key":"e_1_3_4_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10915-016-0259-9"},{"key":"e_1_3_4_7_2","doi-asserted-by":"publisher","DOI":"10.1515\/jnma-2020-0043"},{"key":"e_1_3_4_8_2","doi-asserted-by":"publisher","DOI":"10.1515\/jnma-2021-0081"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2020.02.022"},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.1515\/jnma-2022-0054"},{"key":"e_1_3_4_11_2","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/978-3-030-47956-5_8","volume-title":"Software for Exascale Computing\u2014SPPEXA 2016\u20132019","author":"Arndt Daniel","year":"2020","unstructured":"Daniel Arndt, Niklas Fehn, Guido Kanschat, Katharina Kormann, Martin Kronbichler, Peter Munch, Wolfgang A. Wall, and Julius Witte. 2020. ExaDG: High-order discontinuous Galerkin for the exa-scale. In Software for Exascale Computing\u2014SPPEXA 2016\u20132019, Hans-Joachim Bungartz, Severin Reiz, Benjamin Uekermann, Philipp Neumann, and Wolfgang E. Nagel (Eds.). Springer International Publishing, Cham, 189\u2013224."},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-97-00826-0"},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2005-5110"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2018.05.003"},{"key":"e_1_3_4_15_2","doi-asserted-by":"publisher","DOI":"10.1137\/18M1175409"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/2049673.2049678"},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/1486525.1486529"},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1002\/fld.1917"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-59334-5_27"},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2006.116"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.1002\/1099-1506(200009)7:6<363::AID-NLA202>3.0.CO;2-V"},{"key":"e_1_3_4_22_2","volume-title":"Reactive Flows, Diffusion and Transport","author":"Becker Roland","year":"2007","unstructured":"Roland Becker, Malte Braack, and Thomas Richter. 2007. Parallel multigrid on locally refined meshes. In Reactive Flows, Diffusion and Transport, Willi J\u00e4ger, Rolf Rannacher, and J\u00fcrgen Warnatz (Eds.). Springer, Berlin, 77\u201392."},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1002\/num.1013"},{"key":"e_1_3_4_24_2","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-1991-1052086-4"},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-1977-0431719-X"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10915-010-9396-8"},{"key":"e_1_3_4_27_2","doi-asserted-by":"publisher","DOI":"10.1137\/100791634"},{"key":"e_1_3_4_28_2","doi-asserted-by":"publisher","DOI":"10.1002\/nla.2375"},{"key":"e_1_3_4_29_2","doi-asserted-by":"publisher","DOI":"10.1002\/nla.2375"},{"key":"e_1_3_4_30_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2004-436"},{"key":"e_1_3_4_31_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511546792"},{"key":"e_1_3_4_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-31619-1_8"},{"key":"e_1_3_4_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2020.109538"},{"key":"e_1_3_4_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2005.01.005"},{"key":"e_1_3_4_35_2","volume-title":"ML 5.0 Smoothed Aggregation User\u2019s Guide","author":"Gee Michael W.","year":"2006","unstructured":"Michael W. Gee, Christopher M. Siefert, Jonathan J. Hu, Ray S. Tuminaro, and Marzio G. Sala. 2006. ML 5.0 Smoothed Aggregation User\u2019s Guide. Technical Report SAND2006-2649, Sandia National Laboratories, Albuquerque."},{"key":"e_1_3_4_36_2","doi-asserted-by":"publisher","DOI":"10.1002\/fld.3888"},{"key":"e_1_3_4_37_2","doi-asserted-by":"publisher","DOI":"10.1137\/15M1010798"},{"key":"e_1_3_4_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0898-1221(98)00136-9"},{"key":"e_1_3_4_39_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0898-1221(00)00091-2"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.1093\/gji\/ggx195"},{"key":"e_1_3_4_41_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.15497"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.31163"},{"key":"e_1_3_4_43_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.31163"},{"key":"e_1_3_4_44_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2016-3494"},{"key":"e_1_3_4_45_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2003-3989"},{"key":"e_1_3_4_46_2","doi-asserted-by":"crossref","unstructured":"Magnus Rudolph Hestenes and Eduard Stiefel. 1952. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Standards 49 6 (1952) 409\u2013436.","DOI":"10.6028\/jres.049.044"},{"key":"e_1_3_4_47_2","unstructured":"Koen Hillewaert Jean-Fran\u00e7ois Remacle Nicolas Cheveaugeon Paul-Emile Bernard and Philippe Geuzaine. 2006. Analysis of a hybrid p-multigrid method for the discontinuous Galerkin discretisation of the Euler equations. In Proceedings of the European Conference on Computational Fluid Dynamics Pieter Wesseling Eugenio O\u00f1ate and Jacques P\u00e9riaux (Eds.). ECCOMAS CFD 2006 Egmond aan Zee."},{"key":"e_1_3_4_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/1837853.1693476"},{"key":"e_1_3_4_49_2","doi-asserted-by":"publisher","DOI":"10.1137\/S1064827595279368"},{"key":"e_1_3_4_50_2","doi-asserted-by":"publisher","DOI":"10.1137\/0916076"},{"key":"e_1_3_4_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45346-6_38"},{"key":"e_1_3_4_52_2","doi-asserted-by":"publisher","DOI":"10.1137\/090778523"},{"key":"e_1_3_4_53_2","doi-asserted-by":"publisher","DOI":"10.1002\/fld.4035"},{"key":"e_1_3_4_54_2","doi-asserted-by":"publisher","DOI":"10.1002\/fld.804"},{"key":"e_1_3_4_55_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compstruc.2004.04.015"},{"key":"e_1_3_4_56_2","doi-asserted-by":"publisher","DOI":"10.1515\/jnma-2015-0005"},{"key":"e_1_3_4_57_2","doi-asserted-by":"publisher","DOI":"10.1002\/nme.1620191103"},{"key":"e_1_3_4_58_2","doi-asserted-by":"publisher","DOI":"10.1177\/10943420211020803"},{"key":"e_1_3_4_59_2","doi-asserted-by":"crossref","unstructured":"Martin Kronbichler and Momme Allalen. 2018. Efficient high-order discontinuous Galerkin finite elements with matrix-free implementations. In Advances and New Trends in Environmental Informatics Hans-Joachim Bungartz Dieter Kranzlm\u00fcller Volker Weinberg Jens Weism\u00fcller and Volker Wohlgemuth (Eds.). Springer Berlin 89\u2013110.","DOI":"10.1007\/978-3-319-99654-7_7"},{"key":"e_1_3_4_60_2","doi-asserted-by":"publisher","DOI":"10.1177\/1094342016671790"},{"key":"e_1_3_4_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476171"},{"key":"e_1_3_4_62_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1365-246X.2012.05609.x"},{"key":"e_1_3_4_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compfluid.2012.04.012"},{"key":"e_1_3_4_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3325864"},{"key":"e_1_3_4_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3322813"},{"key":"e_1_3_4_66_2","doi-asserted-by":"publisher","DOI":"10.1177\/10943420221107880"},{"key":"e_1_3_4_67_2","doi-asserted-by":"publisher","DOI":"10.1137\/16M110455X"},{"key":"e_1_3_4_68_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compfluid.2008.02.004"},{"key":"e_1_3_4_69_2","unstructured":"Karl Ljungkvist. 2014. Matrix-free finite-element operator application on graphics processing units. In European Conference on Parallel Processing Lu\u00eds Lopes Julius \u017dilinskas Alexandru Costan Roberto G. Cascella Gabor Kecskemeti Emmanuel Jeannot Mario Cannataro Laura Ricci Siegfried Benkner Salvador Petit Vittorio Scarano Jos\u00e9 Gracia Sascha Hunold Stephen L. Scott Stefan Lankes Christian Lengauer Jes\u00fas Carretero Jens Breitbart and Michael Alexander (Eds.). Springer Berlin 450\u2013461."},{"key":"e_1_3_4_70_2","unstructured":"Karl Ljungkvist. 2017. Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes. In SpringSim (HPC) The Society for Modeling and Simulation International San Diego CA 1\u201312."},{"key":"e_1_3_4_71_2","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-0207(19970315)40:5<919::AID-NME95>3.0.CO;2-U"},{"key":"e_1_3_4_72_2","doi-asserted-by":"publisher","DOI":"10.1002\/nla.1925"},{"key":"e_1_3_4_73_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2005.06.019"},{"key":"e_1_3_4_74_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.28314"},{"key":"e_1_3_4_75_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01065177"},{"key":"e_1_3_4_76_2","doi-asserted-by":"publisher","DOI":"10.2514\/1.39765"},{"key":"e_1_3_4_77_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2010.01.020"},{"key":"e_1_3_4_78_2","doi-asserted-by":"publisher","DOI":"10.5555\/130651"},{"key":"e_1_3_4_79_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0045-7825(00)00322-4"},{"key":"e_1_3_4_80_2","doi-asserted-by":"publisher","DOI":"10.1002\/nla.700"},{"key":"e_1_3_4_81_2","doi-asserted-by":"publisher","DOI":"10.1145\/3469720"},{"key":"e_1_3_4_82_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-07312-0_7"},{"key":"e_1_3_4_83_2","article-title":"High-performance implementation of matrix-free high-order discontinuous Galerkin methods","author":"M\u00fcthing Steffen","year":"2017","unstructured":"Steffen M\u00fcthing, Marian Piatkowski, and Peter Bastian. 2017. High-performance implementation of matrix-free high-order discontinuous Galerkin methods. arXiv:1711.10885. Retrieved from https:\/\/arxiv.org\/abs\/1711.10885.","journal-title":"arXiv:1711.10885"},{"key":"e_1_3_4_84_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2005.08.022"},{"key":"e_1_3_4_85_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.anucene.2016.11.048"},{"key":"e_1_3_4_86_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.pnucene.2017.03.014"},{"key":"e_1_3_4_87_2","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9991(80)90005-4"},{"key":"e_1_3_4_88_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2009-950"},{"key":"e_1_3_4_89_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01061297"},{"key":"e_1_3_4_90_2","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807675"},{"key":"e_1_3_4_91_2","volume-title":"Portable Parallelization of Industrial Aerodynamic Applications (POPINDA): Results of a BMBF Project","author":"Sch\u00fcller Anton","year":"2013","unstructured":"Anton Sch\u00fcller. 2013. Portable Parallelization of Industrial Aerodynamic Applications (POPINDA): Results of a BMBF Project. Vol. 71. Vieweg+Teubner Verlag, Wiesbaden."},{"key":"e_1_3_4_92_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2009.07.013"},{"key":"e_1_3_4_93_2","doi-asserted-by":"publisher","DOI":"10.1002\/nme.1620201112"},{"key":"e_1_3_4_94_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10915-016-0345-z"},{"key":"e_1_3_4_95_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10915-016-0345-z"},{"key":"e_1_3_4_96_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-65870-4_12"},{"key":"e_1_3_4_97_2","doi-asserted-by":"crossref","unstructured":"Mario Storti Norberto M. Nigro and Sergio Idelsohn. 1991. Multigrid methods and adaptive refinement techniques in elliptic problems by finite element methods. Computer Methods in Applied Mechanics and Engineering 93 1 (1991) 13\u201330.","DOI":"10.1016\/0045-7825(91)90113-K"},{"key":"e_1_3_4_98_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0377-0427(00)00516-1"},{"key":"e_1_3_4_99_2","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389055"},{"key":"e_1_3_4_100_2","doi-asserted-by":"publisher","DOI":"10.1002\/nla.1979"},{"key":"e_1_3_4_101_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11425-006-2005-5"},{"key":"e_1_3_4_102_2","doi-asserted-by":"publisher","DOI":"10.1145\/355815.355816"},{"key":"e_1_3_4_103_2","doi-asserted-by":"publisher","DOI":"10.21105\/joss.01370"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580314","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3580314","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:42Z","timestamp":1750178262000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580314"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,29]]},"references-count":102,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,3,31]]}},"alternative-id":["10.1145\/3580314"],"URL":"https:\/\/doi.org\/10.1145\/3580314","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"value":"2329-4949","type":"print"},{"value":"2329-4957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,29]]},"assertion":[{"value":"2022-04-05","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-01-11","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-03-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}