{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T07:21:25Z","timestamp":1777965685579,"version":"3.51.4"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"name":"European Research Council"},{"name":"European Union\u2019s Horizon 2020","award":["810367"],"award-info":[{"award-number":["810367"]}]},{"name":"National Science Foundation","award":["CCF-1942892 and OAC-2106920"],"award-info":[{"award-number":["CCF-1942892 and OAC-2106920"]}]},{"name":"US Department of Energy, Office of Science, Advanced Scientific Computing Research program","award":["DE-SC-0023296"],"award-info":[{"award-number":["DE-SC-0023296"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>In this article, we focus on the communication costs of three symmetric matrix computations: (i) multiplying a matrix with its transpose, known as a symmetric rank-k update (SYRK) (ii) adding the result of the multiplication of a matrix with the transpose of another matrix and the transpose of that result, known as a symmetric rank-2k update (SYR2K) (iii) performing matrix multiplication with a symmetric input matrix (SYMM). All three computations appear in the Level 3 Basic Linear Algebra Subroutines (BLAS) and have wide use in applications involving symmetric matrices. We establish communication lower bounds for these kernels using sequential and distributed-memory parallel computational models, and we show that our bounds are tight by presenting communication-optimal algorithms for each setting. Our lower bound proofs rely on applying a geometric inequality for symmetric computations and analytically solving constrained nonlinear optimization problems. The symmetric matrix and its corresponding computations are accessed and performed according to a triangular block partitioning scheme in the optimal algorithms.<\/jats:p>\n          <jats:p\/>","DOI":"10.1145\/3727344","type":"journal-article","created":{"date-parts":[[2025,4,10]],"date-time":"2025-04-10T11:19:06Z","timestamp":1744283946000},"page":"1-46","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Communication Lower Bounds and Optimal Algorithms for Symmetric Matrix Computations"],"prefix":"10.1145","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9355-4042","authenticated-orcid":false,"given":"Hussam","family":"Al Daas","sequence":"first","affiliation":[{"name":"STFC, Scientific Computing Department, Rutherford Appleton Laboratory, Didcot, United Kingdom of Great Britain and Northern Ireland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1557-8027","authenticated-orcid":false,"given":"Grey","family":"Ballard","sequence":"additional","affiliation":[{"name":"Wake Forest University, Department of Computer Science, Winston-Salem, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5880-1076","authenticated-orcid":false,"given":"Laura","family":"Grigori","sequence":"additional","affiliation":[{"name":"Institute of Mathematics, EPFL, Lausanne, Switzerland"},{"name":"PSI Center for Scientific Computing, Theory and Data, Villigen, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-1449-0165","authenticated-orcid":false,"given":"Suraj","family":"Kumar","sequence":"additional","affiliation":[{"name":"Institut national de recherche en sciences et technologies du num\u00e9rique, Lyon, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4045-0423","authenticated-orcid":false,"given":"Kathryn","family":"Rouse","sequence":"additional","affiliation":[{"name":"Inmar Intelligence, Winston-Salem, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0581-2772","authenticated-orcid":false,"given":"Mathieu","family":"Verite","sequence":"additional","affiliation":[{"name":"Institute of Mathematics, EPFL, Lausanne, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,6,9]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(90)90188-N"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/48529.48535"},{"key":"e_1_3_3_4_2","first-page":"357","volume-title":"Proceedings of the 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS\u201923)","author":"Agullo Emmanuel","year":"2023","unstructured":"Emmanuel Agullo, Alfredo Buttari, Olivier Coulaud, Lionel Eyraud-Dubois, Mathieu Faverge, Alain Franc, Abdou Guermouche, Antoine Jego, Romain Peressoni, and Florent Pruvost. 2023. On the arithmetic intensity of distributed-memory dense matrix multiplication involving a symmetric input matrix (symm). In Proceedings of the 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS\u201923). IEEE, 357\u2013367. DOI:10.1109\/IPDPS54959.2023.00044"},{"key":"e_1_3_3_5_2","volume-title":"Proceedings of the SPAA 2022","author":"Daas H. Al","year":"2022","unstructured":"H. Al Daas, G. Ballard, L. Grigori, S. Kumar, and K. Rouse. 2022. Tight memory-independent parallel matrix multiplication communication lower bounds. In Proceedings of the SPAA 2022. DOI:10.1145\/3490148.3538552"},{"key":"e_1_3_3_6_2","volume-title":"Proceedings of the SPAA 2023","author":"Daas H. Al","year":"2023","unstructured":"H. Al Daas, G. Ballard, L. Grigori, S. Kumar, and K. Rouse. 2023. Parallel memory-independent communication bounds for syrk. In Proceedings of the SPAA 2023. DOI:10.1145\/3558481.3591072"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1017\/S0962492914000038"},{"key":"e_1_3_3_8_2","volume-title":"Proceedings of the SPAA 2012","author":"Ballard G.","year":"2012","unstructured":"G. Ballard, J. Demmel, O. Holtz, B. Lipshitz, and O. Schwartz. 2012. Strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds. In Proceedings of the SPAA 2012. DOI:10.1145\/2312005.2312021"},{"key":"e_1_3_3_9_2","volume-title":"Proceedings of the SPAA 2022","author":"Beaumont O.","year":"2022","unstructured":"O. Beaumont, L. Eyraud-Dubois, J. Langou, and M. V\u00e9rit\u00e9. 2022. I\/o-optimal algorithms for symmetric linear algebra kernels. In Proceedings of the SPAA 2022. DOI:10.1145\/3490148.3538587"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1006\/jsco.1996.0125"},{"key":"e_1_3_3_11_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511804441","volume-title":"Convex Optimization","author":"Boyd S.","year":"2004","unstructured":"S. Boyd and L. Vandenberghe. 2004. Convex Optimization. Cambridge University Press. Retrieved from https:\/\/web.stanford.edu\/boyd\/cvxbook\/"},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/71.642949"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1206"},{"key":"e_1_3_3_14_2","doi-asserted-by":"crossref","DOI":"10.21236\/ADA584726","volume-title":"Communication Lower Bounds and Optimal Algorithms for Programs That Reference Arrays - Part 1","author":"Christ M.","year":"2013","unstructured":"M. Christ, J. Demmel, N. Knight, T. Scanlon, and K. Yelick. 2013. Communication Lower Bounds and Optimal Algorithms for Programs That Reference Arrays - Part 1. Technical Report UCB\/EECS-2013-61. EECS Department, University of California, Berkeley. Retrieved from http:\/\/www.eecs.berkeley.edu\/Pubs\/TechRpts\/2013\/EECS-2013-61.html"},{"issue":"10","key":"e_1_3_3_15_2","first-page":"1277","article-title":"On a combinatorial problem","volume":"51","author":"Bruijn Nicolaas G. de","year":"1948","unstructured":"Nicolaas G. de Bruijn and Paul Erd\u00f6s. 1948. On a combinatorial problem. Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam 51, 10 (1948), 1277\u20131279.","journal-title":"Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam"},{"key":"e_1_3_3_16_2","volume-title":"Proceedings of the IPDPS 2013","author":"Demmel J.","year":"2013","unstructured":"J. Demmel, D. Eliahu, A. Fox, S. Kamil, B. Lipshitz, O. Schwartz, and O. Spillinger. 2013. Communication-optimal parallel recursive rectangular matrix multiplication. In Proceedings of the IPDPS 2013. DOI:10.1109\/IPDPS.2013.80"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-53622-3"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-8176-4842-8_15"},{"key":"e_1_3_3_19_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139173285","volume-title":"Elementary Geometry of Algebraic Curves: An Undergraduate Introduction","author":"Gibson Christopher G.","year":"1998","unstructured":"Christopher G. Gibson. 1998. Elementary Geometry of Algebraic Curves: An Undergraduate Introduction. Cambridge University Press."},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1090\/bull\/1553"},{"key":"e_1_3_3_21_2","volume-title":"Proceedings of the STOC 1981","author":"Hong Jia-Wei","year":"1981","unstructured":"Jia-Wei Hong and H. T. Kung. 1981. I\/o complexity: The red-blue pebble game. In Proceedings of the STOC 1981. Association for Computing Machinery, New York, NY, USA. DOI:10.1145\/800076.802486"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1137\/0202019"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2004.03.021"},{"key":"e_1_3_3_24_2","unstructured":"Peter Keevash. 2024. The existence of designs. arXiv:1401.3665 [math.CO]. https:\/\/arxiv.org\/abs\/1401.3665"},{"key":"e_1_3_3_25_2","volume-title":"Proceedings of the SC 2021","author":"Kwasniewski G.","year":"2021","unstructured":"G. Kwasniewski, M. Kabic, T. Ben-Nun, A. N. Ziogas, J. E. Saethre, A. Gaillard, T. Schneider, M. Besta, A. Kozhevnikov, J. VandeVondele, and T. Hoefler. 2021. On the parallel i\/o optimality of linear algebra kernels: Near-optimal matrix factorizations. In Proceedings of the SC 2021. DOI:10.1145\/3458817.3476167"},{"key":"e_1_3_3_26_2","volume-title":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC\u201919)","author":"Kwasniewski G.","year":"2019","unstructured":"G. Kwasniewski, M. Kabi\u0107, M. Besta, J. VandeVondele, R. Solc\u00e0, and T. Hoefler. 2019. Red-blue pebbling revisited: Near optimal parallel matrix-matrix multiplication. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC\u201919). Association for Computing Machinery, New York, NY, USA, Article 24, 22 pages. DOI:10.1145\/3295500.3356181"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1090\/S0002-9904-1949-09320-5"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/PL00008264"},{"key":"e_1_3_3_29_2","volume-title":"Proceedings of the PLDI 2020","author":"Olivry A.","year":"2020","unstructured":"A. Olivry, J. Langou, L.-N. Pouchet, P. Sadayappan, and F. Rastello. 2020. Automated derivation of parametric data movement lower bounds for affine programs. In Proceedings of the PLDI 2020. DOI:10.1145\/3385412.3385989"},{"key":"e_1_3_3_30_2","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-4252-7","volume-title":"Rational Points on Elliptic Curves","author":"Silverman Joseph H.","year":"1992","unstructured":"Joseph H. Silverman and John Torrence Tate. 1992. Rational Points on Elliptic Curves. Springer."},{"key":"e_1_3_3_31_2","unstructured":"Tyler Michael Smith Bradley Lowery Julien Langou and Robert A. van de Geijn. 2019. A tight i\/o lower bound for matrix multiplication. arXiv:1702.02017 [cs.CC]. https:\/\/arxiv.org\/abs\/1702.02017"},{"key":"e_1_3_3_32_2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1007\/978-3-642-23397-5_10","volume-title":"Proceedings of the Euro-Par 2011 Parallel Processing","volume":"6853","author":"Solomonik E.","year":"2011","unstructured":"E. Solomonik and J. Demmel. 2011. Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms. In Proceedings of the Euro-Par 2011 Parallel Processing. Emmanuel Jeannot, Raymond Namyst, and Jean Roman (Eds.), Lecture Notes in Computer Science, Vol. 6853, Springer, Berlin, 90\u2013109. DOI:10.1007\/978-3-642-23397-5_10"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1177\/1094342005051521"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF02579286"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/0097-3165(72)90028-3"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/0097-3165(72)90029-5"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/0097-3165(75)90067-9"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3727344","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,9]],"date-time":"2025-06-09T12:16:17Z","timestamp":1749471377000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3727344"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,9]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3727344"],"URL":"https:\/\/doi.org\/10.1145\/3727344","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"value":"2329-4949","type":"print"},{"value":"2329-4957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,9]]},"assertion":[{"value":"2024-09-17","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-14","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-06-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}