{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:14:04Z","timestamp":1750306444214,"version":"3.41.0"},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2015,9,8]],"date-time":"2015-09-08T00:00:00Z","timestamp":1441670400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"German Research Foundation (DFG) as part of the Transregional Collaborative Research Center \u201cInvasive Computing\u201d","award":["SFB\/TR 89"],"award-info":[{"award-number":["SFB\/TR 89"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2015,11,2]]},"abstract":"<jats:p>We present an algorithm for inversion of symmetric positive definite matrices that combines the practical requirement of an optimal number of arithmetic operations and the theoretical goal of a polylogarithmic critical path length. The algorithm reduces inversion to matrix multiplication. It uses Strassen\u2019s recursion scheme, but on the critical path it breaks the recursion early, switching to an asymptotically inefficient yet fast use of Newton\u2019s method. We also show that the algorithm is numerically stable. Overall, we get a candidate for a massively parallel algorithm that scales to exascale systems even on relatively small inputs.<\/jats:p>","DOI":"10.1145\/2809812","type":"journal-article","created":{"date-parts":[[2015,9,15]],"date-time":"2015-09-15T12:09:15Z","timestamp":1442318955000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Work-Efficient Matrix Inversion in Polylogarithmic Time"],"prefix":"10.1145","volume":"2","author":[{"given":"Peter","family":"Sanders","sequence":"first","affiliation":[{"name":"Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany"}]},{"given":"Jochen","family":"Speck","sequence":"additional","affiliation":[{"name":"Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany"}]},{"given":"Raoul","family":"Steffen","sequence":"additional","affiliation":[{"name":"Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany"}]}],"member":"320","published-online":{"date-parts":[[2015,9,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/62972.63025"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2312005.2312044"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1137\/090760969"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2008.10.002"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-001-0033-7"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/580470"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1137\/0205040"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1137\/0210049"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00211-007-0114-x"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/248979"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/579525"},{"key":"e_1_2_1_13_1","unstructured":"IBTA 2014. InfiniBand Roadmap. Retrieved August 8 2015 from http:\/\/www.infinibandta.org\/content\/pages.php?pg&equals;technology_overview."},{"key":"e_1_2_1_14_1","unstructured":"Intel 2014. Intel Math Kernel Library Reference Manual -- MKL 11.2. (2014). Document Number 630813-065US."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1467"},{"volume-title":"Petascale Computing: Algorithms and Applications, David A","author":"K\u00fcster Uwe","key":"e_1_2_1_17_1","unstructured":"Uwe K\u00fcster and Michael Resch. 2008. Architectural and programming issues for sustained petaflop performance. In Petascale Computing: Algorithms and Applications, David A. Bader (Ed.). Chapman & Hall, CRC, Boca Raton, FL."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/377939.377946"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/22145.22161"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/0898-1221(89)90081-3"},{"key":"e_1_2_1_21_1","volume-title":"InfiniBand Essentials Every HPC Expert Must Know. Retrieved","author":"Paz Oded","year":"2015","unstructured":"Oded Paz. 2014. InfiniBand Essentials Every HPC Expert Must Know. Retrieved August 8, 2015 from http:\/\/www.hpcadvisorycouncil.com\/events\/2014\/swiss-workshop\/presos\/Day_1\/1_Mellanox.pdf."},{"key":"e_1_2_1_22_1","unstructured":"A. Petitet R. C. Whaley J. Dongarra and A. Cleary. 2014. HPL\u2014A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers (Scalability Analysis). Retrieved August 8 2015 from http:\/\/www.netlib.org\/benchmark\/hpl\/scalability.html."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/181014.181408"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2486159.2486173"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2009.09.001"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","unstructured":"Edgar Solomonik and James Demmel. 2011. Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms. Euro-Par 2011 Parallel Processing 90--109.","DOI":"10.5555\/2033408.2033420"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654079"},{"volume-title":"Exascale Ready Work-Optimal Matrix Inversion. Diploma thesis","author":"Steffen Raoul","key":"e_1_2_1_28_1","unstructured":"Raoul Steffen. 2012. Exascale Ready Work-Optimal Matrix Inversion. Diploma thesis. Karlsruhe Institute of Technology, Karlsruhe, Germany."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02165411"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1137\/0402037"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213977.2214056"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2809812","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2809812","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T05:42:54Z","timestamp":1750225374000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2809812"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,9,8]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,11,2]]}},"alternative-id":["10.1145\/2809812"],"URL":"https:\/\/doi.org\/10.1145\/2809812","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"type":"print","value":"2329-4949"},{"type":"electronic","value":"2329-4957"}],"subject":[],"published":{"date-parts":[[2015,9,8]]},"assertion":[{"value":"2013-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-09-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}