{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,20]],"date-time":"2022-04-20T12:04:45Z","timestamp":1650456285985},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ACM Trans. Math. Softw."],"published-print":{"date-parts":[[1990,12]]},"abstract":"The Level 3 BLAS (BLAS3) are a set of specifications of FORTRAN 77 subprograms for carrying out matrix multiplications and the solution of triangular systems with multiple right-hand sides. They are intended to provide efficient and portable building blocks for linear algebra algorithms on high-performance computers. We describe algorithms for the BLAS3 operations that are asymptotically faster than the conventional ones. These algorithms are based on Strassen's method for fast matrix multiplication, which is now recognized to be a practically useful technique once matrix dimensions exceed about 100. We pay particular attention to the numerical stability of these \u201cfast BLAS3.\u201d Error bounds are given and their significance is explained and illustrated with the aid of numerical experiments. Our conclusion is that the fast BLAS3, although not as strongly stable as conventional implementations, are stable enough to merit careful consideration in many applications.","DOI":"10.1145\/98267.98290","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T11:31:44Z","timestamp":1027769504000},"page":"352-368","source":"Crossref","is-referenced-by-count":67,"title":["Exploiting fast matrix multiplication within the level 3 BLAS"],"prefix":"10.1145","volume":"16","author":[{"given":"Nicholas J.","family":"Higham","sequence":"first","affiliation":[{"name":"Univ. 
of Manchester, Manchester, UK"}]}],"member":"320","reference":[{"key":"e_1_2_1_1_2","volume-title":"The Design and Analysis of Computer Algorithms","author":"AHO A. V.","year":"1974"},{"key":"e_1_2_1_2_2","doi-asserted-by":"publisher","DOI":"10.1137\/0909040"},{"key":"e_1_2_1_3_2","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1007\/BF01395989","article-title":"Stability of fast algorithms for matrix multiplication","volume":"36","author":"BINI D.","year":"1980","journal-title":"Numer. Math."},{"key":"e_1_2_1_4_2","volume-title":"Algorithmics: Theory and Practice","author":"BRASSARD G.","year":"1988"},{"key":"e_1_2_1_6_2","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1007\/BF02308867","article-title":"Error analysis of algorithms for matrix multiplication and triangular decomposition using Winograd's identity","volume":"16","author":"BRENT R.P","year":"1970","journal-title":"Numer. Math."},{"key":"e_1_2_1_7_2","first-page":"1","volume-title":"Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing","author":"COPPERSMITH D.","year":"1987"},{"key":"e_1_2_1_9_2","volume-title":"Ill.","author":"DEMMEL J. W.","year":"1987"},{"key":"e_1_2_1_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/77626.79170"},{"key":"e_1_2_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/77626.77627"},{"key":"e_1_2_1_12_2","doi-asserted-by":"publisher","DOI":"10.1137\/0908086"},{"key":"e_1_2_1_13_2","first-page":"12","article-title":"Impact of hierarchical memory systems on linear algebra algorithm design","volume":"2","author":"GALLIVAN K.","year":"1988","journal-title":"Int. J. Supercomput. Appl."},{"key":"e_1_2_1_14_2","volume-title":"Matrix Computations","author":"GOLUB G. 
H.","year":"1989","edition":"2"},{"key":"e_1_2_1_15_2","doi-asserted-by":"publisher","DOI":"10.1137\/0608020"},{"key":"e_1_2_1_16_2","doi-asserted-by":"publisher","DOI":"10.1137\/0726070"},{"key":"e_1_2_1_17_2","doi-asserted-by":"publisher","DOI":"10.1137\/0911038"},{"key":"e_1_2_1_18_2","volume-title":"Release 3","author":"IBM","year":"1988","edition":"4"},{"key":"e_1_2_1_19_2","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1137\/0204009","article-title":"Computational complexity and numerical stability","volume":"4","author":"MILLER W","year":"1975","journal-title":"SIAM J. Comput."},{"key":"e_1_2_1_20_2","volume-title":"Pro-Matlab User's Guide. The MathWorks","author":"MOLER C. B.","year":"1987"},{"key":"e_1_2_1_21_2","volume-title":"Numerical Recipes: The Art of Scientific Computing","author":"PRESS W. H.","year":"1986"},{"key":"e_1_2_1_22_2","volume-title":"Numerical Algorithms for Modern Parallel Computer Architectures","author":"SCHREIBER R.S.","year":"1988"},{"key":"e_1_2_1_23_2","unstructured":"SEDGEWICK R. Algorithms. 2nd ed. Addison-Wesley Reading Mass. 1988."},{"key":"e_1_2_1_24_2","volume-title":"Gaussian elimination is not optimal. Numer. Math. 
13","author":"STRASSEN V.","year":"1969"}],"container-title":["ACM Transactions on Mathematical Software"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/98267.98290","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,7]],"date-time":"2021-03-07T16:30:22Z","timestamp":1615134622000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/98267.98290"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1990,12]]},"references-count":22,"journal-issue":{"issue":"4","published-print":{"date-parts":[[1990,12]]}},"alternative-id":["10.1145\/98267.98290"],"URL":"http:\/\/dx.doi.org\/10.1145\/98267.98290","relation":{},"ISSN":["0098-3500","1557-7295"],"issn-type":[{"value":"0098-3500","type":"print"},{"value":"1557-7295","type":"electronic"}],"subject":["Applied Mathematics","Software"],"published":{"date-parts":[[1990,12]]}}}