{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,1]],"date-time":"2025-12-01T11:10:48Z","timestamp":1764587448692},"publisher-location":"Berlin, Heidelberg","reference-count":82,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"type":"print","value":"9783540886426"},{"type":"electronic","value":"9783540886433"}],"license":[{"start":{"date-parts":[[2008,1,1]],"date-time":"2008-01-01T00:00:00Z","timestamp":1199145600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008]]},"DOI":"10.1007\/978-3-540-88643-3_5","type":"book-chapter","created":{"date-parts":[[2008,10,6]],"date-time":"2008-10-06T23:12:26Z","timestamp":1223334746000},"page":"196-259","source":"Crossref","is-referenced-by-count":19,"title":["How to Write Fast Numerical Code: A Small Introduction"],"prefix":"10.1007","author":[{"given":"Srinivas","family":"Chellappa","sequence":"first","affiliation":[]},{"given":"Franz","family":"Franchetti","sequence":"additional","affiliation":[]},{"given":"Markus","family":"P\u00fcschel","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"5_CR1","unstructured":"Moore, G.E.: Cramming more components onto integrated circuits. Readings in computer architecture, 56\u201359 (2000)"},{"key":"5_CR2","unstructured":"Meadows, L., Nakamoto, S., Schuster, V.: A vectorizing, software pipelining compiler for LIW and superscalar architecture. In: Proceedings of Risc (1992)"},{"key":"5_CR3","unstructured":"Group, S.S.C.: SUIF: A parallelizing & optimizing research compiler. Technical Report CSL-TR-94-620, Computer Systems Laboratory, Stanford University (May 1994)"},{"issue":"3","key":"5_CR4","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1109\/TPDS.2005.26","volume":"16","author":"B. 
Franke","year":"2005","unstructured":"Franke, B., O\u2019Boyle, M.F.P.: A complete compiler approach to auto-parallelizing C programs for multi-DSP systems. IEEE Trans. Parallel Distrib. Syst.\u00a016(3), 234\u2013245 (2005)","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"5_CR5","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611970999","volume-title":"Computational Framework of the Fast Fourier Transform","author":"C. Loan Van","year":"1992","unstructured":"Van Loan, C.: Computational Framework of the Fast Fourier Transform. SIAM, Philadelphia (1992)"},{"key":"5_CR6","volume-title":"Numerical Recipes in C: The Art of Scientific Computing","author":"W.H. Press","year":"1992","unstructured":"Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in C: The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)","edition":"2"},{"issue":"2","key":"5_CR7","first-page":"232","volume":"93","author":"M. P\u00fcschel","year":"2005","unstructured":"P\u00fcschel, M., Moura, J.M.F., Johnson, J., Padua, D., Veloso, M., Singer, B.W., Xiong, J., Franchetti, F., Ga\u010di\u0107, A., Voronenko, Y., Chen, K., Johnson, R.W., Rizzolo, N.: SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation\u00a093(2), 232\u2013275 (2005)","journal-title":"Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation"},{"key":"5_CR8","unstructured":"Website: Spiral (1998), http:\/\/www.spiral.net"},{"key":"5_CR9","doi-asserted-by":"crossref","unstructured":"Frigo, M., Johnson, S.G.: FFTW: An adaptive software architecture for the FFT. In: Proc. IEEE Int\u2019l Conf.\u00a0Acoustics, Speech, and Signal Processing (ICASSP), vol.\u00a03, pp. 1381\u20131384 (1998)","DOI":"10.1109\/ICASSP.1998.681704"},{"issue":"2","key":"5_CR10","first-page":"216","volume":"93","author":"M. 
Frigo","year":"2005","unstructured":"Frigo, M., Johnson, S.G.: The design and implementation of FFTW3. Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation\u00a093(2), 216\u2013231 (2005)","journal-title":"Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation"},{"key":"5_CR11","unstructured":"Website: FFTW, http:\/\/www.fftw.org"},{"key":"5_CR12","unstructured":"Goto, K., van de Geijn, R.: On reducing TLB misses in matrix multiplication, FLAME working note 9. Technical Report TR-2002-55, The University of Texas at Austin, Department of Computer Sciences (November 2002)"},{"key":"5_CR13","doi-asserted-by":"crossref","unstructured":"Whaley, R.C., Dongarra, J.: Automatically Tuned Linear Algebra Software (ATLAS). In: Proc.\u00a0Supercomputing (1998)","DOI":"10.1109\/SC.1998.10004"},{"issue":"2","key":"5_CR14","first-page":"211","volume":"93","author":"J.M.F. Moura","year":"2005","unstructured":"Moura, J.M.F., P\u00fcschel, M., Padua, D., Dongarra, J.: Scanning the issue: Special issue on program generation, optimization, and platform adaptation. Proceedings of the IEEE, special issue on Program Generation, Optimization, and Adaptation\u00a093(2), 211\u2013215 (2005)","journal-title":"Proceedings of the IEEE, special issue on Program Generation, Optimization, and Adaptation"},{"issue":"11","key":"5_CR15","first-page":"1161","volume":"37","author":"E. Bida","year":"2007","unstructured":"Bida, E., Toledo, S.: An automatically-tuned sorting library. Software: Practice and Experience\u00a037(11), 1161\u20131192 (2007)","journal-title":"Software: Practice and Experience"},{"key":"5_CR16","unstructured":"Li, X., Garzaran, M.J., Padua, D.: A dynamically tuned sorting library. In: Proc. Int\u2019l Symposium on Code Generation and Optimization (CGO), pp. 
111\u2013124 (2004)"},{"issue":"1","key":"5_CR17","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1177\/1094342004041296","volume":"18","author":"E.-J. Im","year":"2004","unstructured":"Im, E.-J., Yelick, K., Vuduc, R.: Sparsity: Optimization framework for sparse matrix kernels. Int\u2019l J.\u00a0High Performance Computing Applications\u00a018(1), 135\u2013158 (2004)","journal-title":"Int\u2019l J.\u00a0High Performance Computing Applications"},{"issue":"2","key":"5_CR18","first-page":"293","volume":"93","author":"J. Demmel","year":"2005","unstructured":"Demmel, J., Dongarra, J., Eijkhout, V., Fuentes, E., Petitet, A., Vuduc, R., Whaley, C., Yelick, K.: Self adapting linear algebra algorithms and software. Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation\u00a093(2), 293\u2013312 (2005)","journal-title":"Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation"},{"key":"5_CR19","unstructured":"Website: BeBOP, http:\/\/bebop.cs.berkeley.edu\/"},{"key":"5_CR20","doi-asserted-by":"crossref","unstructured":"Vuduc, R., Demmel, J.W., Yelick, K.A.: OSKI: A library of automatically tuned sparse matrix kernels. In: Proc. SciDAC. Journal of Physics: Conference Series, vol.\u00a016, pp. 521\u2013530 (2005)","DOI":"10.1088\/1742-6596\/16\/1\/071"},{"issue":"1-2","key":"5_CR21","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/S0167-8191(00)00087-9","volume":"27","author":"R. Whaley","year":"2001","unstructured":"Whaley, R., Petitet, A., Dongarra, J.: Automated empirical optimization of software and the ATLAS project. Parallel Computing\u00a027(1-2), 3\u201335 (2001)","journal-title":"Parallel Computing"},{"key":"5_CR22","doi-asserted-by":"crossref","unstructured":"Bilmes, J., Asanovi\u0107, K., whye Chin, C., Demmel, J.: Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology. In: Proc. Int\u2019l Conference on Supercomputing (ICS), pp. 
340\u2013347 (1997)","DOI":"10.1145\/263580.263662"},{"key":"5_CR23","doi-asserted-by":"crossref","unstructured":"Frigo, M.: A fast Fourier transform compiler. In: Proc.\u00a0Programming Language Design and Implementation (PLDI), pp. 169\u2013180 (1999)","DOI":"10.1145\/301618.301661"},{"key":"5_CR24","doi-asserted-by":"crossref","unstructured":"Franchetti, F., Voronenko, Y., P\u00fcschel, M.: Formal loop merging for signal transforms. In: Proc. Programming Language Design and Implementation (PLDI), pp. 315\u2013326 (2005)","DOI":"10.1145\/1065010.1065048"},{"key":"5_CR25","doi-asserted-by":"crossref","unstructured":"Franchetti, F., Voronenko, Y., P\u00fcschel, M.: FFT program generation for shared memory: SMP and multicore. In: Proc.\u00a0Supercomputing (2006)","DOI":"10.1109\/SC.2006.31"},{"key":"5_CR26","series-title":"Lecture Notes in Computer Science","volume-title":"High Performance Computing for Computational Science - VECPAR 2006","author":"F. Franchetti","year":"2006","unstructured":"Franchetti, F., Voronenko, Y., P\u00fcschel, M.: A rewriting system for the vectorization of signal transforms. In: Dayd\u00e9, M., Palma, J.M.L.M., Coutinho, \u00c1.L.G.A., Pacitti, E., Lopes, J.C. (eds.) VECPAR 2006. LNCS, vol.\u00a04395. Springer, Heidelberg (2006)"},{"issue":"1","key":"5_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1055531.1055532","volume":"31","author":"P. Bientinesi","year":"2005","unstructured":"Bientinesi, P., Gunnels, J.A., Myers, M.E., Quintana-Orti, E., van de Geijn, R.: The science of deriving dense linear algebra algorithms. ACM Trans. on Mathematical Software\u00a031(1), 1\u201326 (2005)","journal-title":"ACM Trans. on Mathematical Software"},{"issue":"4","key":"5_CR28","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1145\/504210.504213","volume":"27","author":"J.A. 
Gunnels","year":"2001","unstructured":"Gunnels, J.A., Gustavson, F.G., Henry, G.M., van de Geijn, R.A.: FLAME: Formal linear algebra methods environment. ACM Trans. on Mathematical Software\u00a027(4), 422\u2013455 (2001)","journal-title":"ACM Trans. on Mathematical Software"},{"key":"5_CR29","unstructured":"Quintana-Orti, G., Quintana-Orti, E.S., van\u00a0de Geijn, R., Van\u00a0Zee, F.G., Chan, E.: Programming algorithms-by-blocks for matrix computations on multithreaded architectures (submitted for publication)"},{"issue":"2","key":"5_CR30","doi-asserted-by":"publisher","first-page":"276","DOI":"10.1109\/JPROC.2004.840311","volume":"93","author":"G. Baumgartner","year":"2005","unstructured":"Baumgartner, G., Auer, A., Bernholdt, D.E., Bibireata, A., Choppella, V., Cociorva, D., Gao, X., Harrison, R.J., Hirata, S., Krishanmoorthy, S., Krishnan, S., Lam, C.C., Lu, Q., Nooijen, M., Pitzer, R.M., Ramanujam, J., Sadayappan, P., Sibiryakov, A.: Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models. Proceedings of the IEEE\u00a093(2), 276\u2013292 (2005); Special issue on Program Generation, Optimization, and Adaptation","journal-title":"Proceedings of the IEEE"},{"key":"5_CR31","volume-title":"Generative Programming: Methods, Tools, and Applications","author":"K. Czarnecki","year":"2000","unstructured":"Czarnecki, K., Eisenecker, U.: Generative Programming: Methods, Tools, and Applications. Addison-Wesley, Reading (2000)"},{"key":"5_CR32","series-title":"Lecture Notes in Computer Science","volume-title":"Generative and Transformational Techniques in Software Engineering","year":"2006","unstructured":"L\u00e4mmel, R., Saraiva, J., Visser, J. (eds.): GTTSE 2005. LNCS, vol.\u00a04143. 
Springer, Heidelberg (2006)"},{"key":"5_CR33","unstructured":"P\u00fcschel, M.: How to write fast code. Course 18-645, Electrical and Computer Engineering, Carnegie Mellon University (2008), http:\/\/www.ece.cmu.edu\/~pueschel\/teaching\/18-645-CMU-spring08\/course.html"},{"volume-title":"Introduction to algorithms","year":"2001","key":"5_CR34","unstructured":"Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (eds.): Introduction to algorithms. MIT Press, Cambridge (2001)"},{"key":"5_CR35","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611971446","volume-title":"Applied numerical linear algebra","author":"J.W. Demmel","year":"1997","unstructured":"Demmel, J.W.: Applied numerical linear algebra. SIAM, Philadelphia (1997)"},{"key":"5_CR36","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2767-8","volume-title":"Algorithms for discrete Fourier transforms and convolution","author":"R. Tolimieri","year":"1997","unstructured":"Tolimieri, R., An, M., Lu, C.: Algorithms for discrete Fourier transforms and convolution, 2nd edn. Springer, Heidelberg (1997)","edition":"2"},{"key":"5_CR37","volume-title":"Computer Architecture: A Quantitative Approach","author":"J.L. Hennessy","year":"2002","unstructured":"Hennessy, J.L., Patterson, D.: Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Francisco (2002)"},{"key":"5_CR38","volume-title":"Computer Systems: A Programmer\u2019s Perspective","author":"R.E. Bryant","year":"2003","unstructured":"Bryant, R.E., O\u2019Hallaron, D.R.: Computer Systems: A Programmer\u2019s Perspective. Prentice-Hall, Englewood Cliffs (2003)"},{"issue":"3","key":"5_CR39","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1007\/BF02165411","volume":"14","author":"V. Strassen","year":"1969","unstructured":"Strassen, V.: Gaussian elimination is not optimal. 
Numerische Mathematik\u00a014(3), 354\u2013356 (1969)","journal-title":"Numerische Mathematik"},{"key":"5_CR40","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1016\/S0747-7171(08)80013-2","volume":"9","author":"D. Coppersmith","year":"1990","unstructured":"Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation\u00a09, 251\u2013280 (1990)","journal-title":"Journal of Symbolic Computation"},{"issue":"2","key":"5_CR41","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1145\/567806.567807","volume":"28","author":"L.S. Blackford","year":"2002","unstructured":"Blackford, L.S., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., Heroux, M., Kaufman, L., Lumsdaine, A., Petitet, A., Pozo, R., Remington, K., Whaley, R.C.: An updated set of Basic Linear Algebra Subprograms (BLAS). ACM Trans. on Mathematical Software\u00a028(2), 135\u2013151 (2002)","journal-title":"ACM Trans. on Mathematical Software"},{"key":"5_CR42","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719604","volume-title":"LAPACK Users\u2019 Guide","author":"E. Anderson","year":"1999","unstructured":"Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.: LAPACK Users\u2019 Guide, 3rd edn. SIAM, Philadelphia (1999)","edition":"3"},{"key":"5_CR43","unstructured":"Website: ATLAS, http:\/\/math-atlas.sourceforge.net\/"},{"key":"5_CR44","unstructured":"Website: Goto BLAS, http:\/\/www.tacc.utexas.edu\/general\/staff\/goto\/"},{"key":"5_CR45","unstructured":"Website: LAPACK, http:\/\/www.netlib.org\/lapack\/"},{"key":"5_CR46","unstructured":"Website: ScaLAPACK, http:\/\/www.netlib.org\/scalapack\/"},{"key":"5_CR47","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719642","volume-title":"ScaLAPACK Users\u2019 Guide","author":"L.S. 
Blackford","year":"1997","unstructured":"Blackford, L.S., Choi, J., Cleary, A., D\u2019Azevedo, E., Demmel, J., Dhillon, I., Dongarra, J., Hammarling, S., Henry, G., Petitet, A., Stanley, K., Walker, D., Whaley, R.C.: ScaLAPACK Users\u2019 Guide. Society for Industrial and Applied Mathematics, Philadelphia (1997)"},{"key":"5_CR48","unstructured":"Website: PLAPACK, http:\/\/www.cs.utexas.edu\/users\/plapack\/"},{"issue":"9","key":"5_CR49","doi-asserted-by":"publisher","first-page":"837","DOI":"10.1002\/(SICI)1096-9128(199709)9:9<837::AID-CPE267>3.0.CO;2-2","volume":"9","author":"A. Chtchelkanova","year":"1997","unstructured":"Chtchelkanova, A., Gunnels, J., Morrow, G., Overfelt, J., van de Geijn, R.: Parallel implementation of BLAS: General techniques for level 3 BLAS. Concurrency: Practice and Experience\u00a09(9), 837\u2013857 (1997)","journal-title":"Concurrency: Practice and Experience"},{"key":"5_CR50","unstructured":"Website: FLAME, http:\/\/www.cs.utexas.edu\/users\/flame\/"},{"issue":"1","key":"5_CR51","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1109\/TSP.2006.882087","volume":"55","author":"S.G. Johnson","year":"2007","unstructured":"Johnson, S.G., Frigo, M.: A modified split-radix FFT with fewer arithmetic operations. IEEE Trans.\u00a0Signal Processing\u00a055(1), 111\u2013119 (2007)","journal-title":"IEEE Trans.\u00a0Signal Processing"},{"key":"5_CR52","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-81897-4","volume-title":"Fast Fourier Transformation and Convolution Algorithms","author":"H.J. Nussbaumer","year":"1982","unstructured":"Nussbaumer, H.J.: Fast Fourier Transformation and Convolution Algorithms, 2nd edn. Springer, Heidelberg (1982)","edition":"2"},{"issue":"4","key":"5_CR53","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1007\/BF01189337","volume":"9","author":"J.R. 
Johnson","year":"1990","unstructured":"Johnson, J.R., Johnson, R.W., Rodriguez, D., Tolimieri, R.: A methodology for designing, modifying, and implementing FFT algorithms on various architectures. Circuits Systems Signal Processing\u00a09(4), 449\u2013500 (1990)","journal-title":"Circuits Systems Signal Processing"},{"key":"5_CR54","doi-asserted-by":"crossref","unstructured":"Franchetti, F., P\u00fcschel, M.: Short vector code generation for the discrete Fourier transform. In: Proc.\u00a0IEEE Int\u2019l Parallel and Distributed Processing Symposium (IPDPS), pp. 58\u201367 (2003)","DOI":"10.1109\/IPDPS.2003.1213153"},{"key":"5_CR55","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","DOI":"10.1007\/11946441_74","volume-title":"Parallel and Distributed Processing and Applications","author":"A. Bonelli","year":"2006","unstructured":"Bonelli, A., Franchetti, F., Lorenz, J., P\u00fcschel, M., Ueberhuber, C.W.: Automatic performance optimization of the discrete Fourier transform on distributed memory computers. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds.) ISPA 2006. LNCS, vol.\u00a04330. Springer, Heidelberg (2006)"},{"key":"5_CR56","unstructured":"Website: FFTPACK, http:\/\/www.netlib.org\/fftpack\/"},{"key":"5_CR57","unstructured":"GNU: GSL http:\/\/www.gnu.org\/software\/gsl\/"},{"key":"5_CR58","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1007\/3-540-45545-0_17","volume-title":"Computational Science - ICCS 2001","author":"D. Mirkovi\u0107","year":"2001","unstructured":"Mirkovi\u0107, D., Johnsson, S.L.: Automatic performance tuning in the UHFFT library. In: Alexandrov, V.N., Dongarra, J., Juliano, B.A., Renner, R.S., Tan, C.J.K. (eds.) ICCS-ComputSci 2001. LNCS, vol.\u00a02073, pp. 71\u201380. 
Springer, Heidelberg (2001)"},{"key":"5_CR59","unstructured":"Website: UHFFT, http:\/\/www2.cs.uh.edu\/~mirkovic\/fft\/parfft.htm"},{"key":"5_CR60","unstructured":"Website: FFTE, http:\/\/www.ffte.jp"},{"key":"5_CR61","unstructured":"Website: ACML, http:\/\/developer.amd.com\/acml.jsp"},{"key":"5_CR62","unstructured":"Website: Intel MKL, http:\/\/www.intel.com\/cd\/software\/products\/asmo-na\/eng\/307757.htm"},{"key":"5_CR63","unstructured":"Website: Intel IPP, http:\/\/www.intel.com\/cd\/software\/products\/asmo-na\/eng\/perflib\/ipp\/302910.htm"},{"key":"5_CR64","unstructured":"Website, I.B.M.: ESSL and PESSL, http:\/\/www-03.ibm.com\/systems\/p\/software\/essl.html"},{"key":"5_CR65","unstructured":"Website: NAG, http:\/\/www.nag.com\/"},{"key":"5_CR66","unstructured":"Website: IMSL, http:\/\/www.vni.com\/products\/imsl\/"},{"issue":"12","key":"5_CR67","doi-asserted-by":"publisher","first-page":"1612","DOI":"10.1109\/12.40842","volume":"38","author":"M.D. Hill","year":"1989","unstructured":"Hill, M.D., Smith, A.J.: Evaluating associativity in CPU caches. IEEE Trans. Comput.\u00a038(12), 1612\u20131630 (1989)","journal-title":"IEEE Trans. 
Comput."},{"key":"5_CR68","unstructured":"Intel Corporation: Intel 64 and IA-32 Architectures Optimization Reference Manual (2007), http:\/\/www.intel.com\/products\/processor\/manuals\/index.htm"},{"key":"5_CR69","unstructured":"Advanced Micro Devices (AMD) Inc.: Software Optimization Guide for AMD Athlon 64 and AMD Opteron Processors (2005), http:\/\/developer.amd.com\/devguides.jsp"},{"key":"5_CR70","unstructured":"GNU: GCC: optimization options, http:\/\/gcc.gnu.org\/onlinedocs\/gcc\/Optimize-Options.html"},{"key":"5_CR71","unstructured":"Intel: Quick-reference guide to optimization with intel compilers version 10.x, http:\/\/cache-www.intel.com\/cd\/00\/00\/22\/23\/222300_222300.pdf"},{"key":"5_CR72","unstructured":"Intel: Intel VTune"},{"key":"5_CR73","unstructured":"Microsoft: Microsoft Visual Studio"},{"key":"5_CR74","unstructured":"GNU: Gnu gprof manual, http:\/\/www.gnu.org\/software\/binutils\/manual\/gprof-2.9.1\/html_mono\/gprof.html"},{"issue":"2","key":"5_CR75","first-page":"358","volume":"93","author":"K. Yotov","year":"2005","unstructured":"Yotov, K., Li, X., Ren, G., Garzaran, M.J., Padua, D., Pingali, K., Stodghill, P.: Is search really necessary to generate high-performance BLAS? Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation\u00a093(2), 358\u2013386 (2005)","journal-title":"Proceedings of the IEEE Special issue on Program Generation, Optimization, and Adaptation"},{"key":"5_CR76","unstructured":"Wolfe, M.: Iteration space tiling for memory hierarchies. In: SIAM Conference on Parallel Processing for Scientific Computing (1987)"},{"key":"5_CR77","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","volume":"19","author":"J.W. Cooley","year":"1965","unstructured":"Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. of Computation\u00a019, 297\u2013301 (1965)","journal-title":"Math. 
of Computation"},{"issue":"1","key":"5_CR78","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1177\/1094342004041291","volume":"18","author":"M. P\u00fcschel","year":"2004","unstructured":"P\u00fcschel, M., Singer, B., Xiong, J., Moura, J.M.F., Johnson, J., Padua, D., Veloso, M., Johnson, R.W.: SPIRAL: A generator for platform-adapted libraries of signal processing algorithms. Int\u2019l Journal of High Performance Computing Applications\u00a018(1), 21\u201345 (2004)","journal-title":"Int\u2019l Journal of High Performance Computing Applications"},{"key":"5_CR79","doi-asserted-by":"crossref","unstructured":"D\u2019Alberto, P., Milder, P.A., Sandryhaila, A., Franchetti, F., Hoe, J.C., Moura, J.M.F., P\u00fcschel, M., Johnson, J.: Generating FPGA accelerated DFT libraries. In: Proc. Symposium on Field-Programmable Custom Computing Machines (FCCM) (2007)","DOI":"10.1109\/FCCM.2007.58"},{"key":"5_CR80","doi-asserted-by":"crossref","unstructured":"Milder, P.A., Franchetti, F., Hoe, J.C., P\u00fcschel, M.: Formal datapath representation and manipulation for implementing DSP transforms. In: Proc. Design Automation Conference (DAC) (2008)","DOI":"10.1145\/1391469.1391572"},{"key":"5_CR81","doi-asserted-by":"crossref","unstructured":"Xiong, J., Johnson, J., Johnson, R., Padua, D.: SPL: A language and compiler for DSP algorithms. In: Proc.\u00a0Programming Language Design and Implementation (PLDI), pp. 298\u2013308 (2001)","DOI":"10.1145\/378795.378860"},{"key":"5_CR82","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1016\/B978-044450813-3\/50011-4","volume-title":"Handbook of Automated Reasoning","author":"N. Dershowitz","year":"2001","unstructured":"Dershowitz, N., Plaisted, D.A.: Rewriting. In: Handbook of Automated Reasoning, vol.\u00a01, pp. 535\u2013610. 
Elsevier, Amsterdam (2001)"}],"container-title":["Lecture Notes in Computer Science","Generative and Transformational Techniques in Software Engineering II"],"original-title":[],"link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-540-88643-3_5","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,14]],"date-time":"2019-05-14T08:56:29Z","timestamp":1557824189000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/978-3-540-88643-3_5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008]]},"ISBN":["9783540886426","9783540886433"],"references-count":82,"URL":"https:\/\/doi.org\/10.1007\/978-3-540-88643-3_5","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2008]]}}}