{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T12:14:57Z","timestamp":1761394497120},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2014,6,26]],"date-time":"2014-06-26T00:00:00Z","timestamp":1403740800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Sign Process Syst"],"published-print":{"date-parts":[[2014,10]]},"DOI":"10.1007\/s11265-014-0896-x","type":"journal-article","created":{"date-parts":[[2014,6,25]],"date-time":"2014-06-25T00:27:48Z","timestamp":1403656068000},"page":"169-190","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra\/FFT Cores"],"prefix":"10.1007","volume":"77","author":[{"given":"Ardavan","family":"Pedram","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John D.","family":"McCalpin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andreas","family":"Gerstlauer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2014,6,26]]},"reference":[{"key":"896_CR1","doi-asserted-by":"crossref","unstructured":"Akin, B., Milder, P.A., Franchetti, F., Hoe, J.C. (2012). Memory bandwidth efficient two-dimensional fast Fourier transform algorithm and implementation for large problem sizes. In Proceedings of the 2012 IEEE 20th international symposium on field-programmable custom computing machines, FCCM \u201912 (pp. 188\u2013191). IEEE.","DOI":"10.1109\/FCCM.2012.40"},{"key":"896_CR2","doi-asserted-by":"crossref","unstructured":"Bailey, D.H. (1989). FFTs in external or hierarchical memory. In Proceedings of the 1989 ACM\/IEEE conference on supercomputing (pp. 234\u2013242). ACM.","DOI":"10.1145\/76263.76288"},{"issue":"2","key":"896_CR3","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1109\/TAU.1969.1162041","volume":"17","author":"G Bergland","year":"1969","unstructured":"Bergland, G. (1969). Fast Fourier transform hardware implementations\u2014an overview. IEEE Transactions on Audio and Electroacoustics, 17 (2), 104\u2013108.","journal-title":"IEEE Transactions on Audio and Electroacoustics"},{"issue":"19","key":"896_CR4","doi-asserted-by":"crossref","first-page":"4707","DOI":"10.1109\/TSP.2013.2273199","volume":"61","author":"A Blake","year":"2013","unstructured":"Blake, A., Witten, I., Cree, M. (2013). The fastest Fourier transform in the south. IEEE Transactions on Signal Processing, 61 (19), 4707\u20134716.","journal-title":"IEEE Transactions on Signal Processing"},{"key":"896_CR5","unstructured":"Cheney, M., Borden, B., of the mathematical Sciences, C.B. (U.S.) (2009). N.S.F.: fundamentals of radar imaging. CBMS-NSF regional conference series in applied mathematics. Philadelphia: SIAM."},{"key":"896_CR6","unstructured":"Chung, E.S., Milder, P.A., Hoe, J.C., Mai, K. (2010). Single-chip heterogeneous computing: does the future include custom logic, FPGAs, and GPGPUs? In 43rd annual IEEE\/ACM international symposium on microarchitecture, MICRO-43 (pp. 225\u2013236). Washington, DC: IEEE Computer Society."},{"issue":"90","key":"896_CR7","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","volume":"19","author":"J Cooley","year":"1965","unstructured":"Cooley, J., & Tukey, J. (1965). An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19 (90), 297\u2013301.","journal-title":"Mathematics of Computation"},{"issue":"2","key":"896_CR8","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1109\/JPROC.2004.840301","volume":"93","author":"M Frigo","year":"2005","unstructured":"Frigo, M., & Johnson, S. (2005). The design and implementation of FFTW3. Proceedings of the IEEE, 93 (2), 216\u2013231.","journal-title":"Proceedings of the IEEE"},{"key":"896_CR9","unstructured":"Galal, S., & Horowitz, M. (2010). Energy-efficient floating point unit design. IEEE Transactions on Computers, PP(99)."},{"key":"896_CR10","unstructured":"Greene, J., Pepe, M., Cooper, R. (2005). A parallel 64k complex FFT algorithm for the IBM\/Sony\/Toshiba Cell broadband engine processor. In Conference on the global signal processing expo."},{"key":"896_CR11","doi-asserted-by":"crossref","unstructured":"Hemmert, K.S., & Underwood, K.D. (2005). An analysis of the double-precision floating-point FFT on FPGAs. In Proceedings of the 2005 IEEE 13th international symposium on field-programmable custom computing machines, FCCM \u201905 (pp. 171\u2013180).","DOI":"10.1109\/FCCM.2005.16"},{"key":"896_CR12","unstructured":"Ho, C.H. (2010). Customizable and reconfigurable platform for optimising floating-point computations. Ph.D. thesis, University of London, Imperial College of Science, Technology and Medicine, Department of Computing."},{"key":"896_CR13","unstructured":"Jain, S., Erraguntla, V., Vangal, S., Hoskote, Y., Borkar, N., Mandepudi, T., Karthik, V. (2010). A 90 mW\/GFlop 3.4 GHz reconfigurable fused\/continuous multiply-accumulator for floating-point and integer operands in 65 nm. In 23rd international conference on VLSI design, 2010. VLSID \u201910 (pp. 252\u2013257)."},{"key":"896_CR14","doi-asserted-by":"crossref","unstructured":"Kak, A, & Slaney, M. (2001). Principles of computerized tomographic imaging. Classics in Applied Mathematics. Philadelphia: SIAM.","DOI":"10.1137\/1.9780898719277"},{"key":"896_CR15","unstructured":"Karner, H., Auer, M., Ueberhuber, C.W. (1998). Top speed FFTs for FMA architectures. Tech. Rep. AURORA TR1998-16, Institute for Applied and Numerical Mathematics, Vienna University of Technology."},{"key":"896_CR16","unstructured":"Kistler, M., Gunnels, J., Brokenshire, D., Benton, B. (2009). Petascale computing with accelerators. In Proceedings of the 14th ACM SIGPLAN symposium on principles and practice of parallel programming, PPoPP \u201909 (pp. 241\u2013250). New York: ACM."},{"key":"896_CR17","doi-asserted-by":"crossref","unstructured":"Kuehl, C., Liebstueckel, U., Tejerina, I., Uemminghaus, M., Witte, F., Kolb, M., Suess, M., Weigand, R., Kopp, N. (2012). Fast Fourier Transform Co-processor (FFTC), towards embedded GFLOPs. In Society of photo-optical instrumentation engineers (SPIE) conference series, society of photo-optical instrumentation engineers (SPIE) conference series (vol. 8539).","DOI":"10.1117\/12.974825"},{"issue":"5","key":"896_CR18","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1007\/s11390-011-0186-z","volume":"26","author":"L Li","year":"2011","unstructured":"Li, L., Chen, Y.J., Liu, D.F., Qian, C., Hu, W.W. (2011). An FFT performance model for optimizing general-purpose processor architecture. Journal of Computer Science and Technology, 26 (5), 875\u2013889.","journal-title":"Journal of Computer Science and Technology"},{"key":"896_CR19","doi-asserted-by":"crossref","unstructured":"Milder, P., Franchetti, F., Hoe, J.C., P\u00fcschel, M. (2012). Computer generation of hardware for linear digital signal processing transforms. ACM Transactions on Design Automation of Electronic Systems, 17(2), 15:1\u201315:33","DOI":"10.1145\/2159542.2159547"},{"key":"896_CR20","doi-asserted-by":"crossref","unstructured":"Mou, S., & Yang, X. (2007). Design of a high-speed FPGA-based 32-bit floating-point FFT processor. In Eighth ACIS international conference on software engineering, artificial intelligence, networking, and parallel\/distributed computing, 2007. SNPD 2007 (vol. 1, pp. 84\u201387).","DOI":"10.1109\/SNPD.2007.46"},{"issue":"12","key":"896_CR21","first-page":"1724","volume":"61","author":"A Pedram","year":"2012","unstructured":"Pedram, A., van de Geijn, R., Gerstlauer, A. (2012). Codesign tradeoffs for high-performance, low-power linear algebra architectures. IEEE Transactions on Computers, Special Issue on Power Efficient Computing, 61 (12), 1724\u20131736.","journal-title":"IEEE Transactions on Computers, Special Issue on Power Efficient Computing"},{"key":"896_CR22","doi-asserted-by":"crossref","unstructured":"Pedram, A., Gerstlauer, A., van de Geijn, R. (2012). On the efficiency of register file versus broadcast interconnect for collective communications in data-parallel hardware accelerators. In Proceedings of the 2012 IEEE 24th international symposium on computer architecture and high performance computing (SBAC-PAD) (pp. 19\u201326).","DOI":"10.1109\/SBAC-PAD.2012.35"},{"key":"896_CR23","unstructured":"Pedram, A., Gerstlauer, A., Geijn, R.A. (2011). A high-performance, low-power linear algebra core. In Proceedings of the 22nd IEEE international conference on application-specific systems, architectures and processors, ASAP \u201911 (pp. 35\u201342). Washington, DC: IEEE Computer Society."},{"key":"896_CR24","doi-asserted-by":"crossref","unstructured":"Pedram, A., Gerstlauer, A., van de Geijn, R.A. (2013). Floating point architecture extensions for optimized matrix factorization. In Proceedings of the 2013 IEEE 21st symposium on computer arithmetic, ARITH \u201913. IEEE.","DOI":"10.1109\/ARITH.2013.21"},{"key":"896_CR25","unstructured":"Pedram, A., Gilani, S.Z., Kim, N.S., van de Geijn, R., Schulte, M., Gerstlauer, A. (2012). A linear algebra core design for efficient level-3 BLAS. In Proceedings of the 2012 IEEE 23rd international conference on application-specific systems, architectures and processors, ASAP \u201912 (pp. 149\u2013152). Washington, DC: IEEE Computer Society."},{"key":"896_CR26","doi-asserted-by":"crossref","unstructured":"Pedram, A., McCalpin, J., Gerstlauer, A. (2013). Transforming a linear algebra core to an FFT accelerator. In Proceedings of the 2013 IEEE 24th international conference on application-specific systems, architectures and processors (ASAP) (pp. 175\u2013184).","DOI":"10.1109\/ASAP.2013.6567572"},{"key":"896_CR27","doi-asserted-by":"crossref","unstructured":"Pereira, K., Athanas, P., Lin, H., Feng, W. (2011). Spectral method characterization on FPGA and GPU accelerators. In 2011 international conference on reconfigurable computing and FPGAs (ReConFig) (pp. 487\u2013492).","DOI":"10.1109\/ReConFig.2011.83"},{"key":"896_CR28","unstructured":"Satpathy, S., Sewell, K., Manville, T., Chen, Y.P., Dreslinski, R., Sylvester, D., Mudge, T., Blaauw, D. (2012). A 4.5Tb\/s 3.4Tb\/s\/W 64x64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45 nm CMOS. In 2012 IEEE international solid-state circuits conference digest of technical papers (ISSCC) (pp. 478\u2013480)."},{"key":"896_CR29","doi-asserted-by":"crossref","unstructured":"Satpathy, S., Sylvester, D., Blaauw, D. (2012). A standard cell compatible bidirectional repeater with thyristor assist. In 2012 symposium on VLSI circuits (VLSIC) (pp. 174\u2013175).","DOI":"10.1109\/VLSIC.2012.6243846"},{"issue":"2","key":"896_CR30","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1109\/TC.2010.271","volume":"61","author":"EE Swartzlander Jr","year":"2012","unstructured":"Swartzlander, E.E. Jr., & Saleh, H.H. (2012). FFT implementation with fused floating-point operations. IEEE Transactions on Computers, 61 (2), 284\u2013288.","journal-title":"IEEE Transactions on Computers"},{"key":"896_CR31","doi-asserted-by":"crossref","unstructured":"Varma, B.S.C., Paul, K., Balakrishnan, M. (2013). Accelerating 3D-FFT using hard embedded blocks in FPGAs. In International conference on VLSI design (pp. 92\u201397).","DOI":"10.1109\/VLSID.2013.169"},{"key":"896_CR32","doi-asserted-by":"crossref","unstructured":"Wu, D., Zou, X., Dai, K., Rao, J., Chen, P., Zheng, Z. (2011). Implementation and evaluation of parallel FFT on engineering and scientific computation accelerator (ESCA) architecture. Journal of Zhejiang University-Science C, 12(12), 976\u2013989.","DOI":"10.1631\/jzus.C1100027"},{"key":"896_CR33","doi-asserted-by":"crossref","unstructured":"Yuffe, M., Knoll, E., Mehalel, M., Shor, J., Kurts, T. (2011). A fully integrated multi-CPU, GPU and memory controller 32nm processor. In Proceedings of the 2011 IEEE international solid-state circuits conference digest of technical papers (ISSCC). IEEE.","DOI":"10.1109\/ISSCC.2011.5746311"},{"key":"896_CR34","unstructured":"Van Zee, F.G., & van de Geijn, R. (2012). FLAME Working Note #66, R.A.: BLIS: a framework for generating BLAS-like libraries. Technical Report TR-12-30, The University of Texas at Austin, Department of Computer Sciences."},{"key":"896_CR35","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Wang, D., Pan, Y., Wang, D., Zhou, X., Sobelman, G. (2011). FFT implementation with multi-operand floating point units. In 2011 IEEE 9th international conference on ASIC (ASICON) (pp. 216\u2013219).","DOI":"10.1109\/ASICON.2011.6157160"}],"container-title":["Journal of Signal Processing Systems"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-014-0896-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11265-014-0896-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-014-0896-x","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,11]],"date-time":"2019-08-11T22:00:57Z","timestamp":1565560857000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11265-014-0896-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,6,26]]},"references-count":35,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2014,10]]}},"alternative-id":["896"],"URL":"https:\/\/doi.org\/10.1007\/s11265-014-0896-x","relation":{},"ISSN":["1939-8018","1939-8115"],"issn-type":[{"value":"1939-8018","type":"print"},{"value":"1939-8115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,6,26]]}}}