{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T08:05:06Z","timestamp":1770710706077,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":46,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,11,17]],"date-time":"2019-11-17T00:00:00Z","timestamp":1573948800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2107YFB0202105, 2016YFB0200803, 2017YFB0202302"],"award-info":[{"award-number":["2107YFB0202105, 2016YFB0200803, 2017YFB0202302"]}]},{"name":"National Natural Science Foundation of China","award":["61602443, 61432018, 61521092, 61502450"],"award-info":[{"award-number":["61602443, 61432018, 61521092, 61502450"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,11,17]]},"DOI":"10.1145\/3295500.3356138","type":"proceedings-article","created":{"date-parts":[[2019,11,7]],"date-time":"2019-11-07T19:43:22Z","timestamp":1573155802000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["AutoFFT"],"prefix":"10.1145","author":[{"given":"Zhihao","family":"Li","sequence":"first","affiliation":[{"name":"University of Chinese Academy of Sciences"}]},{"given":"Haipeng","family":"Jia","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences"}]},{"given":"Yunquan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences"}]},{"given":"Tun","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences"}]},{"given":"Liang","family":"Yuan","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences"}]},{"given":"Luning","family":"Cao","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences"}]},{"given":"Xiao","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences"}]}],"member":"320","published-online":{"date-parts":[[2019,11,17]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"UHFFT: A high performance DFT framework.","author":"Ali Ayaz","year":"2006","unstructured":"Ayaz Ali and Lennart Johnsson . 2006 . UHFFT: A high performance DFT framework. (2006). Ayaz Ali and Lennart Johnsson. 2006. UHFFT: A high performance DFT framework. (2006)."},{"key":"e_1_3_2_1_2_1","volume-title":"AOCL: AMD Optimizing CPU Libraries. https:\/\/developer.amd.com\/wp-content\/resources\/AMDCPULibrariesUserGuide_1.0.pdf.","author":"AMD.","year":"2019","unstructured":"AMD. 2019 . AOCL: AMD Optimizing CPU Libraries. https:\/\/developer.amd.com\/wp-content\/resources\/AMDCPULibrariesUserGuide_1.0.pdf. AMD. 2019. AOCL: AMD Optimizing CPU Libraries. https:\/\/developer.amd.com\/wp-content\/resources\/AMDCPULibrariesUserGuide_1.0.pdf."},{"key":"e_1_3_2_1_3_1","unstructured":"AMD. 2019. A software library containing FFT functions written in OpenCL. https:\/\/github.com\/clMathLibraries\/clFFT.  AMD. 2019. A software library containing FFT functions written in OpenCL. https:\/\/github.com\/clMathLibraries\/clFFT."},{"key":"e_1_3_2_1_4_1","unstructured":"Apple. 2019. The Apple Accelerate libraries - vDSP. https:\/\/developer.apple.com\/documentation\/accelerate\/vdsp\/fast_fourier_transforms.  Apple. 2019. The Apple Accelerate libraries - vDSP. https:\/\/developer.apple.com\/documentation\/accelerate\/vdsp\/fast_fourier_transforms."},{"key":"e_1_3_2_1_5_1","volume-title":"ARM Ne10 project. https:\/\/github.com\/projectNe10\/Ne10","author":"ARM.","unstructured":"ARM. 2019. ARM Ne10 project. https:\/\/github.com\/projectNe10\/Ne10 . ARM. 2019. ARM Ne10 project. https:\/\/github.com\/projectNe10\/Ne10."},{"key":"e_1_3_2_1_6_1","unstructured":"ARM. 2019. Arm Performance Libraries (ARMPL) 19.2.0. https:\/\/static.docs.arm.com\/101004\/1920\/arm_performance_libraries_reference_101004_1920_00_en.pdf.  ARM. 2019. Arm Performance Libraries (ARMPL) 19.2.0. https:\/\/static.docs.arm.com\/101004\/1920\/arm_performance_libraries_reference_101004_1920_00_en.pdf."},{"key":"e_1_3_2_1_7_1","volume-title":"Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, et al.","author":"Asanovic Krste","year":"2006","unstructured":"Krste Asanovic , Ras Bodik , Bryan Christopher Catanzaro , Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, et al. 2006 . The landscape of parallel computing research: A view from berkeley. Technical Report. Technical Report UCB\/EECS-2006-183, EECS Department, University of .... Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, et al. 2006. The landscape of parallel computing research: A view from berkeley. Technical Report. Technical Report UCB\/EECS-2006-183, EECS Department, University of ...."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11265-014-0889-9"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAU.1970.1162132"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1978.1163036"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126919"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TE.1969.4320436"},{"key":"e_1_3_2_1_13_1","volume-title":"An algorithm for the machine calculation of complex Fourier series. Mathematics of computation 19, 90","author":"Cooley James W","year":"1965","unstructured":"James W Cooley and John W Tukey . 1965. An algorithm for the machine calculation of complex Fourier series. Mathematics of computation 19, 90 ( 1965 ), 297--301. James W Cooley and John W Tukey. 1965. An algorithm for the machine calculation of complex Fourier series. Mathematics of computation 19, 90 (1965), 297--301."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2018.07.034"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2038037.1941589"},{"key":"e_1_3_2_1_16_1","volume-title":"Split radix'FFT algorithm. Electronics letters 20, 1","author":"Duhamel Pierre","year":"1984","unstructured":"Pierre Duhamel and Henk Hollmann . 1984. Split radix'FFT algorithm. Electronics letters 20, 1 ( 1984 ), 14--16. Pierre Duhamel and Henk Hollmann. 1984. Split radix'FFT algorithm. Electronics letters 20, 1 (1984), 14--16."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2018.2873289"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2009.934155"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1064978.1065048"},{"key":"e_1_3_2_1_20_1","unstructured":"M Frigo and SG Johnson. 2019. benchFFT. http:\/\/www.ffttw.org\/benchfft.  M Frigo and SG Johnson. 2019. benchFFT. http:\/\/www.ffttw.org\/benchfft."},{"key":"e_1_3_2_1_21_1","unstructured":"M Frigo and SG Johnson. 2019. The benchmarking methodology of benchFFT. http:\/\/www.fftw.org\/speed\/.  M Frigo and SG Johnson. 2019. The benchmarking methodology of benchFFT. http:\/\/www.fftw.org\/speed\/."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1998.681704"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2004.840301"},{"key":"e_1_3_2_1_25_1","volume-title":"AccFFT: A library for distributed-memory FFT on CPU and GPU architectures. CoRR abs\/1506.07933","author":"Gholami Amir","year":"2015","unstructured":"Amir Gholami , Judith Hill , Dhairya Malhotra , and George Biros . 2015. AccFFT: A library for distributed-memory FFT on CPU and GPU architectures. CoRR abs\/1506.07933 ( 2015 ). arXiv:1506.07933 http:\/\/arxiv.org\/abs\/1506.07933 Amir Gholami, Judith Hill, Dhairya Malhotra, and George Biros. 2015. AccFFT: A library for distributed-memory FFT on CPU and GPU architectures. CoRR abs\/1506.07933 (2015). arXiv:1506.07933 http:\/\/arxiv.org\/abs\/1506.07933"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.2478\/s13540-013-0041-8"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-014-1123-z"},{"key":"e_1_3_2_1_28_1","volume-title":"ESSL: IBM Engineering and Scientific Subroutine Library. https:\/\/www.ibm.com\/support\/knowledgecenter\/en\/SSFHY8_6.1\/navigation\/welcome.html.","author":"IBM.","year":"2019","unstructured":"IBM. 2019 . ESSL: IBM Engineering and Scientific Subroutine Library. https:\/\/www.ibm.com\/support\/knowledgecenter\/en\/SSFHY8_6.1\/navigation\/welcome.html. IBM. 2019. ESSL: IBM Engineering and Scientific Subroutine Library. https:\/\/www.ibm.com\/support\/knowledgecenter\/en\/SSFHY8_6.1\/navigation\/welcome.html."},{"key":"e_1_3_2_1_29_1","unstructured":"Intel. 2016. Intel 64 and IA-32 architectures optimization reference manual (Chapter 2.1). https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/manuals\/64-ia-32-architectures-optimization-manual.pdf.  Intel. 2016. Intel 64 and IA-32 architectures optimization reference manual (Chapter 2.1). https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/manuals\/64-ia-32-architectures-optimization-manual.pdf."},{"key":"e_1_3_2_1_30_1","unstructured":"Intel. 2019. Intel Math Kernel Library Developer Reference's Appendix C: FFTW Interface to Intel Math Kernel Library. https:\/\/software.intel.com\/sites\/default\/files\/mkl-2019-developer-reference-c_2.pdf.  Intel. 2019. Intel Math Kernel Library Developer Reference's Appendix C: FFTW Interface to Intel Math Kernel Library. https:\/\/software.intel.com\/sites\/default\/files\/mkl-2019-developer-reference-c_2.pdf."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1977.1162973"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-013-1314-8"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2018.10.012"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/335231.335252"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2159430.2159437"},{"key":"e_1_3_2_1_36_1","unstructured":"Nvidia. 2019. CUFFT library. https:\/\/docs.nvidia.com\/pdf\/CUFFT_Library.pdf.  Nvidia. 2019. CUFFT library. https:\/\/docs.nvidia.com\/pdf\/CUFFT_Library.pdf."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2909437.2909451"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2018.00048"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2004.840306"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1976.1162805"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/PROC.1968.6477"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1464182.1464209"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-12-592101-5.50007-5"},{"key":"e_1_3_2_1_44_1","volume-title":"FFTE: A Fast Fourier Transform Package","author":"Takahashi Daisuke","year":"2014","unstructured":"Daisuke Takahashi . 2014 . FFTE: A Fast Fourier Transform Package . http:\/\/www.ffte.jp\/. Daisuke Takahashi. 2014. FFTE: A Fast Fourier Transform Package. http:\/\/www.ffte.jp\/."},{"key":"e_1_3_2_1_45_1","volume-title":"High-Performance Computing on the Intel\u00ae Xeon Phi\u2122","author":"Wang Endong","unstructured":"Endong Wang , Qing Zhang , Bo Shen , Guangyong Zhang , Xiaowei Lu , Qing Wu , and Yajuan Wang . 2014. Intel math kernel library . In High-Performance Computing on the Intel\u00ae Xeon Phi\u2122 . Springer , 167--188. Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. 2014. Intel math kernel library. In High-Performance Computing on the Intel\u00ae Xeon Phi\u2122. Springer, 167--188."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/378795.378860"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijleo.2018.10.083"}],"event":{"name":"SC '19: The International Conference for High Performance Computing, Networking, Storage, and Analysis","location":"Denver Colorado","acronym":"SC '19","sponsor":["SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing","IEEE CS"]},"container-title":["Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3295500.3356138","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3295500.3356138","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:13Z","timestamp":1750208533000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3295500.3356138"}},"subtitle":["a template-based FFT codes auto-generation framework for ARM and X86 CPUs"],"short-title":[],"issued":{"date-parts":[[2019,11,17]]},"references-count":46,"alternative-id":["10.1145\/3295500.3356138","10.1145\/3295500"],"URL":"https:\/\/doi.org\/10.1145\/3295500.3356138","relation":{},"subject":[],"published":{"date-parts":[[2019,11,17]]},"assertion":[{"value":"2019-11-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}