{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T16:50:08Z","timestamp":1765039808259,"version":"build-2065373602"},"reference-count":49,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,5,12]],"date-time":"2025-05-12T00:00:00Z","timestamp":1747008000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100005187","name":"National and Kapodistrian University of Athens","doi-asserted-by":"publisher","award":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"],"award-info":[{"award-number":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"]}],"id":[{"id":"10.13039\/501100005187","id-type":"DOI","asserted-by":"publisher"}]},{"name":"GREEK CUBESATS IN-ORBIT VALIDATION PROJECTS","award":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"],"award-info":[{"award-number":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"]}]},{"name":"ERMIS\u2013Hellenic CubeSat Demonstration Mission","award":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"],"award-info":[{"award-number":["4000140732\/23\/NL\/ND","ESA AO\/1-11498\/22\/UK\/ND"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Software"],"abstract":"<jats:p>Satellite and edge computing designers develop algorithms that restrict resource utilization and execution time. Among these design efforts, optimizing Fast Fourier Transform (FFT), key to many tasks, has led mainly to in-place FFT-specific hardware accelerators. Aiming at improving the FFT performance on processors and computing devices with limited resources, the current paper enhances the efficiency of the radix-2 FFT by exploring the benefits of an in-place technique. 
First, we present the advantages of organizing the single memory bank of processors to store two (2) FFT elements in each memory address and provide parallel load and store of each FFT pair of data. Second, we optimize the floating point (FP) and block floating point (BFP) configurations to improve the FFT Signal-to-Noise (SNR) performance and the resource utilization. The resulting techniques reduce the memory requirements by two and significantly improve the time performance for the overall prevailing BFP representation. The execution of inputs ranging from 1K to 16K FFT points, using 8-bit or 16-bit as FP or BFP numbers, on the space-proven Atmel AVR32 and Vision Processing Unit (VPU) Intel Movidius Myriad 2, the edge device Raspberry Pi Zero 2W and a low-cost accelerator on Xilinx Zynq 7000 Field Programmable Gate Array (FPGA), validates the method\u2019s performance improvement.<\/jats:p>","DOI":"10.3390\/software4020011","type":"journal-article","created":{"date-parts":[[2025,5,12]],"date-time":"2025-05-12T10:58:07Z","timestamp":1747047487000},"page":"11","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Improving the Fast Fourier Transform for Space and Edge Computing Applications with an Efficient In-Place Method"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-1470-0215","authenticated-orcid":false,"given":"Christoforos","family":"Vasilakis","sequence":"first","affiliation":[{"name":"Electronics Lab, Physics Department, National & Kapodistrian University of Athens, 157 84 Athens, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7124-1203","authenticated-orcid":false,"given":"Alexandros","family":"Tsagkaropoulos","sequence":"additional","affiliation":[{"name":"Electronics Lab, Physics Department, National & Kapodistrian University of Athens, 157 84 Athens, 
Greece"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7837-4362","authenticated-orcid":false,"given":"Ioannis","family":"Koutoulas","sequence":"additional","affiliation":[{"name":"Electronics Lab, Physics Department, National & Kapodistrian University of Athens, 157 84 Athens, Greece"}]},{"given":"Dionysios","family":"Reisis","sequence":"additional","affiliation":[{"name":"Electronics Lab, Physics Department, National & Kapodistrian University of Athens, 157 84 Athens, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,12]]},"reference":[{"key":"ref_1","unstructured":"Kacker, S., Chien, S., Devaraj, K., Demir, I., and Cahoy, K. (2025, January 28\u201330). Leveraging Realtime Meteorological Data for Dynamic Tasking of Agile Earth-Observing Satellites. Proceedings of the 14th International Workshop on Planning and Scheduling for Space, Toulouse, France."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1016\/j.ifacol.2018.07.147","article-title":"Impact of Edge Computing Paradigm on Energy Consumption in IoT","volume":"51","author":"Mocnej","year":"2018","journal-title":"IFAC-PapersOnLine"},{"key":"ref_3","unstructured":"ECSS Secretariat (2011). Space Engineering Structural Materials Handbook\u2014Part 8: Glossary, ESA Requirements and Standards Division."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Sanjeet, S., Sahoo, B.D., and Parhi, K.K. (2021, January 9\u201311). Comparison of Real-Valued FFT Architectures for Low-Throughput Applications using FPGA. Proceedings of the 2021 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), Lansing, MI, USA.","DOI":"10.1109\/MWSCAS47672.2021.9531878"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1593","DOI":"10.1007\/s11265-018-1387-2","article-title":"Parallel memory accessing for FFT architectures","volume":"90","author":"Kitsakis","year":"2018","journal-title":"J. Signal Process. 
Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1146","DOI":"10.1109\/TCSI.2015.2402935","article-title":"Building conflict-free FFT schedules","volume":"62","author":"Richardson","year":"2015","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1109\/78.747802","article-title":"An effective memory addressing scheme for FFT processors","volume":"47","author":"Ma","year":"1999","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_8","first-page":"876","article-title":"A novel memory-based FFT architecture for real-valued signals based on a radix-2 decimation-in-frequency algorithm","volume":"62","author":"Ma","year":"2015","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_9","unstructured":"Li, B., Cheng, S., and Lin, J. (2021). tcFFT: Accelerating half-precision FFT through tensor cores. arXiv."},{"key":"ref_10","first-page":"26","article-title":"A generalized mixed-radix algorithm for memory-based FFT processors","volume":"57","author":"Hsiao","year":"2010","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_11","unstructured":"Micikevicius, P., Stosic, D., Burgess, N., Cornea, M., Dubey, P., Grisenthwaite, R., Ha, S., Heinecke, A., Judd, P., and Kamalu, J. (2022). FP8 formats for deep learning. arXiv."},{"key":"ref_12","unstructured":"Kuzmin, A., Baalen, M.V., Ren, Y., Nagel, M., Peters, J., and Blankevoort, T. (2024). FP8 quantization: The power of the exponent. arXiv."},{"key":"ref_13","unstructured":"(2024, October 26). Atmel 32-bit AVR Microcontroller AT32UC3C. Available online: https:\/\/ww1.microchip.com\/downloads\/en\/DeviceDoc\/doc32117.pdf."},{"key":"ref_14","unstructured":"UoA (2024, November 07). ERMIS Project Press Release. Available online: https:\/\/hub.uoa.gr\/en\/ermis-project\/."},{"key":"ref_15","unstructured":"The European Space Agency (2024, October 26). 
ESA Backs Greek Firms\u2019 and Universities\u2019 CubeSats. Available online: https:\/\/www.esa.int\/Applications\/Connectivity_and_Secure_Communications\/ESA_backs_Greek_firms_and_universities_CubeSats."},{"key":"ref_16","unstructured":"Intel (2024, October 28). Intel\u00ae Movidius\u2122 Myriad\u2122 2 Vision Processing Unit. Available online: https:\/\/www.intel.com\/content\/www\/us\/en\/products\/sku\/122461\/intel-movidius-myriad-2-vision-processing-unit-4gb\/specifications.html."},{"key":"ref_17","unstructured":"Navarro, J.E., Samuelsson, A., Gingsj\u00f6, H., Barendt, J., Dunne, A., Reisis, D., Kyriakos, A., Papatheofanous, E.A., Bezaitis, C., and Matthijs, P. (2021, January 14\u201317). High-performance compute board\u2014A Fault-Tolerant Module for On-Board Vision Processing. Proceedings of the OBDP2021\u20142nd European Workshop on On-Board Data Processing (OBDP2021), Noordwijk, The Netherlands."},{"key":"ref_18","unstructured":"Raspberry (2024, October 26). Raspberry Pi Zero 2 W Product Brief. Available online: https:\/\/datasheets.raspberrypi.com\/rpizero2\/raspberry-pi-zero-2-w-product-brief.pdf."},{"key":"ref_19","unstructured":"(2019). IEEE Standard for VHDL Language Reference Manual (Standard No. IEEE Std 1076-2019)."},{"key":"ref_20","unstructured":"AMD (2024, November 03). Zynq-7000 SoC Data Sheet, Containing the XC7Z007S, Available online: https:\/\/docs.amd.com\/v\/u\/en-US\/ds190-Zynq-7000-Overview."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Nath, V., and Mandal, J. (2021). FPGA Implementation of Radix-4-Based Two-Dimensional FFT with and Without Pipelining Using Efficient Data Reordering Scheme. Nanoelectronics, Circuits and Communication Systems, Springer Nature.","DOI":"10.1007\/978-981-15-7486-3"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Heo, J., Jung, Y., Lee, S., and Jung, Y. (2021). FPGA Implementation of an Efficient FFT Processor for FMCW Radar Signal Processing. 
Sensors, 21.","DOI":"10.3390\/s21196443"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Schulz, P., and Sleahtitchi, G. (2023, January 7\u20139). FPGA-based Accelerator for FFT-Processing in Edge Computing. Proceedings of the 2023 IEEE 12th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Dortmund, Germany.","DOI":"10.1109\/IDAACS58523.2023.10348654"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1038","DOI":"10.1109\/TPDS.2021.3101764","article-title":"A Low-Power Transprecision Floating Point Cluster for Efficient Near-Sensor Data Analytics","volume":"33","author":"Montagna","year":"2022","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_25","unstructured":"IEEE (2019). IEEE Std 754-2019 (Revision of IEEE 754-2008), IEEE. IEEE Standard for Floating Point Arithmetic."},{"key":"ref_26","unstructured":"(2024, October 31). Float8\u2014Float32 Converter Source Code. Available online: https:\/\/github.com\/CVasilakis\/float8-float32-converter."},{"key":"ref_27","unstructured":"ARM (2024, November 07). Arm Compiler for Embedded Reference Guide\u2014Half-Precision Floating Point Number Format. Available online: https:\/\/developer.arm.com\/documentation\/101754\/0623\/armclang-Reference\/Other-Compiler-specific-Features\/Half-precision-floating-point-number-format."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Padua, D. (2011). Encyclopedia of Parallel Computing, Springer US.","DOI":"10.1007\/978-0-387-09766-4"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Lapsley, P., Bier, J., Shoham, A., and Lee, E.A. (1997). 
DSP Processor Fundamentals: Architectures and Features, IEEE Press.","DOI":"10.1109\/9780470544433"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1023\/A:1008110410087","article-title":"A Hierarchical Block-Floating Point Arithmetic","volume":"24","author":"Kobayashi","year":"2000","journal-title":"J. VLSI Signal Process. Syst. Signal, Image Video Technol."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Kobayashi, S., and Fettweis, G.P. (1999, January 15\u201319). A New Approach for Block-Floating Point Arithmetic. Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, USA.","DOI":"10.1109\/ICASSP.1999.758322"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1109\/TASSP.1979.1163314","article-title":"A Simple Fixed-Point Error Bound for the Fast Fourier Transform","volume":"27","author":"Knight","year":"1979","journal-title":"IEEE Trans. Acoust. Speech, Signal Process."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/ASP\/2006\/96421","article-title":"Floating-to-Fixed-Point Conversion for Digital Signal Processors","volume":"2006","author":"Menard","year":"2006","journal-title":"EURASIP J. Adv. Signal Process."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Nussbaumer, H.J. (1982). Fast Fourier Transform and Convolution Algorithms, Springer.","DOI":"10.1007\/978-3-642-81897-4"},{"key":"ref_35","unstructured":"(2025, February 20). NanoMind A3200 Datasheet\u2014On-Board Computer System for Mission Critical Space Applications. Available online: https:\/\/gomspace.com\/UserFiles\/Subsystems\/datasheet\/gs-ds-nanomind-a3200_1006901-117.pdf."},{"key":"ref_36","unstructured":"(2024, October 26). FreeRTOS\u2014Real-Time Operating System for Microcontrollers and Small Microprocessors. Available online: https:\/\/freertos.org\/."},{"key":"ref_37","unstructured":"RTEMS\u2122 (2024, October 28). 
RTEMS Open Source Real Time Operating System Project. Available online: https:\/\/www.rtems.org\/."},{"key":"ref_38","unstructured":"Raspberry (2024, November 07). Raspberry Pi OS. Available online: https:\/\/www.raspberrypi.com\/software\/."},{"key":"ref_39","unstructured":"AMD (2024, October 28). AMD Zynq\u2122 7000 SoCs Product Brief. Available online: https:\/\/www.amd.com\/content\/dam\/amd\/en\/documents\/products\/adaptive-socs-and-fpgas\/soc\/zynq-7000-product-brief.pdf."},{"key":"ref_40","unstructured":"Digilent (2024, October 28). FPGA Development Board Cora Z7-07S. Available online: https:\/\/digilent.com\/reference\/programmable-logic\/cora-z7\/start?srsltid=AfmBOooTIa7Iy8mYnVjX-pl8uRyraS4DIbmQ7nr-0nub8q_6dBhD9wjm."},{"key":"ref_41","unstructured":"(2024, October 10). AMD Vivado\u2122 Design Suite. Available online: https:\/\/www.amd.com\/en\/products\/software\/adaptive-socs-and-fpgas\/vivado.html."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"60516","DOI":"10.1109\/ACCESS.2021.3074070","article-title":"A Survey of Energy Consumption Measurement in Embedded Systems","volume":"9","author":"Guo","year":"2021","journal-title":"IEEE Access"},{"key":"ref_44","unstructured":"(2025, February 20). Jetson TX1 Module. Available online: https:\/\/developer.nvidia.com\/embedded\/jetson-tx1."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Dunham, M.E., Baker, Z., Stettler, M., Pigue, M., Graham, P., Schmierer, E.N., and Power, J. (2009, January 9\u201311). High Efficiency Space-Based Software Radio Architectures: A Minimum Size, Weight, and Power TeraOps Processor. 
Proceedings of the 2009 International Conference on Reconfigurable Computing and FPGAs, Cancun, Mexico.","DOI":"10.1109\/ReConFig.2009.42"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"034002","DOI":"10.1117\/1.JATIS.9.3.034002","article-title":"Wideband digital multi-channel merge-split fast Fourier transform spectrometer: Design and characterization","volume":"9","author":"Sharma","year":"2023","journal-title":"J. Astron. Telesc. Instrum. Syst."},{"key":"ref_47","unstructured":"Frigo, M., and Johnson, S. (1998, January 15). FFTW: An adaptive software architecture for the FFT. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP \u201998 (Cat. No.98CH36181), Seattle, WA, USA."},{"key":"ref_48","first-page":"100034","article-title":"FourierPIM: High-throughput in-memory Fast Fourier Transform and polynomial multiplication","volume":"4","author":"Leitersdorf","year":"2023","journal-title":"Mem. Mater. Devices, Circuits Syst."},{"key":"ref_49","unstructured":"Elam, D., and Iovescu, C. (2003). A Block Floating Point Implementation for an N-Point FFT on the TMS320C55x DSP, Texas Instruments."}],"container-title":["Software"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2674-113X\/4\/2\/11\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:31:24Z","timestamp":1760031084000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2674-113X\/4\/2\/11"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,12]]},"references-count":49,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["software4020011"],"URL":"https:\/\/doi.org\/10.3390\/software4020011","relation":{},"ISSN":["2674-113X"],"issn-type":[{"type":"electronic","value":"2674-113X"}],"subject":[],"published":{"date-parts":[[2025,5,12]]}}}