{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,24]],"date-time":"2025-06-24T05:44:50Z","timestamp":1750743890694,"version":"3.37.3"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2021,3,1]],"date-time":"2021-03-01T00:00:00Z","timestamp":1614556800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,3,1]],"date-time":"2021-03-01T00:00:00Z","timestamp":1614556800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100011688","name":"Electronic Components and Systems for European Leadership","doi-asserted-by":"publisher","award":["H2020-ECSEL-2017-2-783162"],"award-info":[{"award-number":["H2020-ECSEL-2017-2-783162"]}],"id":[{"id":"10.13039\/501100011688","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Sign Process Syst"],"published-print":{"date-parts":[[2021,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. 
The open source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format to allow zero-copy communication of large tabular data, preventing (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven supported software languages. To deal with the hardware challenges, we present Arrow-specific components, providing easy-to-use, high-performance interfaces to accelerated kernels. The components are combined based on a generic architecture that is specialized according to the application through an extensive infrastructure generation framework that is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.<\/jats:p>","DOI":"10.1007\/s11265-021-01650-6","type":"journal-article","created":{"date-parts":[[2021,3,1]],"date-time":"2021-03-01T08:02:55Z","timestamp":1614585775000},"page":"565-586","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Generating High-Performance FPGA Accelerator Designs for Big Data Analytics with Fletcher and Apache Arrow"],"prefix":"10.1007","volume":"93","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7043-7131","authenticated-orcid":false,"given":"Johan","family":"Peltenburg","sequence":"first","affiliation":[]},{"given":"Jeroen","family":"van Straten","sequence":"additional","affiliation":[]},{"given":"Matthijs","family":"Brobbel","sequence":"additional","affiliation":[]},{"given":"Zaid","family":"Al-Ars","sequence":"additional","affiliation":[]},{"given":"H. Peter","family":"Hofstee","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,3,1]]},"reference":[{"key":"1650_CR1","doi-asserted-by":"publisher","unstructured":"Maas, M., Asanovi\u0107, K., Kubiatowicz, J. (2017). 
Return of the runtimes: Rethinking the language runtime system for the cloud 3.0 era. In Proceedings of the 16th Workshop on Hot Topics in Operating Systems, ser. HotOS \u201917 (pp. 138\u2013143). New York: ACM, DOI https:\/\/doi.org\/10.1145\/3102980.3103003","DOI":"10.1145\/3102980.3103003"},{"key":"1650_CR2","doi-asserted-by":"crossref","unstructured":"Peltenburg, J., van Straten, J., Wijtemans, L., van Leeuwen, L., Al-Ars, Z., Hofstee, P. (2019). Fletcher: A Framework to Efficiently Integrate FPGA Accelerators with Apache Arrow. In 2019 29th International Conference on Field Programmable Logic and Applications (FPL) (pp. 270\u2013277).","DOI":"10.1109\/FPL.2019.00051"},{"key":"1650_CR3","doi-asserted-by":"crossref","unstructured":"Peltenburg, J., van Straten, J., Brobbel, M., Hofstee, H.P., Al-Ars, Z. (2019). Supporting columnar in-memory formats on fpga: The hardware design of fletcher for apache arrow. In Applied Reconfigurable Computing (pp. 32\u201347): Springer International Publishing.","DOI":"10.1007\/978-3-030-17227-5_3"},{"key":"1650_CR4","unstructured":"Delft University of Technology. (2020). vhlib: a vendor-agnostic VHDL IP library. [Online]. Available: https:\/\/github.com\/abs-tudelft\/vhlib."},{"key":"1650_CR5","unstructured":"Delft University of Technology. (2020). Cerata: a Hardware Construction Library written in C++\u200917. [Online]. Available: https:\/\/github.com\/abs-tudelft\/cerata."},{"key":"1650_CR6","unstructured":"Delft University of Technology. (2020). Fletchgen: The Fletcher Design Generator. [Online]. Available: https:\/\/github.com\/abs-tudelft\/fletcher\/tree\/develop\/codegen\/cpp\/fletchgen."},{"key":"1650_CR7","unstructured":"Delft University of Technology. (2020). vhdMMIO: a fully vendor-agnostic tool to build AXI4-lite MMIO infrastructure. [Online]. Available: https:\/\/github.com\/abs-tudelft\/vhdmmio."},{"key":"1650_CR8","unstructured":"Delft University of Technology. (2020). Fletcher platform-specific libraries. 
[Online]. Available: https:\/\/github.com\/abs-tudelft\/fletcher\/tree\/develop\/platforms."},{"key":"1650_CR9","doi-asserted-by":"crossref","unstructured":"Caulfield, A.M., Chung, E.S., Putnam, A., Angepat, H., Fowers, J., Haselman, M., Heil, S., Humphrey, M., Kaur, P., Kim, J., Lo, D., Massengill, T., Ovtcharov, K., Papamichael, M., Woods, L., Lanka, S., Chiou, D., Burger, D. (2016). A cloud-scale acceleration architecture. In 2016 49th Annual IEEE\/ACM international symposium on microarchitecture (MICRO) (pp. 1\u201313).","DOI":"10.1109\/MICRO.2016.7783710"},{"issue":"10","key":"1650_CR10","doi-asserted-by":"publisher","first-page":"1591","DOI":"10.1109\/TCAD.2015.2513673","volume":"35","author":"R Nane","year":"2016","unstructured":"Nane, R., Sima, V., Pilato, C., Choi, J., Fort, B., Canis, A., Chen, Y.T., Hsiao, H., Brown, S., Ferrandi, F., Anderson, J., Bertels, K. (2016). A survey and evaluation of FPGA high-level synthesis tools. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35 (10), 1591\u20131604.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"issue":"5","key":"1650_CR11","doi-asserted-by":"publisher","first-page":"898","DOI":"10.1109\/TCAD.2018.2834439","volume":"38","author":"S Lahti","year":"2019","unstructured":"Lahti, S., Sj\u00f6vall, P., Vanne, J., H\u00e4m\u00e4l\u00e4inen, T.D. (2019). Are We There Yet? A study on the state of high-level synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 38(5), 898\u2013911.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"issue":"2","key":"1650_CR12","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1145\/3282307","volume":"62","author":"JL Hennessy","year":"2019","unstructured":"Hennessy, J.L., & Patterson, D.A. (2019). A new golden age for computer architecture. Communications ACM, 62(2), 48\u201360. [Online]. 
Available: https:\/\/doi.org\/10.1145\/3282307.","journal-title":"Communications ACM"},{"key":"1650_CR13","unstructured":"Truong, L., & Hanrahan, P. (2019). A golden age of hardware description languages: Applying programming language techniques to improve design productivity. In 3rd Summit on advances in programming languages (SNAPL 2019) ser. Leibniz International Proceedings in Informatics (LIPIcs), (Vol. 136 pp. 7:1\u20137:21). Dagstuhl: Schloss Dagstuhl\u2013Leibniz-Zentrum fuer Informatik. [Online]. Available: http:\/\/drops.dagstuhl.de\/opus\/volltexte\/2019\/10550."},{"key":"1650_CR14","doi-asserted-by":"crossref","unstructured":"Peltenburg, J., Hesam, A., Al-Ars, Z. (2017). Pushing big data into accelerators: Can the jvm saturate our hardware?. In High performance computing (pp. 220\u2013236): Springer International Publishing.","DOI":"10.1007\/978-3-319-67630-2_18"},{"key":"1650_CR15","unstructured":"Google Inc. (2020). Protocol buffers. [Online]. Available: https:\/\/developers.google.com\/protocol-buffers."},{"key":"1650_CR16","unstructured":"Google Inc. (2020). Flatbuffers: Memory efficient serialization library. [Online]. Available: https:\/\/github.com\/google\/flatbuffers."},{"key":"1650_CR17","unstructured":"The Apache Software Foundation. (2020). Apache Arrow. [Online]. Available: https:\/\/arrow.apache.org\/."},{"key":"1650_CR18","doi-asserted-by":"crossref","unstructured":"Winterstein, F., Bayliss, S., Constantinides, G.A. (2013). High-level synthesis of dynamic data structures: A case study using Vivado HLS. In 2013 International conference on field-programmable technology (FPT) (pp. 362\u2013365).","DOI":"10.1109\/FPT.2013.6718388"},{"key":"1650_CR19","doi-asserted-by":"crossref","unstructured":"Weisz, G., & Hoe, J.C. (2015). CoRAM++: Supporting data-structure-specific memory interfaces for FPGA computing. In 2015 25th International conference on field programmable logic and applications (FPL) (pp. 
1\u20138).","DOI":"10.1109\/FPL.2015.7294017"},{"key":"1650_CR20","doi-asserted-by":"crossref","unstructured":"Korinth, J., Hofmann, J., Heinz, C., Koch, A. (2019). The tapaSCo Open-Source Toolflow for the automated composition of task-based parallel reconfigurable computing systems. In Applied reconfigurable computing (pp. 214\u2013229): Springer International Publishing.","DOI":"10.1007\/978-3-030-17227-5_16"},{"key":"1650_CR21","doi-asserted-by":"crossref","unstructured":"Koeplinger, D., Feldman, M., Prabhakar, R., Zhang, Y., Hadjis, S., Fiszel, R., Zhao, T., Nardi, L., Pedram, A., Kozyrakis, C., Olukotun, K. (2018). Spatial: A language and compiler for application accelerators. In Proceedings of the 39th ACM SIGPLAN Conference on programming language design and implementation, ser. PLDI 2018 (pp. 296\u2013311). New York: Association for Computing Machinery.","DOI":"10.1145\/3192366.3192379"},{"key":"1650_CR22","doi-asserted-by":"crossref","unstructured":"Castellane, A., & Mesnet, B. (2019). Enabling fast and highly effective fpga design process using the capi snap framework. In High performance computing (pp. 317\u2013329): Springer International Publishing.","DOI":"10.1007\/978-3-030-34356-9_25"},{"key":"1650_CR23","doi-asserted-by":"crossref","unstructured":"Bachrach, J., Vo, H., Richards, B., Lee, Y., Waterman, A., Avi\u017eienis, R., Wawrzynek, J., Asanovi\u0107, K. (2012). Chisel: Constructing hardware in a Scala embedded language. In DAC Design automation conference 2012 (pp. 1212\u20131221).","DOI":"10.1145\/2228360.2228584"},{"issue":"4\/5","key":"1650_CR24","doi-asserted-by":"publisher","first-page":"8:1","DOI":"10.1147\/JRD.2018.2856978","volume":"62","author":"J Stuecheli","year":"2018","unstructured":"Stuecheli, J., Starke, W.J., Irish, J.D., Arimilli, L.B., Dreps, D., Blaner, B., Wollbrink, C., Allison, B. (2018). IBM POWER9 Opens up a new era of acceleration enablement: OpenCAPI. 
IBM Journal of Research and Development, 62(4\/5), 8:1\u20138:8.","journal-title":"IBM Journal of Research and Development"},{"key":"1650_CR25","doi-asserted-by":"crossref","unstructured":"Ellson, J., Gansner, E., Koutsofios, L., North, S.C., Woodhull, G. (2002). Graphviz\u2014 open source graph drawing tools. In Graph Drawing (pp. 483\u2013484). Berlin: Springer.","DOI":"10.1007\/3-540-45848-4_57"},{"key":"1650_CR26","doi-asserted-by":"crossref","unstructured":"van Dam, L., Peltenburg, J., Al-Ars, Z., Hofstee, H.P. (2019). An accelerator for posit arithmetic targeting posit level 1 blas routines and pair-hmm. In Proceedings of the conference for next generation arithmetic 2019, ser. CoNGA\u201919. [Online]. Available: https:\/\/doi.org\/10.1145\/3316279.3316284. New York: Association for Computing Machinery.","DOI":"10.1145\/3316279.3316284"},{"key":"1650_CR27","doi-asserted-by":"crossref","unstructured":"Peltenburg, J., Van Leeuwen, L.T., Hoozemans, J., Fang, J., Al-Ars, Z., Hofstee, H. P. (2020). Battling the CPU bottleneck in apache parquet to arrow conversion using FPGA. In 2020 international conference on Field-Programmable technology (ICFPT).","DOI":"10.1109\/ICFPT51103.2020.00048"},{"key":"1650_CR28","doi-asserted-by":"crossref","unstructured":"Schuiki, F., Kurth, A., Grosser, T., Benini, L. (2020). LLHD: A multi-level intermediate representation for hardware description languages.","DOI":"10.1145\/3395654"},{"key":"1650_CR29","unstructured":"Evans, J. (2006). scalable concurrent malloc (3) implementation for FreeBSD. In Proceedings of the BSDCan Conference, Ottawa, Canada."},{"key":"1650_CR30","doi-asserted-by":"publisher","unstructured":"Al-Ars, Z., Basten, T., de Beer, A., Geilen, M., Goswami, D., J\u00e4\u00e4skel\u00e4inen, P., Kadlec, J., de Alejandro, M.M., Palumbo, F., Peeren, G., et al. (2019). The FitOptiVis ECSEL Project: Highly efficient distributed embedded image\/video processing in cyber-physical systems. 
In Proceedings of the 16th ACM international conference on computing frontiers, ser. CF \u201919 (pp. 333\u2013338). New York: Association for Computing Machinery. [Online]. Available: https:\/\/doi.org\/10.1145\/3310273.3323437","DOI":"10.1145\/3310273.3323437"}],"container-title":["Journal of Signal Processing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01650-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11265-021-01650-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01650-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,19]],"date-time":"2021-05-19T05:04:51Z","timestamp":1621400691000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11265-021-01650-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,1]]},"references-count":30,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,5]]}},"alternative-id":["1650"],"URL":"https:\/\/doi.org\/10.1007\/s11265-021-01650-6","relation":{},"ISSN":["1939-8018","1939-8115"],"issn-type":[{"type":"print","value":"1939-8018"},{"type":"electronic","value":"1939-8115"}],"subject":[],"published":{"date-parts":[[2021,3,1]]},"assertion":[{"value":"2 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 January 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 February 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 March 
2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}