{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T03:52:28Z","timestamp":1780372348048,"version":"3.54.1"},"reference-count":78,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2021,5,1]],"date-time":"2021-05-01T00:00:00Z","timestamp":1619827200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,5,2]],"date-time":"2021-05-02T00:00:00Z","timestamp":1619913600000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Sign Process Syst"],"published-print":{"date-parts":[[2021,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The integration of FPGA-based accelerators into a complete heterogeneous system is a challenging task faced by many researchers and engineers, especially now that FPGAs enjoy increasing popularity as implementation platforms for efficient, application-specific accelerators for domains such as signal processing, machine learning and intelligent storage. To lighten the burden of system integration from the developers of accelerators, the open-source <jats:italic>TaPaSCo<\/jats:italic> framework presented in this work provides an automated toolflow for the construction of heterogeneous many-core architectures from custom processing elements, and a simple, uniform programming interface to utilize spatially distributed, parallel computation on FPGAs. TaPaSCo aims to increase the <jats:italic>scalability<\/jats:italic> and <jats:italic>portability<\/jats:italic> of FPGA designs through automated <jats:italic>design space exploration<\/jats:italic>, greatly simplifying the scaling of hardware designs and facilitating iterative growth and portability across FPGA devices and families. This work describes TaPaSCo with its primary design abstractions and shows how TaPaSCo addresses portability and extensibility of FPGA hardware designs for systems-on-chip. A study of successful projects using TaPaSCo shows its versatility and can serve as inspiration and reference for future users, with more details on the usage of TaPaSCo presented in an in-depth case study and a short overview of the workflow.<\/jats:p>","DOI":"10.1007\/s11265-021-01640-8","type":"journal-article","created":{"date-parts":[[2021,5,3]],"date-time":"2021-05-03T15:28:03Z","timestamp":1620055683000},"page":"545-563","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":37,"title":["The TaPaSCo Open-Source Toolflow"],"prefix":"10.1007","volume":"93","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5927-4426","authenticated-orcid":false,"given":"Carsten","family":"Heinz","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jaco","family":"Hofmann","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jens","family":"Korinth","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lukas","family":"Sommer","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lukas","family":"Weber","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Andreas","family":"Koch","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,5,2]]},"reference":[{"key":"1640_CR1","doi-asserted-by":"crossref","unstructured":"Adler, M., Fleming, K.E., Parashar, A., Pellauer, M., & Emer, J. (2011). Leap scratchpads: automatic memory and cache management for reconfigurable logic. In Proceedings of the 19th ACM\/SIGDA international symposium on Field programmable gate arrays (pp. 25\u201328).","DOI":"10.1145\/1950413.1950421"},{"key":"1640_CR2","doi-asserted-by":"crossref","unstructured":"Aldinucci, M. (2017). FastFlow: high-level and efficient streaming on multi-core. Programming multi-core and many-core computing systems.","DOI":"10.1002\/9781119332015.ch13"},{"issue":"4","key":"1640_CR3","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1109\/MM.2020.2996616","volume":"40","author":"A Amid","year":"2020","unstructured":"Amid, A., Biancolin, D., Gonzalez, A., Grubb, D., Karandikar, S., Liew, H., Magyar, A., Mao, H., Ou, A., Pemberton, N., Rigge, P., Schmidt, C., Wright, J., Zhao, J., Shao, Y.S., Asanovi\u0107, K., & Nikoli\u0107, B. (2020). Chipyard: integrated design, simulation, and implementation framework for custom SoCs. IEEE Micro, 40(4), 10\u201321. https:\/\/doi.org\/10.1109\/MM.2020.2996616.","journal-title":"IEEE Micro"},{"key":"1640_CR4","unstructured":"Asanovi\u0107, K., Avizienis, R., Bachrach, J., Beamer, S., Biancolin, D., Celio, C., Cook, H., Dabbelt, D., Hauser, J., Izraelevitz, A., Karandikar, S., Keller, B., Kim, D., Koenig, J., Lee, Y., Love, E., Maas, M., Magyar, A., Mao, H., Moreto, M., Ou, A., Patterson, D.A., Richards, B., Schmidt, C., Twigg, S., Vo, H., & Waterman, A. (2016). The rocket chip generator. Tech. rep. UCB\/EECS-2016-17, EECS Department, University of California, Berkeley."},{"key":"1640_CR5","doi-asserted-by":"crossref","unstructured":"Baaij, C. (2010). Clash: Structural descriptions of synchronous hardware using haskell. In 2010 13th Euromicro conf. on digital system design: architectures, methods and tools (DSD).","DOI":"10.1109\/DSD.2010.21"},{"key":"1640_CR6","doi-asserted-by":"crossref","unstructured":"Bachrach, J., Vo, H., Richards, B., Lee, Y., Waterman, A., Avizienis, R., Wawrzynek, J., & Asanovic, K. (2012). Chisel: Constructing hardware in a Scala embedded language. In Proc. DAC 2012.","DOI":"10.1145\/2228360.2228584"},{"key":"1640_CR7","unstructured":"BlueSpec Inc. (2003). BlueSpec SystemVerilog, http:\/\/bluespec.com\/technology\/ (2003). acc: 05\/16\/2018."},{"key":"1640_CR8","unstructured":"Brugnoni, S., Corbat, T., Sommerlad, P., Suter, T., Korinth, J., Chevallerie, D. de la, & Koch, A. (2016). Automated generation of reconfigurable systems-on-chip by interactive code transformations for high-level synthesis. In Third International Workshop on FPGAs Software Programmers (FSP)."},{"issue":"4","key":"1640_CR9","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1109\/2.839323","volume":"33","author":"TJ Callahan","year":"2000","unstructured":"Callahan, T.J., Hauser, J.R., & Wawrzynek, J. (2000). The Garp architecture and C compiler. Computer, 33(4), 62\u201369.","journal-title":"Computer"},{"key":"1640_CR10","doi-asserted-by":"crossref","unstructured":"Canis, A. (2011). LegUp: high-level synthesis for FPGA-based processor\/accelerator systems. In Proc. of the 19th ACM\/SIGDA int. symp. on field programmable gate arrays.","DOI":"10.1145\/1950413.1950423"},{"key":"1640_CR11","doi-asserted-by":"crossref","unstructured":"Canis, A., Choi, J., Aldham, M., Zhang, V., Kammoona, A., Anderson, J.H., Brown, S., & Czajkowski, T. (2011). LegUp: high-level synthesis for FPGA-based processor\/accelerator systems. In Proceedings of the 19th ACM\/SIGDA international symposium on Field programmable gate arrays, (pp. 33\u201336 ).","DOI":"10.1145\/1950413.1950423"},{"key":"1640_CR12","doi-asserted-by":"crossref","unstructured":"Charles, P. (2005). X10: an object-oriented approach to non-uniform cluster computing.","DOI":"10.1145\/1094811.1094852"},{"key":"1640_CR13","doi-asserted-by":"crossref","unstructured":"Chen, Y-T, Cong, J, & Xiao, B. (2015). Aracompiler: a prototyping flow and evaluation framework for accelerator-rich architectures. In 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (pp. 157\u2013158 ).","DOI":"10.1109\/ISPASS.2015.7095795"},{"key":"1640_CR14","doi-asserted-by":"crossref","unstructured":"De La Chevallerie, D., Korinth, J., & Koch, A. (2014). Integrating FPGA-based processing elements into a runtime for parallel heterogeneous computing. In 2014 International conference on field-programmable technology (FPT) (pp. 314\u2013317).","DOI":"10.1109\/FPT.2014.7082807"},{"key":"1640_CR15","unstructured":"Digilent Inc. (2015). ZedBoard, http:\/\/zedboard.org\/product\/zedboard. acc: 05\/16\/2018."},{"issue":"07","key":"1640_CR16","doi-asserted-by":"publisher","first-page":"P07027","DOI":"10.1088\/1748-0221\/13\/07\/p07027","volume":"13","author":"J Duarte","year":"2018","unstructured":"Duarte, J., Han, S., Harris, P., Jindariani, S., Kreinar, E., Kreis, B., Ngadiuba, J., Pierini, M., Rivera, R., Tran, N., & Wu, Z. (2018). Fast inference of deep neural networks in FPGAs for particle physics. Journal of Instrumentation, 13(07), P07027\u2013P07027. https:\/\/doi.org\/10.1088\/1748-0221\/13\/07\/p07027.","journal-title":"Journal of Instrumentation"},{"key":"1640_CR17","unstructured":"Embedded Systems and Applications Group. (2020). TU Darmstadt: TaPaSCo Contributor\u2019s Guide. https:\/\/github.com\/esa-tu-darmstadt\/tapasco\/wiki\/Contributor's-Guide."},{"key":"1640_CR18","unstructured":"Embedded Systems and Applications Group. (2020). TU Darmstadt: TaPaSCo Issue Tracker. https:\/\/github.com\/esa-tu-darmstadt\/tapasco\/issues."},{"key":"1640_CR19","doi-asserted-by":"crossref","unstructured":"Fleming, K., & Adler, M. (2016). The LEAP FPGA operating system. In FPGAs for software programmers (pp. 245\u2013258): Springer.","DOI":"10.1007\/978-3-319-26408-0_14"},{"key":"1640_CR20","doi-asserted-by":"crossref","unstructured":"Frigo, J., Gokhale, M., & Lavenier, D. (2001). Evaluation of the streams-C C-to-FPGA compiler: an applications perspective. In Proceedings of the 2001 ACM\/SIGDA ninth international symposium on Field programmable gate arrays (pp. 134\u2013 140 ).","DOI":"10.1145\/360276.360326"},{"key":"1640_CR21","unstructured":"G\u00e4dke, H., & Koch, A. (2007). Comrade-A compiler for adaptive systems. In Design, Automation and Test in Europe (DATE)."},{"issue":"10","key":"1640_CR22","doi-asserted-by":"publisher","first-page":"1517","DOI":"10.1109\/TCAD.2009.2026356","volume":"28","author":"A Gerstlauer","year":"2009","unstructured":"Gerstlauer, A., Haubelt, C., Pimentel, A.D., Stefanov, T.P., Gajski, D.D., & Teich, J. (2009). Electronic system-level synthesis methodologies. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 28(10), 1517\u20131530.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"1640_CR23","doi-asserted-by":"crossref","unstructured":"Gill, A. (2009). Introducing Kansas lava. In Int. Symp. on implementation and application of functional languages.","DOI":"10.1007\/978-3-642-16478-1_2"},{"key":"1640_CR24","doi-asserted-by":"crossref","unstructured":"Gokhale, M., Stone, J., Arnold, J., & Kalinowski, M. (2000). Stream-oriented FPGA computing in the Streams-C high level language. In Proceedings 2000 IEEE symposium on field-programmable custom computing machines (Cat. No. PR00871) (pp. 49\u201356 ).","DOI":"10.1109\/FPGA.2000.903392"},{"key":"1640_CR25","unstructured":"Guo, Z., Buyukkurt, B., Najjar, W., & Vissers, K. (2005). Optimized generation of data-path from C codes for FPGAs. In Design, automation and test in Europe (pp. 112\u2013117 )."},{"key":"1640_CR26","doi-asserted-by":"publisher","unstructured":"Heinz, C., Hofmann, J.A., Sommer, L., & Koch, A. (2020). Improving job launch rates in the TaPaSCo FPGA middleware by hardware\/software-co-design. In 2020 IEEE\/ACM International workshop on runtime and operating systems for supercomputers (ROSS). https:\/\/doi.org\/10.1109\/ROSS51935.2020.00008 (pp. 22\u201330).","DOI":"10.1109\/ROSS51935.2020.00008"},{"key":"1640_CR27","doi-asserted-by":"crossref","unstructured":"Heinz, C., Lavan, Y., Hofmann, J., & Koch, A. (2019). A catalog and in-hardware evaluation of open-source drop-in compatible RISC-V softcore processors. In IEEE Proc. International conference on ReConFigurable computing and FPGAs (ReConFig).","DOI":"10.1109\/ReConFig48160.2019.8994796"},{"key":"1640_CR28","doi-asserted-by":"crossref","unstructured":"Hofmann, J., Korinth, J., & Koch, A. (2016). A scalable high-performance hardware architecture for real-time stereo vision by semi-global matching. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops.","DOI":"10.1109\/CVPRW.2016.110"},{"key":"1640_CR29","doi-asserted-by":"crossref","unstructured":"Hofmann, J., Korinth, J., & Koch, A. (2016). A scalable latency-insensitive architecture for FPGA-accelerated semi-global matching in stereo vision applications. In Proc. Int. Conference on ReConFigurable Computing and FPGAs (ReConFig).","DOI":"10.1109\/ReConFig.2016.7857147"},{"key":"1640_CR30","unstructured":"Hofmann, J., Thostrup, L., Ziegler, T., Binnig, C., & Koch, A. (2019). High-performance in-network data processing. In International workshop on accelerating analytics and data management systems using modern processor and storage architectures, ADMS@VLDB 2019, Los Angeles, United States."},{"key":"1640_CR31","doi-asserted-by":"crossref","unstructured":"Huang, S.S. (2008). Liquid metal: Object-oriented programming across the hardware\/software boundary. In European conference on object-oriented programming.","DOI":"10.1007\/978-3-540-70592-5_5"},{"key":"1640_CR32","doi-asserted-by":"crossref","unstructured":"Huthmann, J., Liebig, B., Oppermann, J., & Koch, A. (2013). Hardware\/software co-compilation with the Nymble system. In 2013 8th International workshop on reconfigurable and communication-centric systems-on-chip (ReCoSoC) (pp. 1\u20138 ).","DOI":"10.1109\/ReCoSoC.2013.6581538"},{"key":"1640_CR33","unstructured":"IEEE Standards Association. (2014). IEEE 1685-2014 - IEEE Standard for IP-XACT, standard structure for packaging, integrating, and reusing IP within tool flows. acc: 05\/16\/2018."},{"key":"1640_CR34","unstructured":"Intel Corporation: Open programmable acceleration engine, https:\/\/opae.github.io\/. Visited on 08\/20\/2020."},{"key":"1640_CR35","unstructured":"Intel Inc. (2016). Intel FPGA SDK for OpenCL, https:\/\/www.altera.com\/products\/designsoftware\/embedded-software-developers\/opencl\/overview.html. acc: 05\/16\/2018."},{"key":"1640_CR36","doi-asserted-by":"crossref","unstructured":"Ismail, A., & Shannon, L. (2011). FUSE: Front-end user framework for O\/S abstraction of hardware accelerators. In 2011 IEEE 19th annual international symposium on field- programmable custom computing machines (pp. 170\u2013177 ).","DOI":"10.1109\/FCCM.2011.48"},{"key":"1640_CR37","doi-asserted-by":"crossref","unstructured":"Kapre, N., & Gray, J. (2015). Hoplite: Building austere overlay NoCs for FPGAs. In 2015 25th international conference on field programmable logic and applications (FPL).","DOI":"10.1109\/FPL.2015.7293956"},{"key":"1640_CR38","doi-asserted-by":"publisher","unstructured":"Kapre, N., & Gray, J. (2017). Hoplite: A deflection-routed directional torus NoC for FPGAs. ACM Trans. Reconfigurable Technol. Syst. 10(2). https:\/\/doi.org\/10.1145\/3027486.","DOI":"10.1145\/3027486"},{"key":"1640_CR39","doi-asserted-by":"publisher","unstructured":"Karandikar, S., Mao, H., Kim, D., Biancolin, D., Amid, A., Lee, D., Pemberton, N., Amaro, E., Schmidt, C., Chopra, A., Huang, Q., Kovacs, K., Nikolic, B., Katz, R., Bachrach, J., & Asanovic, K. (2018). FireSim: FPGA-accelerated cycle-exact scale-out system simulation in the public cloud. In 2018 ACM\/IEEE 45th Annual International Symposium on Computer Architecture (ISCA) (pp. 29\u201342 ), DOI https:\/\/doi.org\/10.1109\/ISCA.2018.00014.","DOI":"10.1109\/ISCA.2018.00014"},{"key":"1640_CR40","doi-asserted-by":"crossref","unstructured":"King, M., Hicks, J., & Ankcorn, J. (2015). Software-driven hardware development. In Proceedings of the 2015 ACM\/SIGDA international symposium on field-programmable gate arrays (pp. 13\u201322).","DOI":"10.1145\/2684746.2689064"},{"key":"1640_CR41","doi-asserted-by":"crossref","unstructured":"Koch, D., Beckhoff, C., & Teich, J. (2008). ReCoBus-Builder\u2014A novel tool and technique to build statically and dynamically reconfigurable systems for FPGAS. In 2008 International conference on field programmable logic and applications (pp. 119\u2013124).","DOI":"10.1109\/FPL.2008.4629918"},{"key":"1640_CR42","doi-asserted-by":"crossref","unstructured":"Korinth, J., de la Chevallerie, D., & Koch, A. (2015). An open-source tool flow for the composition of reconfigurable hardware thread pool architectures. In IEEE 23rd Ann. int. symp. on field-programmable custom computing machines (FCCM).","DOI":"10.1109\/FCCM.2015.22"},{"key":"1640_CR43","doi-asserted-by":"crossref","unstructured":"Korinth, J., Hofmann, J., Heinz, C., & Koch, A. (2019). The TaPaSCo open-source toolflow for the automated composition of task-based parallel reconfigurable computing systems. In International symposium on applied reconfigurable computing (ARC).","DOI":"10.1007\/978-3-030-17227-5_16"},{"key":"1640_CR44","unstructured":"Kurth, A., Vogel, P., Capotondi, A., Marongiu, A., & Benini, L. (2017). HERO: Heterogeneous embedded research platform for exploring RISC-V Manycore accelerators on FPGA. arXiv:1712.06497."},{"key":"1640_CR45","doi-asserted-by":"crossref","unstructured":"Lange, H., & Koch, A. (2000). Memory access schemes for configurable processors. In International workshop on field programmable logic and applications (pp. 615\u2013625).","DOI":"10.1007\/3-540-44614-1_66"},{"key":"1640_CR46","doi-asserted-by":"crossref","unstructured":"Lo, C., & Chow, P. (2018). Multi-fidelity optimization for high-level synthesis directives. In 2018 28th International conference on field programmable logic and applications (FPL) (pp. 272\u20132727).","DOI":"10.1109\/FPL.2018.00054"},{"issue":"1","key":"1640_CR47","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1596532.1596540","volume":"9","author":"E L\u00fcbbers","year":"2009","unstructured":"L\u00fcbbers, E, & Platzner, M. (2009). ReconOS: Multithreaded programming for reconfigurable computers. ACM Transactions on embedded computing systems (TECS), 9(1), 1\u201333.","journal-title":"ACM Transactions on embedded computing systems (TECS)"},{"key":"1640_CR48","doi-asserted-by":"crossref","unstructured":"Mantovani, P., Giri, D., Di Guglielmo, G., Piccolboni, L., Zuckerman, J., Cota, E.G., Petracca, M., Pilato, C., & Carloni, L.P. (2020). Agile SoC development with open ESP : Invited Paper. In 2020 IEEE\/ACM international conference on computer aided design (ICCAD) (pp. 1\u20139 ).","DOI":"10.1145\/3400302.3415753"},{"key":"1640_CR49","doi-asserted-by":"crossref","unstructured":"Minhas, U.I., Woods, R., & Karakonstantis, G. (2019). Evaluation of FPGA partitioning schemes for time and space sharing of heterogeneous tasks. In International symposium on applied reconfigurable computing (pp. 334\u2013349 ).","DOI":"10.1007\/978-3-030-17227-5_24"},{"issue":"10","key":"1640_CR50","doi-asserted-by":"publisher","first-page":"1591","DOI":"10.1109\/TCAD.2015.2513673","volume":"35","author":"R Nane","year":"2015","unstructured":"Nane, R., Sima, V.-M., Pilato, C., Choi, J., Fort, B., Canis, A., Chen, Y.T., Hsiao, H., Brown, S., & Ferrandi, F. (2015). A survey and evaluation of FPGA high-level synthesis tools. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(10), 1591\u20131604.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"1640_CR51","doi-asserted-by":"crossref","unstructured":"Ober, M., Hofmann, J., Sommer, L., Weber, L., & Koch, A. (2019). High-throughput multi-threaded sum-product network inference in the reconfigurable cloud. In Fifth international workshop on heterogeneous high-performance reconfigurable computing (H2RC).","DOI":"10.1109\/H2RC49586.2019.00009"},{"issue":"2","key":"1640_CR52","doi-asserted-by":"publisher","first-page":"8:1","DOI":"10.1145\/3317670","volume":"12","author":"J Oppermann","year":"2019","unstructured":"Oppermann, J., Reuter-Oppermann, M., Sommer, L., Koch, A., & Sinnen, O. (2019). Exact and practical modulo scheduling for high-level synthesis. TRETS, 12(2), 8:1\u20138:26. https:\/\/doi.org\/10.1145\/3317670.","journal-title":"TRETS"},{"key":"1640_CR53","doi-asserted-by":"crossref","unstructured":"Oppermann, J., Sommer, L., Weber, L., Reuter-Oppermann, M., Koch, A., & Sinnen, O. (2019). SkyCastle: a resource-aware multi-loop scheduler for high-level synthesis. In International conference on field-programmable technology (FPT).","DOI":"10.1109\/ICFPT47387.2019.00013"},{"key":"1640_CR54","doi-asserted-by":"crossref","unstructured":"Peck, W. (2006). Hthreads: A computational model for reconfigurable devices. In Int. Conf. on Field Programmable Logic and Applications (FPL\u201906).","DOI":"10.1109\/FPL.2006.311336"},{"key":"1640_CR55","doi-asserted-by":"crossref","unstructured":"Peck, W., Anderson, E., Agron, J., Stevens, J., Baijot, F., & Andrews, D. (2006). Hthreads: A computational model for reconfigurable devices. In 2006 International conference on field programmable logic and applications (pp. 1\u20134 ).","DOI":"10.1109\/FPL.2006.311336"},{"key":"1640_CR56","doi-asserted-by":"crossref","unstructured":"Pilato, C., & Ferrandi, F. (2013). Bambu: A modular framework for the high level synthesis of memory-intensive applications. In 2013 23rd International conference on field programmable logic and applications (pp. 1\u20134).","DOI":"10.1109\/FPL.2013.6645550"},{"key":"1640_CR57","unstructured":"REPARA Project Consortium. (2016). Work Package 5 deliverables. acc: 05\/16\/2018."},{"key":"1640_CR58","doi-asserted-by":"crossref","unstructured":"Rodriguez, A., Valverde, J., Portilla, J., Otero, A., Riesgo, T., & De la Torre, E. (2018). Fpgabased high-performance embedded systems for adaptive edge computing in cyber-physical systems: The artico3 framework, (Vol. 18 p. 1877).","DOI":"10.3390\/s18061877"},{"key":"1640_CR59","unstructured":"Skalicky, S., Schmidt, A.G., & French, M. (2014). High level hardware\/software embedded system design with redsharc. arXiv:1408.4725."},{"key":"1640_CR60","unstructured":"Slurm Workload Manager, https:\/\/slurm.schedmd.com\/overview.html. acc: 08\/03\/2018."},{"key":"1640_CR61","unstructured":"So, H.K.-H., & Brodersen, R.W. (2007). BORPH: An operating system for FPGA-based reconfigurable computers. University of California, Berkeley."},{"key":"1640_CR62","doi-asserted-by":"crossref","unstructured":"Sommer, L., Korinth, J., & Koch, A. (2017). OpenMP device offloading to FPGA accelerators. In 2017 IEEE 28th Int. conf. on application-specific systems, architectures and processors (ASAP).","DOI":"10.1109\/ASAP.2017.7995280"},{"key":"1640_CR63","doi-asserted-by":"crossref","unstructured":"Sommer, L., Oppermann, J., Hofmann, J., & Koch, A. (2017). Synthesis of interleaved multithreaded accelerators from OpenMP Loops. In 2017 international conference on reconfigurable computing and FPGAs (ReConFig\u201917).","DOI":"10.1109\/RECONFIG.2017.8279823"},{"key":"1640_CR64","doi-asserted-by":"crossref","unstructured":"Sommer, L., Oppermann, J., Molina, A., Binnig, C., Kersting, K., & Koch, A. (2018). Automatic mapping of the sum-product network inference problem to FPGA-based accelerators. In IEEE International conference on computer design (ICCD).","DOI":"10.1109\/ICCD.2018.00060"},{"key":"1640_CR65","unstructured":"Sommer, L., Oppermann, J., Molina, A., Binnig, C., Kersting, K., & Koch, A. (2018). Automatic Synthesis of FPGA-based Accelerators for the Sum-Product Network Inference Problem. In ICML 2018 Workshop on tractable probabilistic models (TPM)."},{"key":"1640_CR66","doi-asserted-by":"crossref","unstructured":"Sommer, L., Weber, L., Kumm, M., & Koch, A. (2020). Comparison of arithmetic number formats for inference in sum-product networks on FPGAs. In 2020 IEEE 28th Annual international symposium on field-programmable custom computing machines (FCCM).","DOI":"10.1109\/FCCM48280.2020.00020"},{"issue":"3","key":"1640_CR67","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2866578","volume":"15","author":"E Sotiriou-Xanthopoulos","year":"2016","unstructured":"Sotiriou-Xanthopoulos, E., Xydis, S., Siozios, K., Economakos, G., & Soudris, D. (2016). An integrated exploration and virtual platform framework for many-accelerator heterogeneous systems. ACM Transactions on embedded computing systems (TECS), 15(3), 1\u201326.","journal-title":"ACM Transactions on embedded computing systems (TECS)"},{"key":"1640_CR68","unstructured":"TaPaSCo. (2017). https:\/\/github.com\/esa-tu-darmstadt\/tapasco. acc: 04\/09\/2020."},{"key":"1640_CR69","unstructured":"TaPaSCo RISC-V. (2019). https:\/\/github.com\/esa-tu-darmstadt\/tapasco-riscv. acc: 04\/09\/2020."},{"key":"1640_CR70","doi-asserted-by":"crossref","unstructured":"Weber, L., Sommer, L., Oppermann, J., Molina, A., Kersting, K., & Koch, A. (2019). Resource-efficient logarithmic number scale arithmetic for SPN inference on FPGAs. In International conference on field-programmable technology (FPT).","DOI":"10.1109\/ICFPT47387.2019.00040"},{"key":"1640_CR71","doi-asserted-by":"crossref","unstructured":"Weber, S.J., Paul, J.M., & Thomas, D.E. (2001). Co-RAM: combinational logic synthesis applied to software partitions for mapping to a novel memory device, (Vol. 9 pp. 805\u2013812).","DOI":"10.1109\/92.974894"},{"key":"1640_CR72","doi-asserted-by":"crossref","unstructured":"Wenzel, J., & Hochberger, C. (2016). RapidSoC: short turnaround creation of FPGA based SoCs. In Proceedings of the 27th international symposium on rapid system prototyping: shortening the path from specification to prototype (pp. 86\u201392 ).","DOI":"10.1145\/2990299.2990314"},{"key":"1640_CR73","unstructured":"Xilinx Inc. (2018). Vivado high level synthesis, https:\/\/www.xilinx.com\/products\/design-tools\/vivado\/integration\/esl-design.html. acc: 05\/16\/2018."},{"key":"1640_CR74","unstructured":"Xilinx Inc. (2020). Vitis Platform, https:\/\/www.xilinx.com\/products\/design-tools\/vitis\/vitis-platform.html (visited on 09\/10\/2020)."},{"key":"1640_CR75","doi-asserted-by":"crossref","unstructured":"Xu, C., Liu, G., Zhao, R., Yang, S., Luo, G., & Zhang, Z. (2017). A parallel bandit-based approach for autotuning fpga compilation. In Proceedings of the 2017 ACM\/SIGDA international symposium on field-programmable gate arrays (pp. 157\u2013166).","DOI":"10.1145\/3020078.3021747"},{"key":"1640_CR76","doi-asserted-by":"crossref","unstructured":"Yang, H.J., Fleming, K., Adler, M., & Emer, J. (2014). LEAP shared memories: Automating the construction of FPGA coherent memories. In 2014 IEEE 22nd annual international symposium on field-programmable custom computing machines (pp. 117\u2013124 ).","DOI":"10.1109\/FCCM.2014.43"},{"key":"1640_CR77","doi-asserted-by":"crossref","unstructured":"Zaruba, F., & Benini, L. (2019). The cost of application-class processing: energy and performance analysis of a linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology. IEEE Transactions on Very Large Scale Integration (VLSI) Systems.","DOI":"10.1109\/TVLSI.2019.2926114"},{"key":"1640_CR78","doi-asserted-by":"crossref","unstructured":"Zhang, P., Huang, M., Xiao, B., Huang, H., & Cong, J. (2015). CMOST: a system-level FPGA compilation framework. In 2015 52nd ACM\/EDAC\/IEEE Design Automation Conference (DAC) (pp. 1\u20136).","DOI":"10.1145\/2744769.2744807"}],"container-title":["Journal of Signal Processing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01640-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11265-021-01640-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01640-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,8]],"date-time":"2021-12-08T06:04:03Z","timestamp":1638943443000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11265-021-01640-8"}},"subtitle":["for the Automated Composition of Task-Based Parallel Reconfigurable Computing Systems"],"short-title":[],"issued":{"date-parts":[[2021,5]]},"references-count":78,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,5]]}},"alternative-id":["1640"],"URL":"https:\/\/doi.org\/10.1007\/s11265-021-01640-8","relation":{},"ISSN":["1939-8018","1939-8115"],"issn-type":[{"value":"1939-8018","type":"print"},{"value":"1939-8115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5]]},"assertion":[{"value":"28 April 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 November 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 January 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 May 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}