{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:45:32Z","timestamp":1761896732940,"version":"3.41.0"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2018,3,22]],"date-time":"2018-03-22T00:00:00Z","timestamp":1521676800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2018,3,31]]},"abstract":"<jats:p>Modern data centers increasingly employ FPGA-based heterogeneous acceleration platforms as a result of their great potential for continued performance and energy efficiency. Today, FPGAs provide more hardware parallelism than is possible with GPUs or CPUs, whereas C-like programming environments facilitate shorter development time, even close to software cycles. In this work, we address limitations and overheads in access and transfer of data to accelerators over common CPU-accelerator interconnects such as PCIe. We present three different FPGA accelerator dispatching methods for streaming applications (e.g., multimedia, vision computing). The first uses zero-copy data transfers and on-chip scratchpad memory (SPM) for energy efficiency, and the second uses also zero-copy but shared copy engines among different accelerator instances and local external memory. The third uses the processor\u2019s memory management unit to acquire the physical address of user pages and uses scatter-gather data transfers with SPM. Even though all techniques exhibit advantages in terms of scalability and relieve the processor from control overheads through using integrated schedulers, the first method presents the best energy-efficient acceleration in streaming applications.<\/jats:p>","DOI":"10.1145\/3180263","type":"journal-article","created":{"date-parts":[[2018,3,23]],"date-time":"2018-03-23T12:29:49Z","timestamp":1521808189000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Energy-Performance Considerations for Data Offloading to FPGA-Based Accelerators Over PCIe"],"prefix":"10.1145","volume":"15","author":[{"given":"Dimitrios","family":"Mbakoyiannis","sequence":"first","affiliation":[{"name":"Technological Educational Institute of Crete, Crete, Greece"}]},{"given":"Othon","family":"Tomoutzoglou","sequence":"additional","affiliation":[{"name":"Technological Educational Institute of Crete, Crete, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2371-0633","authenticated-orcid":false,"given":"George","family":"Kornaros","sequence":"additional","affiliation":[{"name":"Technological Educational Institute of Crete, Crete, Greece"}]}],"member":"320","published-online":{"date-parts":[[2018,3,22]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Brad Brech Juan Rubio and Michael Hollinger. 2014. Data Engine for NoSQL \u2014IBM Power Systems Edition. White Paper. IBM. https:\/\/www-304.ibm.com\/webapp\/set2\/sas\/f\/capi\/CAPI_FlashWhitePaper.pdf.  Brad Brech Juan Rubio and Michael Hollinger. 2014. Data Engine for NoSQL \u2014IBM Power Systems Edition. White Paper. IBM. https:\/\/www-304.ibm.com\/webapp\/set2\/sas\/f\/capi\/CAPI_FlashWhitePaper.pdf."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2010.36"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2897972"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744794"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2927964.2927971"},{"volume-title":"SAVE: Towards Efficient Resource Management in Heterogeneous System Architectures","year":"2014","author":"Durelli G.","key":"e_1_2_1_6_1"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2554688.2554723"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2017.19"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2015.15"},{"key":"e_1_2_1_10_1","unstructured":"Intel. 2009. An Introduction to the Intel QuickPath Interconnect. White Paper. Intel. http:\/\/www.intel.com\/technology\/quickpath\/introduction.pdf.  Intel. 2009. An Introduction to the Intel QuickPath Interconnect. White Paper. Intel. http:\/\/www.intel.com\/technology\/quickpath\/introduction.pdf."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2815631"},{"key":"e_1_2_1_12_1","unstructured":"Jason Lawley. 2014. Understanding Performance of PCI Express Systems. WP350 (v1.2). Xilinx.  Jason Lawley. 2014. Understanding Performance of PCI Express Systems. WP350 (v1.2). Xilinx."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28365-9_16"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2016.7577381"},{"volume-title":"Proceedings of the Tutorial in ASPLOS-XXI.","author":"Kegel A.","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/LES.2013.2251454"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSOC.2014.6972448"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCSim.2016.7568335"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2872362.2872379"},{"volume-title":"Retrieved","year":"2012","author":"Nazarewicz Michal","key":"e_1_2_1_20_1"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021740"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2014.6844463"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTCHIPS.2014.7478821"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/2665671.2665678"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195697"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1147\/JRD.2014.2380198"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744902"},{"volume-title":"Proceedings of the 14th IEEE\/ACM International Symposium on Cluster, Cloud, and Grid Computing. 11--20","author":"van Werkhoven Ben","key":"e_1_2_1_28_1"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2016.7482091"},{"volume-title":"Proceedings of the 2016 26th International Conference on Field Programmable Logic and Applications (FPL\u201916)","author":"Vesper Malte","key":"e_1_2_1_30_1"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2015.03.004"},{"key":"e_1_2_1_32_1","unstructured":"Xilinx. 2017. AXI Memory Mapped to PCI Express (PCIe) Gen2 v2.8 Logicore IP Product Guide (PG055). Xilinx.  Xilinx. 2017. AXI Memory Mapped to PCI Express (PCIe) Gen2 v2.8 Logicore IP Product Guide (PG055). Xilinx."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684746.2689060"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3180263","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3180263","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:30Z","timestamp":1750210770000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3180263"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,3,22]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,3,31]]}},"alternative-id":["10.1145\/3180263"],"URL":"https:\/\/doi.org\/10.1145\/3180263","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2018,3,22]]},"assertion":[{"value":"2017-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-03-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}