{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T16:51:19Z","timestamp":1771951879469,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":15,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T00:00:00Z","timestamp":1619481600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"U.S. Department of Energy Office of Science Contract No. DE-AC02-05CH11231"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,4,27]]},"DOI":"10.1145\/3456669.3456671","type":"proceedings-article","created":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T15:22:31Z","timestamp":1619536951000},"page":"1-9","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs"],"prefix":"10.1145","author":[{"given":"Douglas","family":"Doerfler","sequence":"first","affiliation":[{"name":"Lawrence Berkeley National Laboratory, US"}]},{"given":"Farzad","family":"Fatollahi-Fard","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, US"}]},{"given":"Colin","family":"MacLean","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, US"}]},{"given":"Tan","family":"Nguyen","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, US"}]},{"given":"Samuel","family":"Williams","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, US"}]},{"given":"Nicholas","family":"Wright","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory, 
US"}]},{"given":"Marco","family":"Siracusa","sequence":"additional","affiliation":[{"name":"DEIB\/Politecnico di Milano, IT"}]}],"member":"320","published-online":{"date-parts":[[2021,4,27]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid.2012.123"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/H2RC51942.2020.00008"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2010.05.002"},{"key":"e_1_3_2_1_4_1","volume-title":"Chris Daley and Thomas Applencourt","author":"Doerfler Douglas","year":"2020","unstructured":"Douglas Doerfler, Chris Daley and Thomas Applencourt. 2020. SU3_bench, a Micro-benchmark for Exploring Exascale Era Programming Models, Compilers and Runtimes. https:\/\/p3hpcforum2020.alcf.anl.gov\/wp-content\/uploads\/sites\/8\/2020\/09\/P3HPC_Doerfler_Day-1.pdf. Accessed: 2020-10-29."},{"key":"e_1_3_2_1_5_1","volume-title":"Applied Reconfigurable Computing. Architectures, Tools, and Applications, Fernando Rinc\u00f3n, Jes\u00fas Barba, Hayden K.\u00a0H. So, Pedro Diniz, and Juli\u00e1n Caba(Eds.)","author":"Favaro Federico","unstructured":"Federico Favaro, Ernesto Dufrechou, Pablo Ezzatti, and Juan\u00a0P. Oliver. 2020. 
Exploring FPGA Optimizations to Compute Sparse Numerical Linear Algebra Kernels. In Applied Reconfigurable Computing. Architectures, Tools, and Applications, Fernando Rinc\u00f3n, Jes\u00fas Barba, Hayden K.\u00a0H. So, Pedro Diniz, and Juli\u00e1n Caba (Eds.). Springer International Publishing, Cham, 258\u2013268."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC43674.2020.9286213"},{"key":"e_1_3_2_1_7_1","volume-title":"Intel Arria 10 Device Datasheet. A10-DATASHEET |","author":"Intel Corp.","year":"2020","unstructured":"Intel Corp. [n.d.]. Intel Arria 10 Device Datasheet. A10-DATASHEET | 2020.06.26."},{"key":"e_1_3_2_1_8_1","volume-title":"Intel FPGA SDK for OpenCL Pro Edition: Best Practices Guide. UG-OCL003 |","author":"Intel Corp.","year":"2020","unstructured":"Intel Corp. [n.d.]. Intel FPGA SDK for OpenCL Pro Edition: Best Practices Guide. UG-OCL003 | 2020.09.28."},{"key":"e_1_3_2_1_9_1","volume-title":"Document Revision: 19","unstructured":"Khronos.org. 2012. The OpenCL Specification. Version: 1.2, Document Revision: 19, Editor: Aaftab Munshi."},{"key":"e_1_3_2_1_10_1","unstructured":"G. Korcyl and P. Korcyl. 2020. Optimized implementation of the conjugate gradient algorithm for FPGA-based platforms using the Dirac-Wilson operator as an example. arxiv:2001.05218\u00a0[cs.DC] https:\/\/arxiv.org\/abs\/2001.05218"},{"key":"e_1_3_2_1_11_1","volume-title":"Towards Lattice Quantum Chromodynamics on FPGA devices. 
Computer Physics Communications 249 (Apr","author":"Korcyl Grzegorz","year":"2020","unstructured":"Grzegorz Korcyl and Piotr Korcyl. 2020. Towards Lattice Quantum Chromodynamics on FPGA devices. Computer Physics Communications 249 (Apr 2020), 107029. https:\/\/doi.org\/10.1016\/j.cpc.2019.107029"},{"key":"e_1_3_2_1_12_1","volume-title":"https:\/\/gitlab.com\/NERSC\/nersc-proxies\/su3_bench\/-\/tree\/fpga Accessed December 1st","author":"Proxy App Suite NERSC","year":"2020","unstructured":"NERSC Proxy App Suite. [n.d.]. SU(3) matrix-matrix multiply microbenchmark. https:\/\/gitlab.com\/NERSC\/nersc-proxies\/su3_bench\/-\/tree\/fpga Accessed December 1st, 2020."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"crossref","unstructured":"T. Nguyen, S. Williams, M. Siracusa, C. MacLean, D. Doerfler, and N.\u00a0J. Wright. 2020. The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing. In 2020 IEEE\/ACM Performance Modeling Benchmarking and Simulation of High Performance Computer Systems (PMBS). 8\u201319. https:\/\/doi.org\/10.1109\/PMBS51919.2020.00007","DOI":"10.1109\/PMBS51919.2020.00007"},{"key":"e_1_3_2_1_14_1","volume-title":"Alveo U280 Data Center Accelerator Card Data Sheet. 
DS963 (v1.3)","author":"Xilinx Inc.","year":"2020","unstructured":"Xilinx Inc. [n.d.]. Alveo U280 Data Center Accelerator Card Data Sheet. DS963 (v1.3) May 11, 2020."},{"key":"e_1_3_2_1_15_1","volume-title":"Application Acceleration Development. UG1393 (v2019.2)","author":"Xilinx Inc.","year":"2020","unstructured":"Xilinx Inc. [n.d.]. Vitis Unified Software Platform Documentation, Application Acceleration Development. UG1393 (v2019.2) February 28, 2020."}],"event":{"name":"IWOCL'21: International Workshop on OpenCL","location":"Munich Germany","acronym":"IWOCL'21"},"container-title":["International Workshop on OpenCL"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3456669.3456671","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3456669.3456671","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:46:55Z","timestamp":1750193215000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3456669.3456671"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,27]]},"references-count":15,"alternative-id":["10.1145\/3456669.3456671","10.1145\/3456669"],"URL":"https:\/\/doi.org\/10.1145\/3456669.3456671","relation":{},"subject":[],"published":{"date-parts":[[2021,4,27]]},"assertion":[{"value":"2021-04-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}