{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T07:57:31Z","timestamp":1768982251498,"version":"3.49.0"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,12,6]],"date-time":"2021-12-06T00:00:00Z","timestamp":1638748800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2022,6,30]]},"abstract":"<jats:p>\n            Scientific computing is at the core of many High-Performance Computing applications, including computational flow dynamics. Because of the utmost importance to simulate increasingly larger computational models, hardware acceleration is receiving increased attention due to its potential to maximize the performance of scientific computing. Field-Programmable Gate Arrays could accelerate scientific computing because of the possibility to fully customize the memory hierarchy important in irregular applications such as iterative linear solvers. In this article, we study the potential of using Field-Programmable Gate Arrays in High-Performance Computing because of the rapid advances in reconfigurable hardware, such as the increase in on-chip memory size, increasing number of logic cells, and the integration of High-Bandwidth Memories on board. To perform this study, we propose a novel Sparse Matrix-Vector multiplication unit and an ILU0 preconditioner tightly integrated with a BiCGStab solver kernel. We integrate the developed preconditioned iterative solver in\n            <jats:italic>Flow<\/jats:italic>\n            from the Open Porous Media project, a state-of-the-art open source reservoir simulator. 
Finally, we perform a thorough evaluation of the FPGA solver kernel in both stand-alone mode and integrated in the reservoir simulator, using the NORNE field, a real-world case reservoir model using a grid with more than 10\n            <jats:sup>5<\/jats:sup>\n            cells and using three unknowns per cell.\n          <\/jats:p>","DOI":"10.1145\/3476229","type":"journal-article","created":{"date-parts":[[2021,12,6]],"date-time":"2021-12-06T21:40:47Z","timestamp":1638826847000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Hardware Acceleration of High-Performance Computational Flow Dynamics Using High-Bandwidth Memory-Enabled Field-Programmable Gate Arrays"],"prefix":"10.1145","volume":"15","author":[{"given":"Tom","family":"Hogervorst","sequence":"first","affiliation":[{"name":"Delft University of Technology, Delft, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4175-6560","authenticated-orcid":false,"given":"R\u0103zvan","family":"Nane","sequence":"additional","affiliation":[{"name":"Delft University of Technology, Delft, The Netherlands"}]},{"given":"Giacomo","family":"Marchiori","sequence":"additional","affiliation":[{"name":"Big Data Accelerate B.V., Delft, The Netherlands"}]},{"given":"Tong Dong","family":"Qiu","sequence":"additional","affiliation":[{"name":"Big Data Accelerate B.V., Delft, The Netherlands"}]},{"given":"Markus","family":"Blatt","sequence":"additional","affiliation":[{"name":"OPM-OS AS, Oslo, Norway"}]},{"given":"Alf Birger","family":"Rustad","sequence":"additional","affiliation":[{"name":"Equinor S.A., Rotvoll, Norway"}]}],"member":"320","published-online":{"date-parts":[[2021,12,6]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"NVIDIA. n.d. NVIDIA Nsight Compute Command Line Interface. Retrieved November 3 2021 from https:\/\/docs.nvidia.com\/nsight-compute\/NsightComputeCli\/index.html."},{"key":"e_1_3_1_3_2","unstructured":"AMD. 
2020. AMD to Acquire Xilinx. Retrieved November 3 2021 from https:\/\/www.amd.com\/en\/press-releases\/2020-10-27-amd-to-acquire-xilinx-creating-the-industry-s-high-performance-computing."},{"key":"e_1_3_1_4_2","volume-title":"Proceedings of the SPE Reservoir Simulation Symposium","year":"1983","unstructured":"J. R. Appleyard. 1983. Nested factorization. In Proceedings of the SPE Reservoir Simulation Symposium."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2020.06.007"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1137\/140968896"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2014.6927464"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2554688.2554785"},{"key":"e_1_3_1_10_2","unstructured":"Dune. 2020. Dune Project. Retrieved November 3 2021 from https:\/\/dune-project.org\/."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2016.42"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.2172\/1614847"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.5555\/2650280.2650344"},{"key":"e_1_3_1_14_2","unstructured":"Intel. 2015. Intel Acquisition of Altera. Retrieved November 3 2021 from https:\/\/newsroom.intel.com\/press-kits\/intel-acquisition-of-altera."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1137\/0914041"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11515-8_10"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2006.8"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.5555\/2049883"},{"key":"e_1_3_1_19_2","unstructured":"NVIDIA. 2020. The API Reference Guide for cuSPARSE the CUDA Sparse Matrix Library. Retrieved November 3 2021 from https:\/\/docs.nvidia.com\/cuda\/cusparse\/index.html."},{"key":"e_1_3_1_20_2","unstructured":"OPM. 2020. 
Open Porous Media Project. Retrieved November 3 2021 from https:\/\/github.com\/OPM."},{"key":"e_1_3_1_21_2","unstructured":"OPM. 2020. Open Porous Media Reservoir Simulator. Retrieved November 3 2021 from https:\/\/github.com\/OPM\/opm-simulators."},{"key":"e_1_3_1_22_2","unstructured":"OPM. 2020. Open Porous Media Reservoir Simulator\u2014FPGA Kernels. Retrieved November 3 2021 from https:\/\/github.com\/OPM\/FPGA."},{"key":"e_1_3_1_23_2","unstructured":"OPM. 2020. OPM Tests. Retrieved November 3 2021 from https:\/\/github.com\/OPM\/opm-tests\/tree\/master\/norne."},{"key":"e_1_3_1_24_2","article-title":"Scaling up HBM efficiency of top-K SpMV for approximate embedding similarity on FPGAs","volume":"2103","author":"Parravicini Alberto","year":"2021","unstructured":"Alberto Parravicini, Luca Giuseppe Cellamare, Marco Siracusa, and Marco Domenico Santambrogio. 2021. Scaling up HBM efficiency of top-K SpMV for approximate embedding similarity on FPGAs. CoRR abs\/2103.04808 (2021). arXiv:2103.04808 https:\/\/arxiv.org\/abs\/2103.04808.","journal-title":"CoRR"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS45731.2020.9181266"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.camwa.2020.05.014"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.5555\/829576"},{"key":"e_1_3_1_28_2","first-page":"51","article-title":"Sparstition: A partitioning scheme for large-scale sparse matrix vector multiplication on FPGA","author":"Sigurbergsson Bj\u00f6rn","year":"2019","unstructured":"Bj\u00f6rn Sigurbergsson, Tom Hogervorst, Tong D. Qiu, and Razvan Nane. 2019. Sparstition: A partitioning scheme for large-scale sparse matrix vector multiplication on FPGA. 
In Proceedings of the International Conference on Application-Specific Systems, Architectures, and Processors (ASAP\u201919). 51\u201358.","journal-title":"Proceedings of the International Conference on Application-Specific Systems, Architectures, and Processors (ASAP\u201919)."},{"key":"e_1_3_1_29_2","volume-title":"Proceedings of the SPE Reservoir Simulation Symposium","year":"2013","unstructured":"Hamdi Tchelepi and Yifan Zhou. 2013. Multi-GPU parallelization of nested factorization for solving large linear systems. In Proceedings of the SPE Reservoir Simulation Symposium."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/74.979531"},{"key":"e_1_3_1_31_2","volume-title":"Accelerating Sparse Linear Algebra and Deep Neural Networks on Reconfigurable Platforms","author":"Umuro\u011flu Yaman","year":"2018","unstructured":"Yaman Umuro\u011flu. 2018. Accelerating Sparse Linear Algebra and Deep Neural Networks on Reconfigurable Platforms. Ph.D. Dissertation. NTNU."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1137\/0913035"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSII.2013.2278111"},{"key":"e_1_3_1_34_2","unstructured":"Xilinx. 2020. Alveo U280 Data Center Accelerator Card. Retrieved November 3 2021 from https:\/\/www.xilinx.com\/products\/boards-and-kits\/alveo\/u280.html."},{"key":"e_1_3_1_35_2","unstructured":"Xilinx. 2020. Ultra RAM. 
Retrieved November 3 2021 from https:\/\/www.xilinx.com\/support\/documentation\/white_papers\/wp477-ultraram.pdf."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.2118\/152271-MS"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/1046192.1046202"}],"container-title":["ACM Transactions on Reconfigurable Technology and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476229","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3476229","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:49:16Z","timestamp":1750268956000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476229"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,6]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,6,30]]}},"alternative-id":["10.1145\/3476229"],"URL":"https:\/\/doi.org\/10.1145\/3476229","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"value":"1936-7406","type":"print"},{"value":"1936-7414","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,6]]},"assertion":[{"value":"2021-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}