{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,29]],"date-time":"2026-03-29T09:10:37Z","timestamp":1774775437859,"version":"3.50.1"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,12,17]],"date-time":"2024-12-17T00:00:00Z","timestamp":1734393600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF","award":["CNS-1739643 and CNS-1763503"],"award-info":[{"award-number":["CNS-1739643 and CNS-1763503"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>The development of FPGA-based applications using HLS is fraught with performance pitfalls and large design space exploration times. These issues are exacerbated when the application is complicated and its performance is dependent on the input dataset, as is often the case with graph neural network approaches to machine learning. Here, we introduce HLPerf, an open-source, simulation-based performance evaluation framework for dataflow architectures that both supports early exploration of the design space and shortens the performance evaluation cycle. We apply the methodology to GNNHLS, an HLS-based graph neural network benchmark containing six commonly used graph neural network models and four datasets with distinct topologies and scales. The results show that HLPerf achieves over 10,000\u00d7 average simulation acceleration relative to RTL simulation and over 400\u00d7 acceleration relative to state-of-the-art cycle-accurate tools at the cost of 7% mean error rate relative to actual FPGA implementation performance. 
This acceleration positions HLPerf as a viable component in the design cycle.<\/jats:p>","DOI":"10.1145\/3655627","type":"journal-article","created":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T13:42:50Z","timestamp":1712065370000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["HLPerf: Demystifying the Performance of HLS-based Graph Neural Networks with Dataflow Architectures"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9952-0628","authenticated-orcid":false,"given":"Chenfeng","family":"Zhao","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, Washington University in St. Louis, St. Louis, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6062-5147","authenticated-orcid":false,"given":"Clayton","family":"Faber","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Washington University in St. Louis, St. Louis, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7207-6106","authenticated-orcid":false,"given":"Roger","family":"Chamberlain","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Washington University in St. Louis, St. Louis, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0482-5435","authenticated-orcid":false,"given":"Xuan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, Northeastern University, Boston, United States"}]}],"member":"320","published-online":{"date-parts":[[2024,12,17]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2021.3090339"},{"key":"e_1_3_3_3_2","article-title":"Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)","author":"Xilinx ARM","year":"2023","unstructured":"ARM Xilinx. 2023. 
Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393). Retrieved from https:\/\/docs.xilinx.com\/r\/en-US\/ug1393-vitis-application-acceleration","journal-title":"Retrieved from"},{"key":"e_1_3_3_4_2","article-title":"pycparser","author":"Bendersky Eli","year":"2023","unstructured":"Eli Bendersky. 2023. pycparser. Retrieved from https:\/\/github.com\/eliben\/pycparser","journal-title":"Retrieved from"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/N-SSC.2007.4785534"},{"key":"e_1_3_3_6_2","article-title":"Residual Gated Graph ConvNets","author":"Bresson Xavier","year":"2017","unstructured":"Xavier Bresson and Thomas Laurent. 2017. Residual Gated Graph ConvNets. arXiv preprint. arxiv:1711.07553","journal-title":"arXiv preprint"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/H2RC51942.2020.00008"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC43674.2020.9286221"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CASES.2013.6662524"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3431920.3439290"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.2970597"},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2017.8203844"},{"key":"e_1_3_3_13_2","first-page":"1106","volume-title":"International Conference on Machine Learning","author":"Dai Hanjun","year":"2018","unstructured":"Hanjun Dai, Zornitsa Kozareva, Bo Dai, Alex Smola, and Le Song. 2018. Learning steady-states of iterative algorithms over graphs. In International Conference on Machine Learning. 
PMLR, 1106\u20131114."},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2020.3039409"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000108"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC55821.2022.9926398"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/RSDHA54838.2021.00008"},{"key":"e_1_3_3_18_2","first-page":"10","article-title":"Protein interface prediction using graph convolutional networks","volume":"30","author":"Fout Alex","year":"2017","unstructured":"Alex Fout, Jonathon Byrd, Basir Shariat, and Asa Ben-Hur. 2017. Protein interface prediction using graph convolutional networks. Adv. Neural Inf. Process. Syst. 30 (2017), 10 pages.","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00079"},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2005.1555942"},{"key":"e_1_3_3_21_2","first-page":"11","article-title":"Inductive representation learning on large graphs","volume":"30","author":"Hamilton Will","year":"2017","unstructured":"Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 30 (2017), 11 pages.","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3316781.3317829"},{"key":"e_1_3_3_23_2","first-page":"22118","article-title":"Open graph benchmark: Datasets for machine learning on graphs","volume":"33","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Adv. Neural Inf. Process. Syst. 33 (2020), 22118\u201322133.","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"e_1_3_3_24_2","article-title":"Parallel programming for FPGAs","author":"Kastner Ryan","year":"2018","unstructured":"Ryan Kastner, Janarbek Matai, and Stephen Neuendorffer. 2018. Parallel programming for FPGAs. arXiv preprint. arxiv:1805.03648","journal-title":"arXiv preprint"},{"key":"e_1_3_3_25_2","volume-title":"International Conference on Learning Representations","author":"Kipf Thomas N.","year":"2017","unstructured":"Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=SJU4ayYgl"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2021.3127288"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3400302.3415645"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2020.3014632"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2019.00069"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.576"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/FPT.2018.00018"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM57271.2023.00010"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2019.2943570"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1098\/rsta.2019.0061"},{"key":"e_1_3_3_35_2","article-title":"SimPy: Discrete event simulation for Python","author":"SimPy Team","year":"2023","unstructured":"Team SimPy. 2023. SimPy: Discrete event simulation for Python. 
Retrieved from https:\/\/simpy.readthedocs.io","journal-title":"Retrieved from"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3494534"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2017.29"},{"key":"e_1_3_3_38_2","volume-title":"International Conference on Learning Representations","author":"Veli\u010dkovi\u0107 Petar","year":"2017","unstructured":"Petar Veli\u010dkovi\u0107, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. In International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=rJXMpikCZ"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00138-021-01251-0"},{"key":"e_1_3_3_41_2","article-title":"Basic examples for Vitis HLS","year":"2023","unstructured":"Xilinx. 2023. Basic examples for Vitis HLS. Retrieved from https:\/\/github.com\/Xilinx\/Vitis-HLS-Introductory-Examples","journal-title":"Retrieved from"},{"key":"e_1_3_3_42_2","article-title":"Vitis Accel Examples","year":"2023","unstructured":"Xilinx. 2023. Vitis Accel Examples. Retrieved from https:\/\/github.com\/Xilinx\/Vitis_Accel_Examples","journal-title":"Retrieved from"},{"key":"e_1_3_3_43_2","first-page":"17","volume-title":"International Conference on Learning Representations","author":"Xu Keyulu","year":"2019","unstructured":"Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks? In International Conference on Learning Representations. 17 pages. 
Retrieved from https:\/\/openreview.net\/forum?id=ryGs6iA5Km"},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00012"},{"key":"e_1_3_3_45_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11782"},{"key":"e_1_3_3_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD51958.2021.9643549"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD58817.2023.00092"},{"key":"e_1_3_3_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/RTAS52030.2021.00048"},{"key":"e_1_3_3_49_2","article-title":"Hardware-Aware Graph Neural Network Automated Design for Edge Computing Platforms","author":"Zhou Ao","year":"2023","unstructured":"Ao Zhou, Jianlei Yang, Yingjie Qi, Yumeng Shi, Tong Qiao, Weisheng Zhao, and Chunming Hu. 2023. Hardware-Aware Graph Neural Network Automated Design for Edge Computing Platforms. arXiv preprint. arxiv:2309.10875","journal-title":"arXiv preprint"}],"container-title":["ACM Transactions on Reconfigurable Technology and 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3655627","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3655627","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:46Z","timestamp":1750291426000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3655627"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,17]]},"references-count":48,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3655627"],"URL":"https:\/\/doi.org\/10.1145\/3655627","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"value":"1936-7406","type":"print"},{"value":"1936-7414","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,17]]},"assertion":[{"value":"2023-12-31","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-03-19","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-17","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}