{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:45:13Z","timestamp":1772725513334,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,7,24]],"date-time":"2023-07-24T00:00:00Z","timestamp":1690156800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,7,31]]},"abstract":"<jats:p>The Bonsai Merkle tree (BMT) is a widely used tree structure for authentication of metadata such as encryption counters in a secure computing system. Common BMT algorithms were designed for traditional Von Neumann architectures with a software-centric implementation in mind and as such, they are predominantly recursive and sequential in nature. However, the modern heterogeneous computing platforms employing Field-Programmable Gate Array (FPGA) devices require concurrency-focused algorithms to fully utilize the versatility and parallel nature of such systems. The recursive nature of traditional BMT algorithms makes them challenging to implement in such hardware-based setups.<\/jats:p>\n          <jats:p>Our goal for this work is to introduce HMT, a hardware-friendly BMT algorithm that enables the verification and update processes to function independently and provides the benefits of relaxed update while being comparable to the eager update in terms of update complexity. The methodology of HMT contributes both novel algorithmic revisions and innovative hardware techniques to implementing BMT. We mathematically demonstrate the challenges of potentially unbounded recursions in relaxed BMT updates. To solve this problem, we use a partitioned BMT caching scheme that allocates a separate write-back cache for each BMT level\u2014thus allowing for low and fixed upper bounds for dirty evictions compared to the traditional BMT caches. Then we introduce the aforementioned hybrid BMT algorithm that is hardware-targeted, parallel, and relaxes the update depending on BMT cache hit but makes the update conditions more flexible compared to lazy update to save additional write-backs.<\/jats:p>\n          <jats:p>Deploying this new algorithm, we have designed a new BMT controller with a dataflow architecture including speculative buffers and parallel write-back engines to facilitate performance-enhancing mechanisms (like multiple concurrent authentication and independent updates) that were not possible with the conventional lazy algorithm. Our empirical performance measurements on a Xilinx U200 accelerator FPGA have demonstrated that HMT can achieve up to 7\u00d7 improvement in bandwidth and 4.5\u00d7 reduction in latency over lazy-update BMT baseline and up to 14% faster execution in standard benchmarks compared to a state-of-the-art, eager-update BMT solution.<\/jats:p>","DOI":"10.1145\/3595179","type":"journal-article","created":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T12:25:17Z","timestamp":1682684717000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["HMT: A Hardware-centric Hybrid Bonsai Merkle Tree Algorithm\u00a0for High-performance Authentication"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3213-4639","authenticated-orcid":false,"given":"Rakin Muhammad","family":"Shadab","sequence":"first","affiliation":[{"name":"University of Central Florida"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4541-6638","authenticated-orcid":false,"given":"Yu","family":"Zou","sequence":"additional","affiliation":[{"name":"University of Central Florida"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5819-1773","authenticated-orcid":false,"given":"Sanjay","family":"Gandham","sequence":"additional","affiliation":[{"name":"University of Central Florida"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3987-463X","authenticated-orcid":false,"given":"Amro","family":"Awad","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8220-962X","authenticated-orcid":false,"given":"Mingjie","family":"Lin","sequence":"additional","affiliation":[{"name":"University of Central Florida"}]}],"member":"320","published-online":{"date-parts":[[2023,7,24]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Mazen Alwadi Kazi Zubair David Mohaisen and Amro Awad. 2020. Phoenix: Towards ultra-low overhead recoverable and persistently secure nvm. IEEE Transactions on Dependable and Secure Computing 19 2 (2020) 1049\u20131063.","DOI":"10.1109\/TDSC.2020.3020085"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISVLSI.2019.00114"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322250"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/WiMob.2008.14"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"Christophe Bobda Joel Mandebi Mbongue Paul Chow Mohammad Ewais Naif Tarafdar Juan Camilo Vega Ken Eguro Dirk Koch Suranga Handagala Miriam Leeser Martin Herbordt Hafsah Shahzad Peter Hofste Burkhard Ringlein Jakub Szefer Ahmed Sanaullah and Russell Tessier. 2022. The future of FPGA acceleration in datacenters and the cloud. ACM Transactions on Reconfigurable Technology and Systems 15 3 (2022) 1\u201342.","DOI":"10.1145\/3506713"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"Zhengguo Chen Youtao Zhang and Nong Xiao. 2020. CacheTree: Reducing integrity verification overhead of secure non-volatile memories. IEEE Transactions on ComputerU-Aided Design of Integrated Circuits and Systems 40 7 (2020) 1340\u20131353.","DOI":"10.1109\/TCAD.2020.3015925"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.33130\/AJCT.2019v05i01.013"},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","unstructured":"Reouven Elbaz David Champagne Catherine Gebotys Ruby B. Lee Nachiketh Potlapally and Lionel Torres. 2009. Hardware mechanisms for memory authentication: A survey of existing techniques and engines. Transactions on Computational Science IV 5430 Special Issue (2009) 1\u201322.","DOI":"10.1007\/978-3-642-01004-0_1"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00015"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2003.1183547"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2016.124"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480118"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/2566673"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMSCS.2015.2494014"},{"key":"e_1_3_2_16_2","volume-title":"Constructing Digital Signatures from a One-way Function","author":"Lamport Leslie","year":"1979","unstructured":"Leslie Lamport. 1979. Constructing Digital Signatures from a One-way Function. Technical Report. Citeseer."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSYST.2013.2271537"},{"issue":"5","key":"e_1_3_2_18_2","first-page":"653","article-title":"A survey of blockchain security issues and challenges.","volume":"19","author":"Lin Iuon-Chang","year":"2017","unstructured":"Iuon-Chang Lin and Tzu-Chun Liao. 2017. A survey of blockchain security issues and challenges. International Journal of Network Security 19, 5 (2017), 653\u2013659.","journal-title":"International Journal of Network Security"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-015-1918-8"},{"key":"e_1_3_2_20_2","unstructured":"John D. McCalpin. 1995. Memory bandwidth and machine balance in current high performance computers. IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter 2 19\u201325 (1995)."},{"key":"e_1_3_2_21_2","unstructured":"Larry W. McVoy and Carl Staelin. 1996. lmbench: Portable tools for performance analysis. In Proceedings of the USENIX Annual Technical Conference . San Diego CA 279\u2013294."},{"key":"e_1_3_2_22_2","first-page":"369","volume-title":"Proceedings of the Conference on the Theory and Application of Cryptographic Techniques","author":"Merkle Ralph C.","year":"1987","unstructured":"Ralph C. Merkle. 1987. A digital signature based on a conventional encryption function. In Proceedings of the Conference on the Theory and Application of Cryptographic Techniques. Springer, 369\u2013378."},{"key":"e_1_3_2_23_2","doi-asserted-by":"crossref","unstructured":"Arun Prasad Mohan Mohamed Asfak R. and Angelin Gladston. 2020. Merkle tree and blockchain-based cloud data auditing. International Journal of Cloud Applications and Computing 10 3 (2020) 54\u201366.","DOI":"10.4018\/IJCAC.2020070103"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3069223"},{"issue":"11","key":"e_1_3_2_25_2","first-page":"1","article-title":"Accelerating deep convolutional neural networks using specialized hardware","volume":"2","author":"Ovtcharov Kalin","year":"2015","unstructured":"Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, and Eric S. Chung. 2015. Accelerating deep convolutional neural networks using specialized hardware. Microsoft Research Whitepaper 2, 11 (2015), 1\u20134.","journal-title":"Microsoft Research Whitepaper"},{"key":"e_1_3_2_26_2","unstructured":"James Pallister Simon Hollis and Jeremy Bennett. 2013. BEEBS: Open benchmarks for energy measurements on embedded platforms. arXiv Preprint arXiv:1308.5174."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.44"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2003.1253207"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVT.2019.2899647"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24676-3_32"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3419100"},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","unstructured":"Xiuxiu Wang Yipei Niu Fangming Liu and Zichen Xu. 2020. When FPGA meets cloud: A first look at performance. IEEE Transactions on Cloud Computing 10 2 (2020) 1344\u20131357.","DOI":"10.1109\/TCC.2020.2992548"},{"key":"e_1_3_2_33_2","doi-asserted-by":"crossref","unstructured":"Zeke Wang Hongjing Huang Jie Zhang and Gustavo Alonso. 2020. Shuhai: Benchmarking high bandwidth memory On FPGAs. In Proceedings of the 2020 28th Annual International Symposium on Field-Programmable Custom Computing Machines . IEEE 111\u2013119.","DOI":"10.1109\/FCCM48280.2020.00024"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.23919\/FPL.2017.8056797"},{"key":"e_1_3_2_35_2","volume-title":"Osiris: A Low-Cost Mechanism to Enable Restoration of Secure Non-Volatile Memories.","author":"Ye Mao","year":"2018","unstructured":"Mao Ye, Clayton Hughes, and Amro Awad. 2018. Osiris: A Low-Cost Mechanism to Enable Restoration of Secure Non-Volatile Memories.Technical Report. Sandia National Lab(SNL-NM), Albuquerque, NM."},{"key":"e_1_3_2_36_2","first-page":"715","article-title":"DSPstone: A DSP-oriented benchmarking methodology","author":"Zivojnovic Vojin","year":"1994","unstructured":"Vojin Zivojnovic. 1994. DSPstone: A DSP-oriented benchmarking methodology. Proc. Signal Processing Applications and Technology, Dallas, TX, 1994 (1994), 715\u2013720.","journal-title":"Proc. Signal Processing Applications and Technology, Dallas, TX, 1994"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/HOST49136.2021.9702283"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISVLSI.2019.00066"},{"issue":"1","key":"e_1_3_2_39_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3492735","article-title":"ARES: Persistently secure non-volatile memory with processor-transparent and hardware-friendly integrity verification and metadata recovery","volume":"21","author":"Zou Yu","year":"2022","unstructured":"Yu Zou, Kazi Abu Zubair, Mazen Alwadi, Rakin Muhammad Shadab, Sanjay Gandham, Amro Awad, and Mingjie Lin. 2022. ARES: Persistently secure non-volatile memory with processor-transparent and hardware-friendly integrity verification and metadata recovery. ACM Transactions on Embedded Computing Systems 21, 1 (2022), 1\u201332.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322252"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595179","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3595179","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:08Z","timestamp":1750182548000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595179"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,24]]},"references-count":39,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,7,31]]}},"alternative-id":["10.1145\/3595179"],"URL":"https:\/\/doi.org\/10.1145\/3595179","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,24]]},"assertion":[{"value":"2022-11-24","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-13","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}