{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T00:39:27Z","timestamp":1759538367395,"version":"build-2065373602"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62374146 and 92373205"],"award-info":[{"award-number":["62374146 and 92373205"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key Research and Development Program of China","award":["2023YFB4404404"],"award-info":[{"award-number":["2023YFB4404404"]}]},{"name":"Key Technologies R&D Program of Jiangsu","award":["BE2023005-2"],"award-info":[{"award-number":["BE2023005-2"]}]},{"name":"R&D programme of Zhejiang Province","award":["2024C01012"],"award-info":[{"award-number":["2024C01012"]}]},{"name":"Ant Group through CCF-Ant Research Fund"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Fully Homomorphic Encryption (FHE) is regarded as a promising way to protect data privacy with encrypted computation. Due to high computation overhead, hardware based FHE accelerators were proposed to speed up FHE applications. To support complicated FHE-encrypted neural network applications, multi-chiplet based FHE accelerators were further proposed for scaling up system size, whereas one of the challenges is designing efficient intra- and inter-chiplet interconnection networks to accelerate data transfer. 
Conventional regular topologies like mesh or Kite either lead to high inter-chiplet transmission latency or excessive power consumption as these topologies assume uniform bandwidth or radix for nodes\/links, ignoring the highly irregular distribution of inter-chiplet communication volumes. On the other hand, the problem of generating customized intra- and inter-chiplet interconnection networks has high complexity and previous network-on-chip topology generation works cannot efficiently improve the performance of intra- and inter-chiplet interconnection networks. In this article, the intra- and inter-chiplet interconnection optimization problem is defined, aiming to minimize the execution time of FHE applications under cost and power constraints. To efficiently solve this problem, we propose a bilevel optimization algorithm, which decomposes the problem into three sub-problems: (1) FHE parameters selection, (2) task-to-core mapping, and (3) intra-\/inter-chiplet interconnection network topology generation. These sub-problems are then solved iteratively. Experimental results demonstrate that our proposed method reduces execution time by 51.66%, 43.16%, 39.44%, 43.34%, and 27.70% compared with REED and four multi-chiplet based FHE accelerators with mesh, Kite, Butterfly, and Florets as inter-chiplet interconnection networks. 
Therefore, the proposed method can effectively accelerate FHE applications on large-scale multi-chiplet systems.<\/jats:p>","DOI":"10.1145\/3762995","type":"journal-article","created":{"date-parts":[[2025,8,25]],"date-time":"2025-08-25T11:25:51Z","timestamp":1756121151000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["On Improving the Performance of Intra- and Inter-chiplet Interconnection Networks in Multi-chiplet Systems for Accelerating FHE Encrypted Neural Network Applications"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-5868-0124","authenticated-orcid":false,"given":"Zewei","family":"Lai","sequence":"first","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University","place":["Hangzhou, China"]},{"name":"Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1777-4326","authenticated-orcid":false,"given":"Jinhui","family":"Ye","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University","place":["Hangzhou, China"]},{"name":"Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2263-5643","authenticated-orcid":false,"given":"Xiaohang","family":"Wang","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University","place":["Hangzhou, China"]},{"name":"Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security","place":["Hangzhou, 
China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-0533-9200","authenticated-orcid":false,"given":"Zheang","family":"Fu","sequence":"additional","affiliation":[{"name":"Zhejiang University","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2056-0569","authenticated-orcid":false,"given":"Amit Kumar","family":"Singh","sequence":"additional","affiliation":[{"name":"University of Essex","place":["Colchester, United Kingdom of Great Britain and Northern Ireland"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7453-9365","authenticated-orcid":false,"given":"Yingtao","family":"Jiang","sequence":"additional","affiliation":[{"name":"University of Nevada Las Vegas","place":["Las Vegas, United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3441-6277","authenticated-orcid":false,"given":"Kui","family":"Ren","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University","place":["Hangzhou, China"]},{"name":"Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9510-1079","authenticated-orcid":false,"given":"Mei","family":"Yang","sequence":"additional","affiliation":[{"name":"Dept. 
of Electrical and Computer Engineering, University of Nevada, Las Vegas","place":["Las Vegas, United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5619-8226","authenticated-orcid":false,"given":"Sihai","family":"Qiu","sequence":"additional","affiliation":[{"name":"Beijing Smart-chip Microelectronics Technology Co., Ltd","place":["Beijing, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8330-2041","authenticated-orcid":false,"given":"Xiaodong","family":"Li","sequence":"additional","affiliation":[{"name":"Ant Group Co Ltd","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-0241-8138","authenticated-orcid":false,"given":"Xin","family":"Tang","sequence":"additional","affiliation":[{"name":"Ant Group Co Ltd","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-4802-9587","authenticated-orcid":false,"given":"Jie","family":"Song","sequence":"additional","affiliation":[{"name":"Ant Group Co Ltd","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6440-7550","authenticated-orcid":false,"given":"Mingzhe","family":"Zhang","sequence":"additional","affiliation":[{"name":"Ant Group Co Ltd","place":["Hangzhou, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,9,26]]},"reference":[{"key":"e_1_3_3_2_2","first-page":"882","volume-title":"IEEE International Symposium on High-Performance Computer Architecture","author":"Agrawal Rashmi","year":"2023","unstructured":"Rashmi Agrawal, Leo de Castro, Guowei Yang, Chiraag Juvekar, Rabia Tugce Yazicigil, Anantha P. Chandrakasan, Vinod Vaikuntanathan, and Ajay Joshi. 2023. FAB: An FPGA-based accelerator for bootstrappable fully homomorphic encryption. 
In IEEE International Symposium on High-Performance Computer Architecture. 882\u2013895."},{"key":"e_1_3_3_3_2","first-page":"756","volume-title":"ACM\/IEEE International Symposium on Computer Architecture","author":"Agrawal Rashmi S.","year":"2024","unstructured":"Rashmi S. Agrawal, Anantha P. Chandrakasan, and Ajay Joshi. 2024. HEAP: A fully homomorphic encryption accelerator with parallelized bootstrapping. In ACM\/IEEE International Symposium on Computer Architecture. 756\u2013769."},{"doi-asserted-by":"publisher","unstructured":"Aikata Aikata Ahmet Can Mert Sunmin Kwon Maxim Deryabin and Sujoy Sinha Roy. 2025. REED: Chiplet-based accelerator for fully homomorphic encryption. Transactions on Cryptographic Hardware and Embedded Systems 2025 2 (2025) 163\u2013208. 10.46586\/tches.v2025.i2.163-208","key":"e_1_3_3_4_2","DOI":"10.46586\/tches.v2025.i2.163-208"},{"unstructured":"Arnold O. Allen. 2014. Probability Statistics and Queueing Theory. Academic press. https:\/\/books.google.com.hk\/books?hl=en&lr=&id=LWniBQAAQBAJ&oi=fnd&pg=PP1&dq=Probability +Statistics+and+Queueing+Theory&ots=PEZXYtxe7I&sig=HugV6U4QRM979UymBIR7aP1Ka4U&redir_esc=y#v=onepage&q=Probability%2C%20Statistics%20and%20Queueing%20Theory&f=false","key":"e_1_3_3_5_2"},{"doi-asserted-by":"publisher","key":"e_1_3_3_6_2","DOI":"10.1145\/2954679.2872414"},{"key":"e_1_3_3_7_2","first-page":"1","volume-title":"ACM\/IEEE Design Automation Conference","author":"Bharadwaj Srikant","year":"2020","unstructured":"Srikant Bharadwaj, Jieming Yin, Bradford M. Beckmann, and Tushar Krishna. 2020. Kite: A family of heterogeneous interposer topologies enabled via accurate interconnect modeling. In ACM\/IEEE Design Automation Conference. 
1\u20136."},{"doi-asserted-by":"publisher","key":"e_1_3_3_8_2","DOI":"10.1007\/978-3-642-32009-5_50"},{"key":"e_1_3_3_9_2","first-page":"1413","volume-title":"IEEE International Conference on Communications","author":"Bui Van-Phuc","year":"2024","unstructured":"Van-Phuc Bui, Shashi Raj Pandey, Pedro Maia de Sant Ana, and Petar Popovski. 2024. Value-based reinforcement learning for digital twins in cloud computing. In IEEE International Conference on Communications. 1413\u20131418."},{"key":"e_1_3_3_10_2","first-page":"156","volume-title":"IEEE International Symposium on High-Performance Computer Architecture","author":"Cai Jingwei","year":"2024","unstructured":"Jingwei Cai, Zuotong Wu, Sen Peng, Yuchen Wei, Zhanhong Tan, Guiming Shi, Mingyu Gao, and Kaisheng Ma. 2024. Gemini: Mapping and architecture co-exploration for large-scale DNN chiplet accelerators. In IEEE International Symposium on High-Performance Computer Architecture. 156\u2013171."},{"doi-asserted-by":"publisher","key":"e_1_3_3_11_2","DOI":"10.1016\/j.future.2024.107547"},{"doi-asserted-by":"publisher","key":"e_1_3_3_12_2","DOI":"10.1109\/TCAD.2008.925775"},{"key":"e_1_3_3_13_2","first-page":"367","volume-title":"ACM\/IEEE International Symposium on Computer Architecture","author":"Chen Yu-Hsin","year":"2016","unstructured":"Yu-Hsin Chen, Joel S. Emer, and Vivienne Sze. 2016. Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks. In ACM\/IEEE International Symposium on Computer Architecture. 367\u2013379."},{"key":"e_1_3_3_14_2","first-page":"347","volume-title":"Selected Areas in Cryptography","author":"Cheon Jung Hee","year":"2018","unstructured":"Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim, and Yongsoo Song. 2018. A full RNS variant of approximate homomorphic encryption. 
In Selected Areas in Cryptography 11349 (2018), 347\u2013368."},{"key":"e_1_3_3_15_2","first-page":"409","volume-title":"International Conference on the Theory and Applications of Cryptology and Information Security","author":"Cheon Jung Hee","year":"2017","unstructured":"Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. 2017. Homomorphic encryption for arithmetic of approximate numbers. In International Conference on the Theory and Applications of Cryptology and Information Security. 409\u2013437."},{"doi-asserted-by":"publisher","key":"e_1_3_3_16_2","DOI":"10.1007\/s00145-019-09319-x"},{"key":"e_1_3_3_17_2","first-page":"338","volume-title":"IEEE\/ACM International Symposium on Microarchitecture","author":"Deng Xianglong","year":"2024","unstructured":"Xianglong Deng, Shengyu Fan, Zhicheng Hu, Zhuoyu Tian, Zihao Yang, Jiangrui Yu, Dingyuan Cao, Dan Meng, Rui Hou, Meng Li, Qian Lou, and Mingzhe Zhang. 2024. Trinity: A general purpose FHE accelerator. In IEEE\/ACM International Symposium on Microarchitecture. 338\u2013351."},{"key":"e_1_3_3_18_2","first-page":"243:1\u2013243:6","volume-title":"ACM\/IEEE Design Automation Conference","author":"Du Yibo","year":"2024","unstructured":"Yibo Du, Ying Wang, Bing Li, Fuping Li, Shengwen Liang, Huawei Li, Xiaowei Li, and Yinhe Han. 2024. Chiplever: Towards effortless extension of chiplet-based system for FHE. In ACM\/IEEE Design Automation Conference. 243:1\u2013243:6."},{"key":"e_1_3_3_19_2","first-page":"922","volume-title":"IEEE International Symposium on High-Performance Computer Architecture","author":"Fan Shengyu","year":"2023","unstructured":"Shengyu Fan, Zhiwei Wang, Weizhi Xu, Rui Hou, Dan Meng, and Mingzhe Zhang. 2023. TensorFHE: Achieving practical computation on encrypted data using GPGPU. In IEEE International Symposium on High-Performance Computer Architecture. 
922\u2013934."},{"key":"e_1_3_3_20_2","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1145\/3489517.3530428","volume-title":"ACM\/IEEE Design Automation Conference","author":"Feng Yinxiao","year":"2022","unstructured":"Yinxiao Feng and Kaisheng Ma. 2022. Chiplet actuary: A quantitative cost model and multi-chiplet architecture exploration. In ACM\/IEEE Design Automation Conference. 121\u2013126."},{"key":"e_1_3_3_21_2","first-page":"96","volume-title":"International Conference for High Performance Computing, Networking, Storage, and Analysis","author":"Feng Yinxiao","year":"2024","unstructured":"Yinxiao Feng and Kaisheng Ma. 2024. Switch-less dragonfly on wafers: A scalable interconnection architecture based on wafer-scale integration. In International Conference for High Performance Computing, Networking, Storage, and Analysis. 96."},{"key":"e_1_3_3_22_2","first-page":"930","volume-title":"International Symposium on Microarchitecture","author":"Feng Yinxiao","year":"2023","unstructured":"Yinxiao Feng, Dong Xiang, and Kaisheng Ma. 2023. Heterogeneous die-to-die interfaces: Enabling more flexible chiplet interconnection systems. In International Symposium on Microarchitecture. 930\u2013943."},{"key":"e_1_3_3_23_2","first-page":"1059","volume-title":"IEEE International Symposium on High-Performance Computer Architecture","author":"Feng Yinxiao","year":"2023","unstructured":"Yinxiao Feng, Dong Xiang, and Kaisheng Ma. 2023. A scalable methodology for designing efficient interconnection network of chiplets. In IEEE International Symposium on High-Performance Computer Architecture. 1059\u20131071."},{"key":"e_1_3_3_24_2","first-page":"2996","volume-title":"IEEE Conference on Decision and Control","author":"Gammelli Daniele","year":"2021","unstructured":"Daniele Gammelli, Kaidi Yang, James Harrison, Filipe Rodrigues, Francisco C. Pereira, and Marco Pavone. 2021. Graph neural network reinforcement learning for autonomous mobility-on-demand systems. 
In IEEE Conference on Decision and Control. 2996\u20133003."},{"key":"e_1_3_3_25_2","first-page":"75","volume-title":"Cryptology Conference","author":"Gentry Craig","year":"2013","unstructured":"Craig Gentry, Amit Sahai, and Brent Waters. 2013. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Cryptology Conference. 75\u201392."},{"key":"e_1_3_3_26_2","first-page":"351","volume-title":"Asia and South Pacific Design Automation Conference","author":"Kabir MD Arafat","year":"2020","unstructured":"MD Arafat Kabir and Yarui Peng. 2020. Chiplet-package co-design for 2.5D systems using standard ASIC CAD tools. In Asia and South Pacific Design Automation Conference. 351\u2013356."},{"doi-asserted-by":"publisher","key":"e_1_3_3_27_2","DOI":"10.1109\/TVLSI.2011.2178620"},{"key":"e_1_3_3_28_2","first-page":"1237","volume-title":"IEEE\/ACM International Symposium on Microarchitecture","author":"Kim Jongmin","year":"2022","unstructured":"Jongmin Kim, Gwangho Lee, Sangpyo Kim, Gina Sohn, Minsoo Rhu, John Kim, and Jung Ho Ahn. 2022. ARK: Fully homomorphic encryption accelerator with runtime data generation and inter-operation key reuse. In IEEE\/ACM International Symposium on Microarchitecture. 1237\u20131254."},{"key":"e_1_3_3_29_2","first-page":"119","volume-title":"International Symposium on Secure and Private Execution Environment Design","author":"Kim Sangpyo","year":"2024","unstructured":"Sangpyo Kim, Jongmin Kim, Jaeyoung Choi, and Jung Ho Ahn. 2024. CiFHER: A chiplet-based FHE accelerator with a resizable structure. In International Symposium on Secure and Private Execution Environment Design. 119\u2013130."},{"key":"e_1_3_3_30_2","first-page":"711","volume-title":"International Symposium on Computer Architecture","author":"Kim Sangpyo","year":"2022","unstructured":"Sangpyo Kim, Jongmin Kim, Michael Jaemin Kim, Wonkyung Jung, John Kim, Minsoo Rhu, and Jung Ho Ahn. 2022. 
BTS: An accelerator for bootstrappable fully homomorphic encryption. In International Symposium on Computer Architecture. 711\u2013725."},{"key":"e_1_3_3_31_2","first-page":"12:1\u201312:15","volume-title":"International Conference for High Performance Computing, Networking, Storage and Analysis","author":"Lakhotia Kartik","year":"2022","unstructured":"Kartik Lakhotia, Maciej Besta, Laura Monroe, Kelly Isham, Patrick Iff, Torsten Hoefler, and Fabrizio Petrini. 2022. PolarFly: A cost-effective and flexible low-diameter topology. In International Conference for High Performance Computing, Networking, Storage and Analysis. 12:1\u201312:15."},{"doi-asserted-by":"publisher","key":"e_1_3_3_32_2","DOI":"10.1109\/ACCESS.2023.3318433"},{"doi-asserted-by":"publisher","key":"e_1_3_3_33_2","DOI":"10.1109\/TCAD.2023.3328828"},{"key":"e_1_3_3_34_2","first-page":"1","volume-title":"International Conference on ASIC","author":"Li Xiaoyan","year":"2023","unstructured":"Xiaoyan Li, Zizheng Dong, Shuaipeng Li, Sai Gao, Jianfei Jiang, Guanghui He, and Zhigang Mao. 2023. MUG5: Modeling of universal chiplet interconnect express (UCIe) standard based on Gem5. In International Conference on ASIC. 1\u20134."},{"unstructured":"Aixin Liu Bei Feng Bing Xue Bingxuan Wang Bochao Wu Chengda Lu Chenggang Zhao Chengqi Deng Chenyu Zhang Chong Ruan et\u00a0al. 2024. Deepseek-v3 technical report. arXiv:2412.19437. Retrieved from https:\/\/arxiv.org\/abs\/2412.19437","key":"e_1_3_3_35_2"},{"key":"e_1_3_3_36_2","first-page":"2774","volume-title":"International Conference on Robotics and Automation","author":"Liu Zhijian","year":"2023","unstructured":"Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L Rus, and Song Han. 2023. Bevfusion: Multi-task multi-sensor fusion with unified bird\u2019s-eye view representation. In International Conference on Robotics and Automation. 
2774\u20132781."},{"key":"e_1_3_3_37_2","first-page":"84","volume-title":"International System-on-Chip Conference","author":"Lu Siyuan","year":"2020","unstructured":"Siyuan Lu, Meiqi Wang, Shuang Liang, Jun Lin, and Zhongfeng Wang. 2020. Hardware accelerator for multi-head attention and position-wise feed-forward in the transformer. In International System-on-Chip Conference. 84\u201389."},{"key":"e_1_3_3_38_2","first-page":"158:1\u2013158:6","volume-title":"ACM\/IEEE Design Automation Conference","author":"Lu Zhaojun","year":"2024","unstructured":"Zhaojun Lu, Weizong Yu, Peng Xu, Wei Wang, Jiliang Zhang, and Dengguo Feng. 2024. An NTT\/INTT accelerator with ultra-high throughput and area efficiency for FHE. In ACM\/IEEE Design Automation Conference. 158:1\u2013158:6."},{"key":"e_1_3_3_39_2","first-page":"26:1\u201326:6","volume-title":"ACM\/IEEE Design Automation Conference","author":"Mu Jianan","year":"2024","unstructured":"Jianan Mu, Husheng Han, Shangyi Shi, Jing Ye, Zizhen Liu, Shengwen Liang, Meng Li, Mingzhe Zhang, Song Bian, Xing Hu, Huawei Li, and Xiaowei Li. 2024. Alchemist: A unified accelerator architecture for cross-scheme fully homomorphic encryption. In ACM\/IEEE Design Automation Conference. 26:1\u201326:6."},{"key":"e_1_3_3_40_2","first-page":"914","volume-title":"Design Automation Conference","author":"Murali Srinivasan","year":"2004","unstructured":"Srinivasan Murali and Giovanni De Micheli. 2004. SUNMAP: A tool for automatic topology selection and generation for NoCs. In Design Automation Conference. 914\u2013919."},{"doi-asserted-by":"crossref","unstructured":"Naveen Muralimanohar Rajeev Balasubramonian and Norman P. Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP Laboratories 27 28 (2009). 
https:\/\/shiftleft.com\/mirrors\/www.hpl.hp.com\/techreports\/2009\/HPL-2009-85.pdf","key":"e_1_3_3_41_2","DOI":"10.1109\/MM.2008.2"},{"doi-asserted-by":"publisher","key":"e_1_3_3_42_2","DOI":"10.1109\/MM.2024.3423785"},{"unstructured":"Maxim Naumov Dheevatsa Mudigere Hao-Jun Michael Shi Jianyu Huang Narayanan Sundaraman Jongsoo Park Xiaodong Wang Udit Gupta Carole-Jean Wu Alisson G Azzolini et\u00a0al. 2019. Deep learning recommendation model for personalization and recommendation systems. arXiv:1906.00091. Retrieved from https:\/\/arxiv.org\/abs\/1906.00091","key":"e_1_3_3_43_2"},{"doi-asserted-by":"publisher","key":"e_1_3_3_44_2","DOI":"10.1007\/s10994-021-06044-0"},{"key":"e_1_3_3_45_2","first-page":"1","volume-title":"IEEE International Symposium on Circuits and Systems","author":"Reza Md Farhadur","year":"2018","unstructured":"Md Farhadur Reza, Tung Thanh Le, Bappaditya Dey, Magdy A. Bayoumi, and Dan Zhao. 2018. Neuro-NoC: Energy optimization in heterogeneous many-core NoC using neural networks in dark silicon era. In IEEE International Symposium on Circuits and Systems. 1\u20135."},{"key":"e_1_3_3_46_2","first-page":"238","volume-title":"IEEE\/ACM International Symposium on Microarchitecture","author":"Samardzic Nikola","year":"2021","unstructured":"Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Srinivas Devadas, Ronald G. Dreslinski, Christopher Peikert, and Daniel S\u00e1nchez. 2021. F1: A fast and programmable accelerator for fully homomorphic encryption. In IEEE\/ACM International Symposium on Microarchitecture. 238\u2013252."},{"key":"e_1_3_3_47_2","first-page":"173","volume-title":"International Symposium on Computer Architecture","author":"Samardzic Nikola","year":"2022","unstructured":"Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Nathan Manohar, Nicholas Genise, Srinivas Devadas, Karim Eldefrawy, Chris Peikert, and Daniel S\u00e1nchez. 2022. CraterLake: A hardware accelerator for efficient unbounded computation on encrypted data. 
In International Symposium on Computer Architecture. 173\u2013187."},{"doi-asserted-by":"publisher","key":"e_1_3_3_48_2","DOI":"10.1145\/3642921.3642956"},{"issue":"5","key":"e_1_3_3_49_2","first-page":"132:1\u2013132:21","article-title":"Florets for chiplets: Data flow-aware high-performance and energy-efficient network-on-interposer for CNN inference tasks","volume":"22","author":"Sharma Harsh","year":"2023","unstructured":"Harsh Sharma, Lukas Pfromm, Rasit Onur Topaloglu, Janardhan Rao Doppa, \u00dcmit Y. Ogras, Ananth Kalyanaraman, and Partha Pratim Pande. 2023. Florets for chiplets: Data flow-aware high-performance and energy-efficient network-on-interposer for CNN inference tasks. ACM Transactions on Embedded Computing Systems 22, 5s (2023), 132:1\u2013132:21.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_3_50_2","first-page":"201","volume-title":"IEEE\/ACM International Symposium on Networks-on-Chip","author":"Sun Chen","year":"2012","unstructured":"Chen Sun, Chia-Hsin Owen Chen, George Kurian, Lan Wei, Jason E. Miller, Anant Agarwal, Li-Shiuan Peh, and Vladimir Stojanovic. 2012. DSENT\u2014a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In IEEE\/ACM International Symposium on Networks-on-Chip. 201\u2013210."},{"key":"e_1_3_3_51_2","first-page":"1047","volume-title":"Design, Automation & Test in Europe Conference & Exhibition","author":"Taheri Ebadollah","year":"2022","unstructured":"Ebadollah Taheri, Sudeep Pasricha, and Mahdi Nikdast. 2022. DeFT: A deadlock-free and fault-tolerant routing algorithm for 2.5D chiplet networks. In Design, Automation & Test in Europe Conference & Exhibition. 1047\u20131052."},{"unstructured":"Tianqi Tang and Yuan Xie. 2022. Cost-aware exploration for chiplet-based architecture with advanced packaging technologies. arXiv:2206.07308. 
Retrieved from https:\/\/arxiv.org\/abs\/2206.07308","key":"e_1_3_3_52_2"},{"doi-asserted-by":"publisher","key":"e_1_3_3_53_2","DOI":"10.1109\/TVLSI.2024.3455332"},{"doi-asserted-by":"publisher","key":"e_1_3_3_54_2","DOI":"10.1016\/j.compbiolchem.2024.108183"},{"doi-asserted-by":"publisher","key":"e_1_3_3_55_2","DOI":"10.1109\/TC.2024.3500354"},{"key":"e_1_3_3_56_2","first-page":"357","volume-title":"IEEE International Conference on Computer Design","author":"Wang Zhiwei","year":"2023","unstructured":"Zhiwei Wang, Peinan Li, Rui Hou, and Dan Meng. 2023. NTTFusion: Efficient number theoretic transform acceleration on GPUs. In IEEE International Conference on Computer Design. 357\u2013365."},{"key":"e_1_3_3_57_2","first-page":"273:1\u2013273:6","volume-title":"ACM\/IEEE Design Automation Conference","author":"Wei Yuntao","year":"2024","unstructured":"Yuntao Wei, Xueyan Wang, Song Bian, Yicheng Huang, Weisheng Zhao, and Yier Jin. 2024. PPGNN: Fast and accurate privacy-preserving graph neural network inference via parallel and pipelined arithmetic-and-logic FHE accelerator. In ACM\/IEEE Design Automation Conference. 273:1\u2013273:6."},{"key":"e_1_3_3_58_2","first-page":"870","volume-title":"IEEE International Symposium on High-Performance Computer Architecture","author":"Yang Yinghao","year":"2023","unstructured":"Yinghao Yang, Huaizhi Zhang, Shengyu Fan, Hang Lu, Mingzhe Zhang, and Xiaowei Li. 2023. Poseidon: Practical homomorphic encryption accelerator. In IEEE International Symposium on High-Performance Computer Architecture. 870\u2013881."},{"issue":"1","key":"e_1_3_3_59_2","first-page":"19","article-title":"A 2.29-pJ\/b 112-Gb\/s wireline transceiver with RX four-tap FFE for medium-reach applications in 28-nm CMOS","volume":"58","author":"Ye Bingyi","year":"2022","unstructured":"Bingyi Ye, Kai Sheng, Weixin Gai, Haowei Niu, Boyang Zhang, Yandong He, Song Jia, Congcong Chen, and Jiaqi Yu. 2022. 
A 2.29-pJ\/b 112-Gb\/s wireline transceiver with RX four-tap FFE for medium-reach applications in 28-nm CMOS. IEEE Journal of Solid-State Circuits 58, 1 (2022), 19\u201329.","journal-title":"IEEE Journal of Solid-State Circuits"},{"doi-asserted-by":"publisher","key":"e_1_3_3_60_2","DOI":"10.1109\/TCAD.2023.3332832"},{"doi-asserted-by":"publisher","key":"e_1_3_3_61_2","DOI":"10.1109\/TVLSI.2024.3438549"},{"key":"e_1_3_3_62_2","first-page":"1","volume-title":"ACM International Conference on Nanoscale Computing and Communication","author":"Zhi Haocong","year":"2021","unstructured":"Haocong Zhi, Xianuo Xu, Weijian Han, Zhilin Gao, Xiaohang Wang, Maurizio Palesi, Amit Kumar Singh, and Letian Huang. 2021. A methodology for simulating multi-chiplet systems using open-source simulators. In ACM International Conference on Nanoscale Computing and Communication. 1\u20136."},{"key":"e_1_3_3_63_2","first-page":"144","volume-title":"International Symposium on Quality Electronic Design","author":"Zhong Wei","year":"2011","unstructured":"Wei Zhong, Bei Yu, Song Chen, Takeshi Yoshimura, Sheqin Dong, and Satoshi Goto. 2011. Application-specific network-on-chip synthesis: Cluster generation and network component insertion. In International Symposium on Quality Electronic Design. 144\u2013149."},{"key":"e_1_3_3_64_2","first-page":"352","volume-title":"IEEE\/ACM International Symposium on Microarchitecture","author":"Zhou Minxuan","year":"2024","unstructured":"Minxuan Zhou, Yujin Nam, Xuan Wang, Youhak Lee, Chris Wilkerson, Raghavan Kumar, Sachin Taneja, Sanu Mathew, Rosario Cammarota, and Tajana Rosing. 2024. UFC: A unified accelerator for fully homomorphic encryption. In IEEE\/ACM International Symposium on Microarchitecture. 352\u2013365."},{"unstructured":"Shengyu Zhu Ignavier Ng and Zhitang Chen. 2019. Causal discovery with reinforcement learning. arXiv:1906.04477. 
Retrieved from https:\/\/arxiv.org\/abs\/1906.04477","key":"e_1_3_3_65_2"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3762995","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,3]],"date-time":"2025-10-03T14:06:43Z","timestamp":1759500403000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3762995"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,26]]},"references-count":64,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3762995"],"URL":"https:\/\/doi.org\/10.1145\/3762995","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2025,9,26]]},"assertion":[{"value":"2025-08-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-26","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}