{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,28]],"date-time":"2025-11-28T12:37:50Z","timestamp":1764333470031,"version":"3.41.0"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,11,7]],"date-time":"2024-11-07T00:00:00Z","timestamp":1730937600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2022YFB4400704"],"award-info":[{"award-number":["2022YFB4400704"]}]},{"name":"Hong Kong Innovation and Technology Commission","award":["ITF Seed Fund ITS\/098\/22"],"award-info":[{"award-number":["ITF Seed Fund ITS\/098\/22"]}]},{"DOI":"10.13039\/100007567","name":"City University of Hong Kong","doi-asserted-by":"crossref","award":["9440356"],"award-info":[{"award-number":["9440356"]}],"id":[{"id":"10.13039\/100007567","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Hong Kong Innovation and Technology Commission","award":["InnoHK Project CIMDA"],"award-info":[{"award-number":["InnoHK Project CIMDA"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62372417"],"award-info":[{"award-number":["62372417"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Guangdong Provincial Key Laboratory IRADS","award":["2022B1212010006, R0400001-22"],"award-info":[{"award-number":["2022B1212010006, R0400001-22"]}]},{"name":"Guangdong Basic and Applied Basic Research Foundation-General","award":["2024A1515011274"],"award-info":[{"award-number":["2024A1515011274"]}]},{"name":"Guangdong Province General Universities Key Field","award":["2023ZDZX1033"],"award-info":[{"award-number":["2023ZDZX1033"]}]},{"name":"UIC Research","award":["UICR04202401-21"],"award-info":[{"award-number":["UICR04202401-21"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2024,12,31]]},"abstract":"<jats:p>Lattice-based cryptography (LBC) has been established as a prominent research field, with particular attention on post-quantum cryptography (PQC) and fully homomorphic encryption (FHE). As the implementing bottleneck of PQC and FHE, number theoretic transform (NTT) has been extensively studied. However, current works struggled with scalability, hindering their adaptation to various parameters, such as bit width and polynomial length. In this article, we proposed a novel Discrete Galois Transformation (DGT) algorithm utilizing the radix-4 variant to achieve a higher level of parallelism to the existing NTT. Furthermore, to implement the efficient radix-4 DGT adapting more LBCs, we proposed a set of scalable building blocks, including a modified Barrett modular multiplier accepting arbitrary modulus with only one integer multiplier, a radix-4 DGT butterfly unit, and a stream permutation network. The proposed modules are implemented on the Xilinx Virtex-7 and U250 FPGA to evaluate resource utilization and performance. Lastly, a design space exploration framework is proposed to generate optimized radix-4 DGT hardware constrained by polynomial and platform parameters. The sensitivity analysis showcases the generated hardware\u2019s performance and scalability. The implementation results on the Xilinx Virtex-7 and U250 FPGA show significant performance improvements over the state-of-the-art works, which reached at least 35%, 192%, and 68% area-time product improvements in terms of LUTs, BRAMs, and DSPs, respectively.<\/jats:p>","DOI":"10.1145\/3689437","type":"journal-article","created":{"date-parts":[[2024,8,24]],"date-time":"2024-08-24T13:33:01Z","timestamp":1724506381000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["ProgramGalois: A Programmable Generator of Radix-4 Discrete Galois Transformation Architecture for Lattice-Based Cryptography"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8399-9467","authenticated-orcid":false,"given":"Guangyan","family":"Li","sequence":"first","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3623-3554","authenticated-orcid":false,"given":"Zewen","family":"Ye","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, China and Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5357-7442","authenticated-orcid":false,"given":"Donglong","family":"Chen","sequence":"additional","affiliation":[{"name":"BNU-HKBU United International College, Zhuhai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5192-1649","authenticated-orcid":false,"given":"Wangchen","family":"Dai","sequence":"additional","affiliation":[{"name":"Sun Yat-sen University, Shenzhen, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7403-3081","authenticated-orcid":false,"given":"Gaoyu","family":"Mao","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3722-9979","authenticated-orcid":false,"given":"Kejie","family":"Huang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6764-0729","authenticated-orcid":false,"given":"Ray C. C.","family":"Cheung","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2024,11,7]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"AMD. 2018. 7 Series DSP48E1 Slice: User Guide. Retrieved from https:\/\/www.xilinx.com\/support\/documentation\/user_guides\/ug479_7Series_DSP48E1.pdf"},{"key":"e_1_3_1_3_2","unstructured":"Github. 2024. SpinalHDL. Retrieved from January 16 2024 https:\/\/github.com\/SpinalHDL"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3214303"},{"key":"e_1_3_1_5_2","first-page":"206","volume-title":"2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920)","author":"Agrawal Rashmi","year":"2020","unstructured":"Rashmi Agrawal, Lake Bu, and Michel A. Kinsy. 2020. Fast arithmetic hardware library for RLWE-based homomorphic encryption. In 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920). IEEE, 206\u2013206."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TETC.2019.2902799"},{"key":"e_1_3_1_7_2","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1007\/978-3-030-03402-3_47","volume-title":"Advances in Information and Communication Networks: Proceedings of the 2018 Future of Information and Communication Conference (FICC \u201919)","volume":"1","author":"Al Badawi Ahmad","year":"2019","unstructured":"Ahmad Al Badawi, Bharadwaj Veeravalli, and Khin Mi Mi Aung. 2019. Efficient polynomial multiplication via modified discrete galois transform and negacyclic convolution. In Advances in Information and Communication Networks: Proceedings of the 2018 Future of Information and Communication Conference (FICC \u201919), Vol. 1. Springer, 666\u2013682."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228584"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/165564.165571"},{"key":"e_1_3_1_10_2","first-page":"311","volume-title":"Advances in Cryptology\u2014CRYPTO\u201986: Proceedings","author":"Barrett Paul","year":"2000","unstructured":"Paul Barrett. 2000. Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor. In Advances in Cryptology\u2014CRYPTO\u201986: Proceedings. Springer, 311\u2013323."},{"key":"e_1_3_1_11_2","first-page":"353","volume-title":"2018 IEEE European Symposium on Security and Privacy (EuroS & P \u201918)","author":"Bos Joppe","year":"2018","unstructured":"Joppe Bos, L\u00e9o Ducas, Eike Kiltz, Tancr\u00e8de Lepoint, Vadim Lyubashevsky, John M. Schanck, Peter Schwabe, Gregor Seiler, and Damien Stehl\u00e9. 2018. CRYSTALS-Kyber: A CCA-secure module-lattice-based KEM. In 2018 IEEE European Symposium on Security and Privacy (EuroS & P \u201918). IEEE, 353\u2013367."},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/2633600"},{"key":"e_1_3_1_13_2","first-page":"94","article-title":"CFNTT: Scalable radix-2\/4 NTT multiplication architecture with an efficient conflict-free memory mapping scheme","volume":"1","author":"Chen Xiangren","year":"2022","unstructured":"Xiangren Chen, Bohan Yang, Shouyi Yin, Shaojun Wei, and Leibo Liu. 2022. CFNTT: Scalable radix-2\/4 NTT multiplication architecture with an efficient conflict-free memory mapping scheme. IACR Transactions on Cryptographic Hardware and Embedded Systems, 1 (2022), 94\u2013126.","journal-title":"IACR Transactions on Cryptographic Hardware and Embedded Systems"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1007\/978-3-319-70694-8_15","volume-title":"Advances in Cryptology\u2013ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security","author":"Cheon Jung Hee","year":"2017","unstructured":"Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. 2017. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology\u2013ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security. Springer, 409\u2013437."},{"key":"e_1_3_1_15_2","volume-title":"Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms","author":"Chu Eleanor","year":"1999","unstructured":"Eleanor Chu and Alan George. 1999. Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms. CRC press."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.46586\/tches.v2018.i1.238-268"},{"key":"e_1_3_1_17_2","unstructured":"Junfeng Fan and Frederik Vercauteren. 2012. Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive. Retrieved from https:\/\/eprint.iacr.org\/2012\/144"},{"key":"e_1_3_1_18_2","first-page":"1","article-title":"Falcon: Fast-Fourier lattice-based compact signatures over NTRU","volume":"36","author":"Fouque Pierre-Alain","year":"2018","unstructured":"Pierre-Alain Fouque, Jeffrey Hoffstein, Paul Kirchner, Vadim Lyubashevsky, Thomas Pornin, Thomas Prest, Thomas Ricosset, Gregor Seiler, William Whyte, and Zhenfei Zhang. 2018. Falcon: Fast-Fourier lattice-based compact signatures over NTRU. Submission to the NIST\u2019s Post-Quantum Cryptography Standardization Process 36 (2018), 1\u201375.","journal-title":"Submission to the NIST\u2019s Post-Quantum Cryptography Standardization Process"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/1464291.1464352"},{"key":"e_1_3_1_20_2","first-page":"1","volume-title":"2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig \u201919)","author":"Kim Sunwoong","year":"2019","unstructured":"Sunwoong Kim, Keewoo Lee, Wonhee Cho, Jung Hee Cheon, and Rob A. Rutenbar. 2019. FPGA-based accelerators of fully pipelined modular multipliers for homomorphic encryption. In 2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig \u201919). IEEE, 1\u20138."},{"key":"e_1_3_1_21_2","first-page":"56","volume-title":"2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920)","author":"Kim Sunwoong","year":"2020","unstructured":"Sunwoong Kim, Keewoo Lee, Wonhee Cho, Yujin Nam, Jung Hee Cheon, and Rob A. Rutenbar. 2020. Hardware architecture of a number theoretic transform for a bootstrappable RNS-based homomorphic encryption scheme. In 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920). IEEE, 56\u201364."},{"issue":"4","key":"e_1_3_1_22_2","first-page":"1","article-title":"Algorithm-hardware co-design of split-radix discrete galois transformation for KyberKEM","volume":"11","author":"Li Guangyan","year":"2023","unstructured":"Guangyan Li, Donglong Chen, Gaoyu Mao, Wangchen Dai, Abdurrashid Ibrahim Sanka, and Ray C. C. Cheung. 2023. Algorithm-hardware co-design of split-radix discrete galois transformation for KyberKEM. IEEE Transactions on Emerging Topics in Computing, 11, 4 (2023), 1\u201315.","journal-title":"IEEE Transactions on Emerging Topics in Computing"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"Ahmet Can Mert Aikata Sunmin Kwon Youngsam Shin Donghoon Yoo Yongwoo Lee Sujoy Sinha Roy. 2022. Medha: Microcoded hardware accelerator for computing on encrypted data. arXiv:2210.05476. Retrieved from https:\/\/eprint.iacr.org\/2022\/480","DOI":"10.46586\/tches.v2023.i1.463-500"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2019.2943127"},{"issue":"1","key":"e_1_3_1_25_2","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1109\/TC.2016.2574340","article-title":"A custom accelerator for homomorphic encryption applications","volume":"66","author":"\u00d6zt\u00fcrk Erdin\u00e7","year":"2016","unstructured":"Erdin\u00e7 \u00d6zt\u00fcrk, Yarki\u0307n Dor\u00f6z, Erkay Sava\u015f, and Berk Sunar. 2016. A custom accelerator for homomorphic encryption applications. IEEE Transactions on Computers 66, 1 (2016), 3\u201316.","journal-title":"IEEE Transactions on Computers"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/HPCA51647.2021.00013","volume-title":"2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA \u201921)","author":"Reagen Brandon","year":"2021","unstructured":"Brandon Reagen, Woo-Seok Choi, Yeongil Ko, Vincent T. Lee, Hsien-Hsin S. Lee, Gu-Yeon Wei, and David Brooks. 2021. Cheetah: Optimizing and accelerating homomorphic encryption for private inference. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA \u201921). IEEE, 26\u201339."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2020.3017595"},{"key":"e_1_3_1_28_2","first-page":"1295","volume-title":"25th International Conference on Architectural Support for Programming Languages and Operating Systems","author":"Riazi M. Sadegh","year":"2020","unstructured":"M. Sadegh Riazi, Kim Laine, Blake Pelton, and Wei Dai. 2020. HEAX: An architecture for computing on encrypted data. In 25th International Conference on Architectural Support for Programming Languages and Operating Systems, 1295\u20131309."},{"key":"e_1_3_1_29_2","unstructured":"Sujoy Sinha Roy Ahmet Can Mert Aikata Sunmin Kwon Youngsam Shin and Donghoon Yoo. 2021. Accelerator for Computing on Encrypted Data. Cryptology ePrint Archive Paper 2021\/1555. Retrieved from https:\/\/eprint.iacr.org\/2021\/1555"},{"key":"e_1_3_1_30_2","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1109\/HPCA.2019.00052","volume-title":"2019 IEEE International Symposium on High Performance Computer Architecture (HPCA 19)","author":"Roy Sujoy Sinha","year":"2019","unstructured":"Sujoy Sinha Roy, Furkan Turan, Kimmo Jarvinen, Frederik Vercauteren, and Ingrid Verbauwhede. 2019. FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA 19). IEEE, 387\u2013398."},{"key":"e_1_3_1_31_2","first-page":"371","volume-title":"Cryptographic Hardware and Embedded Systems\u2013CHES 2014: 16th International Workshop","author":"Roy Sujoy Sinha","year":"2014","unstructured":"Sujoy Sinha Roy, Frederik Vercauteren, Nele Mentens, Donald Donglong Chen, and Ingrid Verbauwhede. 2014. Compact ring-LWE cryptoprocessor. In Cryptographic Hardware and Embedded Systems\u2013CHES 2014: 16th International Workshop. Springer, 371\u2013391."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480070"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1145\/3470496.3527393","volume-title":"49th Annual International Symposium on Computer Architecture","author":"Samardzic Nikola","year":"2022","unstructured":"Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Nathan Manohar, Nicholas Genise, Srinivas Devadas, Karim Eldefrawy, Chris Peikert, and Daniel Sanchez. 2022. Craterlake: A hardware accelerator for efficient unbounded computation on encrypted data. In 49th Annual International Symposium on Computer Architecture, 173\u2013187."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/2847263.2847277"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.laa.2016.07.020"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2022.3166355"},{"key":"e_1_3_1_37_2","first-page":"1","volume-title":"2021 IEEE International Symposium on Circuits and Systems (ISCAS \u201921)","author":"Xin Guozhu","year":"2021","unstructured":"Guozhu Xin, Yifan Zhao, and Jun Han. 2021. A multi-layer parallel hardware architecture for homomorphic computation in machine learning. In 2021 IEEE International Symposium on Circuits and Systems (ISCAS \u201921). IEEE, 1\u20135."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.46586\/tches.v2021.i2.328-356"},{"key":"e_1_3_1_39_2","first-page":"30","volume-title":"19th ACM International Conference on Computing Frontiers","author":"Yang Yang","year":"2022","unstructured":"Yang Yang, Sanmukh R. Kuppannagari, Rajgopal Kannan, and Viktor K. Prasanna. 2022. NTTGen: A framework for generating low latency NTT implementations on FPGA. In 19th ACM International Conference on Computing Frontiers, 30\u201339."},{"key":"e_1_3_1_40_2","first-page":"1","volume-title":"2022 IEEE High Performance Extreme Computing Conference (HPEC \u201922)","author":"Ye Tian","year":"2022","unstructured":"Tian Ye, Rajgopal Kannan, and Viktor K. Prasanna. 2022. FPGA acceleration of fully homomorphic encryption over the torus. In 2022 IEEE High Performance Extreme Computing Conference (HPEC \u201922). IEEE, 1\u20137."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSII.2022.3184703"},{"issue":"4","key":"e_1_3_1_42_2","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1109\/TC.2019.2958334","article-title":"NTTU: An area-efficient low-power NTT-uncoupled architecture for NTT-based multiplication","volume":"69","author":"Zhang Neng","year":"2019","unstructured":"Neng Zhang, Qiao Qin, Hang Yuan, Chenggao Zhou, Shouyi Yin, Shaojun Wei, and Leibo Liu. 2019. NTTU: An area-efficient low-power NTT-uncoupled architecture for NTT-based multiplication. IEEE Transactions on Computers 69, 4 (2019), 520\u2013533.","journal-title":"IEEE Transactions on Computers"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.46586\/tches.v2020.i2.49-72"}],"container-title":["ACM Transactions on Reconfigurable Technology and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689437","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3689437","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:45Z","timestamp":1750291545000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689437"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,7]]},"references-count":42,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,12,31]]}},"alternative-id":["10.1145\/3689437"],"URL":"https:\/\/doi.org\/10.1145\/3689437","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"type":"print","value":"1936-7406"},{"type":"electronic","value":"1936-7414"}],"subject":[],"published":{"date-parts":[[2024,11,7]]},"assertion":[{"value":"2023-09-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-07","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}