{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T15:03:16Z","timestamp":1772463796346,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2024,7,11]],"date-time":"2024-07-11T00:00:00Z","timestamp":1720656000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Vietnam National University Ho Chi Minh City (VNU-HCM)","award":["NCM2021-20-02"],"award-info":[{"award-number":["NCM2021-20-02"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>In the modern era of the Internet of Things (IoT), especially with the rapid development of quantum computers, the implementation of postquantum cryptography algorithms in numerous terminals allows them to defend against potential future quantum attack threats. Lattice-based cryptography can withstand quantum computing attacks, making it a viable substitute for the currently prevalent classical public-key cryptography technique. However, the algorithm\u2019s significant time complexity places a substantial computational burden on the already resource-limited chip in the IoT terminal. In lattice-based cryptography algorithms, the polynomial multiplication on the finite field is well known as the most time-consuming process. Therefore, investigations into efficient methods for calculating polynomial multiplication are essential for adopting these quantum-resistant lattice-based algorithms on a low-profile IoT terminal. Number theoretic transform (NTT), a variant of fast Fourier transform (FFT), is a technique widely employed to accelerate polynomial multiplication on the finite field to achieve a subquadratic time complexity. 
This study presents an efficient FPGA-based implementation of the number theoretic transform for CRYSTALS-Kyber, a lattice-based public-key cryptography algorithm. Our hybrid design, which supports both forward and inverse NTT, can run at frequencies up to 417 MHz on a low-profile Artix7-XC7A100T and achieves a low latency of 1.10 \u03bcs with state-of-the-art hardware efficiency, consuming only 541 LUTs, 680 FFs, and four 18 Kb BRAMs. This is made possible by the newly proposed multilevel pipeline butterfly unit architecture combined with an effective coefficient access pattern.<\/jats:p>","DOI":"10.3390\/info15070400","type":"journal-article","created":{"date-parts":[[2024,7,11]],"date-time":"2024-07-11T15:59:48Z","timestamp":1720713588000},"page":"400","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Compact and Low-Latency FPGA-Based Number Theoretic Transform Architecture for CRYSTALS Kyber Postquantum Cryptography Scheme"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7240-7203","authenticated-orcid":false,"given":"Binh","family":"Kieu-Do-Nguyen","sequence":"first","affiliation":[{"name":"Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet St., Dist. 10, Ho Chi Minh City 740050, Vietnam"},{"name":"Computer Engineering Department, Vietnam National University\u2014Ho Chi Minh City (VNU-HCM), Thu Duc, Ho Chi Minh City 700000, Vietnam"},{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), Tokyo 182-8585, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-9173-9450","authenticated-orcid":false,"given":"Nguyen","family":"The Binh","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet St., Dist. 
10, Ho Chi Minh City 740050, Vietnam"},{"name":"Computer Engineering Department, Vietnam National University\u2014Ho Chi Minh City (VNU-HCM), Thu Duc, Ho Chi Minh City 700000, Vietnam"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2917-1244","authenticated-orcid":false,"given":"Cuong","family":"Pham-Quoc","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet St., Dist. 10, Ho Chi Minh City 740050, Vietnam"},{"name":"Computer Engineering Department, Vietnam National University\u2014Ho Chi Minh City (VNU-HCM), Thu Duc, Ho Chi Minh City 700000, Vietnam"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5583-6361","authenticated-orcid":false,"given":"Huynh Phuc","family":"Nghi","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet St., Dist. 10, Ho Chi Minh City 740050, Vietnam"},{"name":"Computer Engineering Department, Vietnam National University\u2014Ho Chi Minh City (VNU-HCM), Thu Duc, Ho Chi Minh City 700000, Vietnam"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3899-7566","authenticated-orcid":false,"given":"Ngoc-Thinh","family":"Tran","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet St., Dist. 
10, Ho Chi Minh City 740050, Vietnam"},{"name":"Computer Engineering Department, Vietnam National University\u2014Ho Chi Minh City (VNU-HCM), Thu Duc, Ho Chi Minh City 700000, Vietnam"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4078-0836","authenticated-orcid":false,"given":"Trong-Thuc","family":"Hoang","sequence":"additional","affiliation":[{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), Tokyo 182-8585, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5255-4919","authenticated-orcid":false,"given":"Cong-Kha","family":"Pham","sequence":"additional","affiliation":[{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), Tokyo 182-8585, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,11]]},"reference":[{"key":"ref_1","first-page":"1484","article-title":"Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer","volume":"26","author":"Shor","year":"1997","journal-title":"SIAM J. Sci. Statist. Comput."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Grover, L.K. (1996, January 22\u201324). A Fast Quantum Mechanical Algorithm for Database Search. Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, Philadelphia, PA, USA.","DOI":"10.1145\/237814.237866"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1145\/359340.359342","article-title":"A Method for Obtaining Digital Signatures and Public-key Cryptosystems","volume":"21","author":"Rivest","year":"1978","journal-title":"Commun. ACM"},{"key":"ref_4","unstructured":"Miller, V.S. (1985, January 18\u201322). Use of Elliptic Curves in Cryptography. Proceedings of the Advances in Cryptology (CRYPTO), Santa Barbara, CA, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bos, J., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V., Schanck, J.M., Schwabe, P., Seiler, G., and Stehle, D.D. 
(2018, January 24\u201326). CRYSTALS\u2014Kyber: A CCA-Secure Module-Lattice-Based KEM. In Proceedings of the European Symposium on Security and Privacy (EuroS&P), London, UK.","DOI":"10.1109\/EuroSP.2018.00032"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Lyubashevsky, V., Peikert, C., and Regev, O. (2010, January 15\u201319). On Ideal Lattices and Learning with Errors Over Rings. Proceedings of the Advances in Cryptology (EUROCRYPT), Santa Barbara, CA, USA.","DOI":"10.1007\/978-3-642-13190-5_1"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lindner, R., and Peikert, C. (2011, January 14\u201318). Better Key Sizes (and Attacks) for LWE-Based Encryption. Proceedings of the Topics in Cryptology (CT-RSA), San Francisco, CA, USA.","DOI":"10.1007\/978-3-642-19074-2_21"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Das, M., and Jajodia, B.B. (2022, January 19\u201322). Hardware Design of Optimized Large Integer Schoolbook Polynomial Multiplications on FPGA. Proceedings of the International SoC Design Conference (ISOCC), Gangneung-si, Republic of Korea.","DOI":"10.1109\/ISOCC56007.2022.10031366"},{"key":"ref_9","unstructured":"Zhang, Y., Cui, Y., Ni, Z., Kundi, D., Liu, D., and Liu, W. (June, January 27). A Lightweight and Efficient Schoolbook Polynomial Multiplier for Saber. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA."},{"key":"ref_10","first-page":"5079","article-title":"Area-Time-Efficient Scalable Schoolbook Polynomial Multiplier for Lattice-Based Cryptography","volume":"69","author":"Birgani","year":"2022","journal-title":"IEEE Trans. Circ. Syst. II Express Briefs (TCAS-II)"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yang, S., Liu, D., Hu, A., Li, A., Zhang, J., Li, X., Lu, J., and Mo, C. (2022, January 14\u201316). An Instruction-configurable Post-quantum Cryptographic Processor Towards NTRU. 
Proceedings of the Asian Hardware Oriented Security and Trust Symposium (AsianHOST), Singapore.","DOI":"10.1109\/AsianHOST56390.2022.10022178"},{"key":"ref_12","first-page":"1830","article-title":"KaratSaber: New Speed Records for Saber Polynomial Multiplication Using Efficient Karatsuba FPGA Architecture","volume":"72","author":"Wong","year":"2023","journal-title":"IEEE Trans. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2383","DOI":"10.1109\/JSSC.2023.3253425","article-title":"A 334 \u03bcW 0.158 mm2 ASIC for Post-Quantum Key-Encapsulation Mechanism Saber With Low-Latency Striding Toom\u2013Cook Multiplication","volume":"58","author":"Ghosh","year":"2023","journal-title":"IEEE J. Solid-State Circ."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1153","DOI":"10.1109\/TVLSI.2023.3277865","article-title":"TCPM: A Reconfigurable and Efficient Toom-Cook-Based Polynomial Multiplier Over Rings Using a Novel Compressed Postprocessing Algorithm","volume":"31","author":"Wang","year":"2023","journal-title":"IEEE Trans. Very Large Scale Integr. (Vlsi) Syst."},{"key":"ref_15","first-page":"4068","article-title":"PipeNTT: A Pipelined Number Theoretic Transform Architecture","volume":"69","author":"Ye","year":"2022","journal-title":"IEEE Trans. Circ. Syst. II Express Briefs (TCAS-II)"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Xin, M., Xu, C., Huang, K., Yu, H., Yao, H., Jiang, X., and Liu, D. (2022, January 14\u201316). Implementation of Number Theoretic Transform Unit for Polynomial Multiplication of Lattice-based Cryptography. 
Proceedings of the International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China.","DOI":"10.1109\/ICCECE54139.2022.9712707"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"34918","DOI":"10.1109\/ACCESS.2024.3371581","article-title":"High-Speed NTT Accelerator for CRYSTAL-Kyber and CRYSTAL-Dilithium","volume":"12","author":"Nguyen","year":"2024","journal-title":"IEEE Access"},{"key":"ref_18","first-page":"265","article-title":"Nachlass: Theoria Interpolationis Methodo Nova Tractata","volume":"3","author":"Gauss","year":"1866","journal-title":"Carl Friedrich Gauss"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1090\/S0025-5718-1971-0301966-0","article-title":"The Fast Fourier Transform in a Finite Field","volume":"25","author":"Pollard","year":"1971","journal-title":"Math. Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","article-title":"An Algorithm for the Machine Calculation of Complex Fourier Series","volume":"19","author":"Cooley","year":"1965","journal-title":"Math. Comput."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, C., Liu, D., Liu, X., Zou, X., Niu, G., Liu, B., and Jiang, Q. (2021, January 22\u201328). Towards Efficient Hardware Implementation of NTT for Kyber on FPGAs. Proceedings of the 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea.","DOI":"10.1109\/ISCAS51556.2021.9401170"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ni, Z., Khalid, A., Liu, W., and O\u2019Neill, M. (2023, January 21\u201325). Towards a Lightweight CRYSTALS-Kyber in FPGAs: An Ultra-lightweight BRAM-free NTT Core. 
Proceedings of the IEEE International Symposium on Circuits and Systems 2023, Monterey, CA, USA.","DOI":"10.1109\/ISCAS46773.2023.10181340"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Imran, M., Khan, S., Khalid, A., Rafferty, C., Shah, Y.A., Pagliarini, S., Rashid, M., and O\u2019Neill, M. (IEEE Embed. Syst. Lett., 2024). Evaluating NTT\/INTT Implementation Styles for Post-Quantum Cryptography, IEEE Embed. Syst. Lett., early access.","DOI":"10.1109\/LES.2024.3410516"},{"key":"ref_24","unstructured":"Ge, C., and Yung, M. (2024). Hardware Acceleration of NTT-Based Polynomial Multiplication in CRYSTALS-Kyber. Information Security and Cryptology, Springer."},{"key":"ref_25","unstructured":"Longa, P., and Naehrig, M. (2024, July 07). Speeding Up the Number Theoretic Transform for Faster Ideal Lattice-Based Cryptography. Cryptology ePrint Archive, Paper 2016\/504, 2016. Available online: https:\/\/eprint.iacr.org\/2016\/504."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4648","DOI":"10.1109\/TCSI.2021.3106639","article-title":"Instruction-Set Accelerated Implementation of CRYSTALS-Kyber","volume":"68","author":"Niasar","year":"2021","journal-title":"IEEE Trans. Circ. Syst. I Regul. Pap. (TCAS-I)"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bisheh-Niasar, M., Azarderakhsh, R., and Mozaffari-Kermani, M. (2021, January 14\u201316). High-Speed NTT-based Polynomial Multiplication Accelerator for Post-Quantum Cryptography. Proceedings of the 2021 IEEE 28th Symposium on Computer Arithmetic (ARITH), Lyngby, Denmark.","DOI":"10.1109\/ARITH51176.2021.00028"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yaman, F., Mert, A.C., \u00d6zt\u00fcrk, E., and Savas, E. (2021, January 1\u20135). A Hardware Accelerator for Polynomial Multiplication Operation of CRYSTALS-KYBER PQC Scheme. 
Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France.","DOI":"10.23919\/DATE51398.2021.9474139"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"328","DOI":"10.46586\/tches.v2021.i2.328-356","article-title":"A Compact Hardware Implementation of CCA-Secure Key Exchange Mechanism CRYSTALS-KYBER on FPGA","volume":"2021","author":"Xing","year":"2021","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"25501","DOI":"10.1109\/ACCESS.2024.3367109","article-title":"CRYPHTOR: A Memory-Unified NTT-Based Hardware Accelerator for Post-Quantum CRYSTALS Algorithms","volume":"12","author":"Matteo","year":"2024","journal-title":"IEEE Access"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/7\/400\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:15:17Z","timestamp":1760109317000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/7\/400"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,11]]},"references-count":30,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2024,7]]}},"alternative-id":["info15070400"],"URL":"https:\/\/doi.org\/10.3390\/info15070400","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,11]]}}}