{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T05:03:08Z","timestamp":1769749388646,"version":"3.49.0"},"reference-count":51,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,9,25]],"date-time":"2023-09-25T00:00:00Z","timestamp":1695600000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Cryptography"],"abstract":"<jats:p>The Number Theoretic Transform (NTT) has been widely used to speed up polynomial multiplication in lattice-based post-quantum algorithms. All NTT operands use modular arithmetic, especially modular multiplication, which significantly influences NTT hardware implementation efficiency. Until now, most hardware implementations used Digital Signal Processing (DSP) to multiply two integers and optimally perform modulo computations from the multiplication product. This paper presents a customized Lattice-DSP (L-DSP) for modular multiplication based on the Karatsuba algorithm, Vedic multiplier, and modular reduction methods. The proposed L-DSP performs both integer multiplication and modular reduction simultaneously for lattice-based cryptography. As a result, the speed and area efficiency of the L-DSPs are 283 MHz for 77 SLICEs, 272 MHz for 87 SLICEs, and 256 MHz for 101 SLICEs with the parameters q of 3329, 7681, and 12,289, respectively. In addition, the N<jats:sup>\u22121<\/jats:sup> multiplier in the Inverse-NTT (INTT) calculation is also eliminated, reducing the size of the Butterfly Unit (BU) in CRYSTALS-Kyber to about 104 SLICEs, equivalent to a conventional multiplication in the other studies. Based on the proposed DSP, a Point-Wise Matrix Multiplication (PWMM) architecture for CRYSTALS-Kyber is designed on a hardware footprint equivalent to 386 SLICEs. 
Furthermore, this research is the first DSP designed for lattice-based Post-quantum Cryptography (PQC) modular multiplication.<\/jats:p>","DOI":"10.3390\/cryptography7040046","type":"journal-article","created":{"date-parts":[[2023,9,25]],"date-time":"2023-09-25T03:56:49Z","timestamp":1695614209000},"page":"46","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["A High-Efficiency Modular Multiplication Digital Signal Processing for Lattice-Based Post-Quantum Cryptography"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-0952-0534","authenticated-orcid":false,"given":"Trong-Hung","family":"Nguyen","sequence":"first","affiliation":[{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), 1-5-1 Chofugaoka, Tokyo 182-8585, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5255-4919","authenticated-orcid":false,"given":"Cong-Kha","family":"Pham","sequence":"additional","affiliation":[{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), 1-5-1 Chofugaoka, Tokyo 182-8585, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4078-0836","authenticated-orcid":false,"given":"Trong-Thuc","family":"Hoang","sequence":"additional","affiliation":[{"name":"Department of Computer and Network Engineering, University of Electro-Communications (UEC), 1-5-1 Chofugaoka, Tokyo 182-8585, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Alagic, G., Apon, D., Cooper, D., Dang, Q., Dang, T., Kelsey, J., Lichtinger, J., Liu, Y.K., Miller, C., and Moody, D. (2022). 
Status Report on the Third Round of the NIST Post-Quantum Cryptography Standardization Process.","DOI":"10.6028\/NIST.IR.8413"},{"key":"ref_2","unstructured":"Avanzi, R., Bos, J., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V., Schanck, J.M., Schwabe, P., Seiler, G., and Stehl\u00e9, D. (2023, September 15). CRYSTALS-Kyber: Algorithm Specifications And Supporting Documentation (Version 3.01). January 2021. Available online: https:\/\/pq-crystals.org\/kyber\/data\/kyber-specification-round3-20210131.pdf."},{"key":"ref_3","unstructured":"Bai, S., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V., Schwabe, P., Seiler, G., and Stehl\u00e9, D. (2023, September 15). CRYSTALS-Dilithium: Algorithm Specifications And Supporting Documentation (Version 3.01). February 2021. Available online: https:\/\/pq-crystals.org\/dilithium\/data\/dilithium-specification-round3-20210208.pdf."},{"key":"ref_4","unstructured":"Fouque, P.-A., Hoffstein, J., Kirchner, P., Lyubashevsky, V., Pornin, T., Prest, T., Ricosset, T., Seiler, G., Whyte, W., and Zhang, Z. (2023, September 15). Falcon: Fast-Fourier Lattice-Based Compact Signatures over NTRU (v1.2). October 2020. Available online: https:\/\/falcon-sign.info\/falcon.pdf."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1090\/S0025-5718-1985-0777282-X","article-title":"Modular Multiplication Without Trial Division","volume":"44","author":"Montgomery","year":"1985","journal-title":"Math. Comput."},{"key":"ref_6","unstructured":"Barrett, P. (1987, January 16\u201320). Implementing the Rivest Shamir and Adleman Public Key Encryption Algorithm on a Standard Digital Signal Processor. 
Proceedings of the Advances in Crypto (CRYPTO), Santa Barbara, CA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"70288","DOI":"10.1109\/ACCESS.2023.3294446","article-title":"Conceptual Review on Number Theoretic Transform and Comprehensive Review on Its Implementations","volume":"11","author":"Satriawan","year":"2023","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Liu, Z., Seo, H., Sinha Roy, S., Gro\u00dfsch\u00e4dl, J., Kim, H., and Verbauwhede, I. (2015, January 13\u201316). Efficient Ring-LWE Encryption on 8-Bit AVR Processors. Proceedings of the Cryptographic Hardware and Embedded Systems (CHES), Saint-Malo, France.","DOI":"10.1007\/978-3-662-48324-4_33"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2332","DOI":"10.1109\/TVLSI.2017.2697841","article-title":"High-Throughput Ring-LWE Cryptoprocessors","volume":"25","year":"2017","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2459","DOI":"10.1109\/TVLSI.2019.2922999","article-title":"Optimized Schoolbook Polynomial Multiplication for Compact Lattice-Based Cryptography on FPGA","volume":"27","author":"Liu","year":"2019","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TETC.2022.3144101","article-title":"Ultra High-Speed Polynomial Multiplications for Lattice-Based Cryptography on FPGAs","volume":"10","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Emerg. Top. Comp."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1506","DOI":"10.1109\/TETC.2021.3073475","article-title":"Efficient Word Size Modular Arithmetic","volume":"9","author":"Plantard","year":"2021","journal-title":"IEEE Trans. Emerg. Top. 
Comp."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"614","DOI":"10.46586\/tches.v2022.i4.614-636","article-title":"Improved Plantard Arithmetic for Lattice-based Cryptography","volume":"2022","author":"Huang","year":"2022","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"17","DOI":"10.46586\/tches.v2019.i4.17-61","article-title":"Sapphire: A Configurable Crypto-Processor for Post-Quantum Lattice-based Protocols","volume":"2019","author":"Banerjee","year":"2019","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Yaman, F., Mert, A.C., \u00d6zt\u00fcrk, E., and Sava\u015f, E. (2021, January 1\u20135). A Hardware Accelerator for Polynomial Multiplication Operation of CRYSTALS-KYBER PQC Scheme. Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France.","DOI":"10.23919\/DATE51398.2021.9474139"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, C., Liu, D., Liu, X., Zou, X., Niu, G., Liu, B., and Jiang, Q. (2021, January 22\u201328). Towards Efficient Hardware Implementation of NTT for Kyber on FPGAs. Proceedings of the 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea.","DOI":"10.1109\/ISCAS51556.2021.9401170"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Longa, P., and Naehrig, M. (2016, January 14\u201316). Speeding up the Number Theoretic Transform for Faster Ideal Lattice-Based Cryptography. Proceedings of the International Conference on Cryptology and Network Security (CANS), Milan, Italy.","DOI":"10.1007\/978-3-319-48965-0_8"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Bisheh-Niasar, M., Azarderakhsh, R., and Mozaffari-Kermani, M. (2021, January 14\u201316). High-Speed NTT-based Polynomial Multiplication Accelerator for Post-Quantum Cryptography. 
Proceedings of the IEEE Symposium on Computer Arithmetic (ARITH), Lyngby, Denmark.","DOI":"10.1109\/ARITH51176.2021.00028"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lil, M., Tian, J., Hu, X., Cao, Y., and Wang, Z. (2022, January 11\u201313). High-Speed and Low-Complexity Modular Reduction Design for CRYSTALS-Kyber. Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), Shenzhen, China.","DOI":"10.1109\/APCCAS55924.2022.10090253"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"44446","DOI":"10.1109\/ACCESS.2022.3169784","article-title":"Accelerating Falcon on ARMv8","volume":"10","author":"Kim","year":"2022","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Alagic, G., Alperin-Sheriff, J., Apon, D., Cooper, D., Dang, Q., Liu, Y.-K., Miller, C., Moody, D., and Peralta, R. (2019). Status Report on the First Round of the NIST Post-Quantum Cryptography Standardization Process.","DOI":"10.6028\/NIST.IR.8240"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Alagic, G., Alperin-Sheriff, J., Apon, D., Cooper, D., Dang, Q., Kelsey, J., Liu, Y.-K., Miller, C., Moody, D., and Peralta, R. (2020). Status Report on the Second Round of the NIST Post-Quantum Cryptography Standardization Process.","DOI":"10.6028\/NIST.IR.8240"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"180","DOI":"10.46586\/tches.v2019.i3.180-201","article-title":"NTTRU: Truly Fast NTRU Using NTT","volume":"2019","author":"Lyubashevsky","year":"2019","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_24","unstructured":"Lyubashevsky, V., Micciancio, D., Peikert, C., and Rosen, A. (2008, January 10\u201313). SWIFFT: A Modest Proposal for FFT Hashing. Proceedings of the Fast Software Encryption (FSE), Lausanne, Switzerland."},{"key":"ref_25","unstructured":"Seiler, G. (2023, September 15). 
Faster AVX2 Optimized NTT Multiplication for Ring-LWE Lattice Cryptography. Cryptology ePrint Archive, Paper 2018\/039. Available online: https:\/\/eprint.iacr.org\/2018\/039."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"159","DOI":"10.46586\/tches.v2021.i2.159-188","article-title":"NTT Multiplication for NTT-unfriendly Rings: New Speed Records for Saber and NTRU on Cortex-M4 and AVX2","volume":"2021","author":"Chung","year":"2021","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1109\/TC.2022.3222954","article-title":"High-Speed Hardware Architectures and FPGA Benchmarking of CRYSTALS-Kyber, NTRU, and Saber","volume":"72","author":"Dang","year":"2023","journal-title":"IEEE Trans. Comp."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.1109\/TC.2010.93","article-title":"Faster Interleaved Modular Multiplication Based on Barrett and Montgomery Reduction Methods","volume":"59","author":"Knezevic","year":"2010","journal-title":"IEEE Trans. Comp."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Botros, L., Kannwischer, M.J., and Schwabe, P. (2023, September 15). Memory-Efficient High-Speed Implementation of Kyber on Cortex-M4. Cryptology ePrint Archive, Paper 2019\/489. Available online: https:\/\/eprint.iacr.org\/2019\/489.","DOI":"10.1007\/978-3-030-23696-0_11"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"336","DOI":"10.46586\/tches.v2020.i3.336-357","article-title":"Cortex-M4 optimizations for R,M LWE schemes","volume":"2020","author":"Alkim","year":"2020","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Abdulrahman, A., Hwang, V., Kannwischer, M.J., and Sprenkels, A. (2023, September 15). Faster Kyber and Dilithium on the Cortex-M4. Cryptology ePrint Archive, Paper 2022\/112. 
Available online: https:\/\/eprint.iacr.org\/2022\/112.","DOI":"10.1007\/978-3-031-09234-3_42"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1109\/TCSI.2022.3219555","article-title":"KaLi: A Crystal for Post-Quantum Security Using Kyber and Dilithium","volume":"70","author":"Aikata","year":"2023","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_33","first-page":"1562","article-title":"An Efficient Implementation of KYBER","volume":"69","author":"Guo","year":"2022","journal-title":"IEEE Trans. Circ. Syst. II Express Briefs (TCAS-II)"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Guo, W., and Li, S. (2023). Highly-Efficient Hardware Architecture for CRYSTALS-Kyber with a Novel Conflict-Free Memory Access Pattern. IEEE Trans. Circuits Syst. I Regul. Pap., 1\u201311.","DOI":"10.1109\/TCSI.2023.3306347"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2540","DOI":"10.1109\/TCAD.2022.3230359","article-title":"Reconfigurable and High-Efficiency Polynomial Multiplication Accelerator for CRYSTALS-Kyber","volume":"42","author":"Li","year":"2023","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ni, Z., Khalid, A., Liu, W., and O\u2019Neill, M. (2023, January 23). Towards a Lightweight CRYSTALS-Kyber in FPGAs: An Ultra-lightweight BRAM-free NTT Core. Proceedings of the 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA.","DOI":"10.1109\/ISCAS46773.2023.10181340"},{"key":"ref_37","first-page":"4068","article-title":"PipeNTT: A Pipelined Number Theoretic Transform Architecture","volume":"69","author":"Ye","year":"2022","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Bansal, Y., Madhu, C., and Kaur, P. (2014, January 6\u20138). High Speed Vedic Multiplier Designs\u2014A Review. 
Proceedings of the Recent Advances in Engineering and Computational Sciences (RAECS), Chandigarh, India.","DOI":"10.1109\/RAECS.2014.6799502"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2690","DOI":"10.1109\/TVLSI.2014.2371857","article-title":"Threshold Logic Computing: Memristive-CMOS Circuits for Fast Fourier Transform and Vedic Multiplication","volume":"23","author":"James","year":"2015","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1109\/TCSI.2020.3046783","article-title":"Symmetric-Mapping LUT-Based Method and Architecture for Computing XY-Like Functions","volume":"68","author":"Chen","year":"2021","journal-title":"IEEE Trans. Circ. Syst. I Regul. Papers (TCAS-I)"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","article-title":"An Algorithm for the Machine Calculation of Complex Fourier Series","volume":"19","author":"Cooley","year":"1965","journal-title":"Math. Comput."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Gentleman, W.M., and Sande, G. (1966, January 7\u201310). Fast Fourier Transforms: For Fun and Profit. Proceedings of the Fall Joint Computer Conference (AFIPS), San Francisco, CA, USA.","DOI":"10.1145\/1464291.1464352"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Roy, S.S., Vercauteren, F., Mentens, N., Chen, D.D., and Verbauwhede, I. (2014, January 23\u201326). Compact Ring-LWE Cryptoprocessor. Proceedings of the Cryptographic Hardware and Embedded Systems (CHES), Busan, Republic of Korea.","DOI":"10.1007\/978-3-662-44709-3_21"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"P\u00f6ppelmann, T., Oder, T., and G\u00fcneysu, T. (2015, January 23\u201326). High-Performance Ideal Lattice-Based Cryptography on 8-Bit ATxmega Microcontrollers. 
Proceedings of the Progress in Cryptology (LATINCRYPT), Guadalajara, Mexico.","DOI":"10.1007\/978-3-319-22174-8_19"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"49","DOI":"10.46586\/tches.v2020.i2.49-72","article-title":"Highly efficient architecture of NewHope-NIST on FPGA using low-complexity NTT\/INTT","volume":"2020","author":"Zhang","year":"2020","journal-title":"IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES)"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"12732","DOI":"10.1109\/ACCESS.2022.3145988","article-title":"Configurable Mixed-Radix Number Theoretic Transform Architecture for Lattice-Based Cryptography","volume":"10","author":"Lee","year":"2022","journal-title":"IEEE Access"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ni, Z., Khalid, A., O\u2019Neill, M., and Liu, W. (2023). HPKA: A High-Performance CRYSTALS-Kyber Accelerator Exploring Efficient Pipelining. IEEE Trans. Comp., 1\u201314.","DOI":"10.1109\/TC.2023.3296899"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Xing, Y., and Li, S. (2021). A Compact Hardware Implementation of CCA-secure Key Exchange Mechanism CRYSTALS-Kyber on FPGA. IACR Trans. Cryptogr. Hardw. Embed. Syst. (TCHES), 328\u2013356.","DOI":"10.46586\/tches.v2021.i2.328-356"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Yao, K., Wang, C., O\u2019Neill, M., and Liu, W. (2021, January 22\u201328). Towards CRYSTALS-Kyber: A M-LWE Cryptoprocessor with Area-Time Trade-Off. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea.","DOI":"10.1109\/ISCAS51556.2021.9401253"},{"key":"ref_50","unstructured":"Itabashi, Y., Ueno, R., and Homma, N. (September, January 31). Efficient Modular Polynomial Multiplier for NTT Accelerator of Crystals-Kyber. 
Proceedings of the Euromicro Conference on Digital System Design (DSD), Maspalomas, Spain."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"150798","DOI":"10.1109\/ACCESS.2021.3126208","article-title":"A RISC-V Post Quantum Cryptography Instruction Set Extension for Number Theoretic Transform to Speed-Up CRYSTALS Algorithms","volume":"9","author":"Nannipieri","year":"2021","journal-title":"IEEE Access"}],"container-title":["Cryptography"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2410-387X\/7\/4\/46\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:57:30Z","timestamp":1760129850000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2410-387X\/7\/4\/46"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,25]]},"references-count":51,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["cryptography7040046"],"URL":"https:\/\/doi.org\/10.3390\/cryptography7040046","relation":{},"ISSN":["2410-387X"],"issn-type":[{"value":"2410-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,25]]}}}