{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T13:21:21Z","timestamp":1753881681527,"version":"3.41.2"},"reference-count":27,"publisher":"World Scientific Pub Co Pte Ltd","issue":"04","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J CIRCUIT SYST COMP"],"published-print":{"date-parts":[[2025,3,15]]},"abstract":"<jats:p> Modular Multiplication (MM) plays a crucial role in public key cryptography and random numbers generation. Montgomery Modular Multiplication (MMM) Algorithm is the most popular method used for performing this operation. Currently, there are various Field Programmable Gate Array (FPGA)-based optimizations targeting the MMM implementation. However, many of these works suffer from complex critical path where multiplications and carry propagate additions are calculated at each clock cycle. This leads to a low computation throughput and limited frequency. This paper presents an efficient FPGA implementation of the MMM algorithm using compressors-4:2 (Comp-4:2s). Our objective is to realize a Hardware (HW) architecture which presents a best trade-off between computation throughput and occupied area. The execution performances of the MMM depend on both parameters: the radix-r and the modulus size. In fact, when the radix increases, the algorithm requires multiplications of form digits[Formula: see text][Formula: see text][Formula: see text]operands. On the other hand, when a long modulus is used, the HW implementation of the MMM needs long carry propagation paths. In this work we propose an approach to circumvent the required multiplications by decomposing the digits into coefficients powers of 2. Then, the multiplications are carried out by simple shifts and additions. The Carry Save Adders (CSAs) are employed to avoid the use of long carry chains. However, the combination of CSAs and the used radixes requires four-input adders. Our implementation is based on using the Comp-4:2s for the realization of regular CSAs structures. The challenge consists in the development of an optimized Comp-4:2 which must be completely integrated in single FPGA Slice. Our implementation solution allows the achievement of high computation throughput. The obtained critical path delay is independent of the operand size. The implementation on Virtex-7 circuit using a modulus of 1024-bit size shows that the MMM run in 1.31 and requires 2342 Slices. <\/jats:p>","DOI":"10.1142\/s0218126625500926","type":"journal-article","created":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T07:36:12Z","timestamp":1726817772000},"source":"Crossref","is-referenced-by-count":0,"title":["Hardware Implementation of Montgomery Modular Multiplication Based on New Compressor-4:2 Design"],"prefix":"10.1142","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8896-7827","authenticated-orcid":false,"given":"M.","family":"Issad","sequence":"first","affiliation":[{"name":"Department of System and Multimedia Architecture, Centre de D\u00e9veloppement des Technologies Avanc\u00e9es, BP 17 Cit\u00e9 20 Ao\u00fbt 1956, Baba Hassen, 16081 Alger, Alg\u00e9rie"}]},{"given":"M.","family":"Anane","sequence":"additional","affiliation":[{"name":"Ecole Sup\u00e9rieure d\u2019Informatique, BP 68 M Oued Smar, El Harrach, 16270 Alger, Alg\u00e9rie"}]},{"given":"B.","family":"Boudraa","sequence":"additional","affiliation":[{"name":"Faculty of Electronic and Informatics, Universit\u00e9 des Sciences et de la Technologie, Houari Boumediene, BP 32 El Alia, Bab Ezzouar, 16111 Alger, Alg\u00e9rie"}]}],"member":"219","published-online":{"date-parts":[[2024,11,16]]},"reference":[{"key":"S0218126625500926BIB001","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1976.1055638"},{"key":"S0218126625500926BIB002","doi-asserted-by":"publisher","DOI":"10.1145\/359340.359342"},{"key":"S0218126625500926BIB003","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04101-3"},{"key":"S0218126625500926BIB004","doi-asserted-by":"publisher","DOI":"10.1137\/0215025"},{"volume-title":"Synthesis of Arithmetic Circuits FPGA, ASIC and Embedded Systems","year":"2006","author":"Deschamps J. P.","key":"S0218126625500926BIB005"},{"key":"S0218126625500926BIB006","doi-asserted-by":"publisher","DOI":"10.1109\/12.565590"},{"key":"S0218126625500926BIB007","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2008.2000363"},{"key":"S0218126625500926BIB008","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-1985-0777282-X"},{"key":"S0218126625500926BIB009","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2017.2652979"},{"key":"S0218126625500926BIB010","first-page":"19","volume":"14","author":"Pajuelo-Holguera F.","year":"2022","journal-title":"IEEE ESL"},{"key":"S0218126625500926BIB011","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126619502293"},{"key":"S0218126625500926BIB012","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2014.2355854"},{"key":"S0218126625500926BIB013","doi-asserted-by":"publisher","DOI":"10.1109\/TIE.2010.2080653"},{"key":"S0218126625500926BIB014","doi-asserted-by":"publisher","DOI":"10.1109\/40.502403"},{"key":"S0218126625500926BIB015","doi-asserted-by":"publisher","DOI":"10.1109\/DSD.2016.70"},{"key":"S0218126625500926BIB016","doi-asserted-by":"publisher","DOI":"10.1142\/S0218126622501377"},{"key":"S0218126625500926BIB017","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45760-7_3"},{"key":"S0218126625500926BIB018","doi-asserted-by":"publisher","DOI":"10.1109\/ACSSC.2001.986892"},{"key":"S0218126625500926BIB019","doi-asserted-by":"publisher","DOI":"10.1109\/SPIN.2017.8049905"},{"key":"S0218126625500926BIB021","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2011.49"},{"key":"S0218126625500926BIB022","doi-asserted-by":"publisher","DOI":"10.1109\/iCCECE55162.2022.9875097"},{"key":"S0218126625500926BIB023","first-page":"2137","volume":"68","author":"Abd-Elkader A. A. H.","year":"2021","journal-title":"IEEE Trans. Circuits Syst. \u2014 II: Exp. Briefs"},{"key":"S0218126625500926BIB024","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2988379"},{"key":"S0218126625500926BIB025","doi-asserted-by":"publisher","DOI":"10.1016\/j.mejo.2020.104927"},{"key":"S0218126625500926BIB026","doi-asserted-by":"publisher","DOI":"10.1155\/2011\/127147"},{"key":"S0218126625500926BIB027","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2024.103142"},{"key":"S0218126625500926BIB028","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3282781"}],"container-title":["Journal of Circuits, Systems and Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218126625500926","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T07:28:29Z","timestamp":1742369309000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218126625500926"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,16]]},"references-count":27,"journal-issue":{"issue":"04","published-print":{"date-parts":[[2025,3,15]]}},"alternative-id":["10.1142\/S0218126625500926"],"URL":"https:\/\/doi.org\/10.1142\/s0218126625500926","relation":{},"ISSN":["0218-1266","1793-6454"],"issn-type":[{"type":"print","value":"0218-1266"},{"type":"electronic","value":"1793-6454"}],"subject":[],"published":{"date-parts":[[2024,11,16]]},"article-number":"2550092"}}