{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:15:53Z","timestamp":1760058953102,"version":"build-2065373602"},"reference-count":19,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2025,5,8]],"date-time":"2025-05-08T00:00:00Z","timestamp":1746662400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Convolution plays a significant role in many scientific and technological computations, such as artificial intelligence and signal processing. Convolutional computations consist of many dot-product operations (multiplication\u2013accumulation, or MAC), for which the Winograd algorithm is currently the most widely used method to reduce the number of MACs. The Karatsuba algorithm, since its introduction in the 1960s, has been traditionally used as a fast arithmetic method to perform multiplication between large-bit-width operands. It had not been exploited to accelerate 2D convolution computations before. In this paper, we revisited the Karatsuba algorithm and exploited it to reduce the number of MACs in 2D convolutions. The matrices are first segmented into tiles in a divide-and-conquer method, and the resulting submatrices are overlapped to construct the final output matrix. Our analysis and benchmarks have shown that for convolution operations of the same dimensions, the Karatsuba algorithm requires the same number of multiplications but fewer additions as compared with the Winograd algorithm. A pseudocode implementation is also provided to demonstrate the complexity reduction in Karatsuba-based convolution. FPGA implementation of Karatsuba-based convolution also achieves 33.6% LUTs (Look -up Tables) reduction compared with Winograd-based implementation.<\/jats:p>","DOI":"10.3390\/e27050506","type":"journal-article","created":{"date-parts":[[2025,5,8]],"date-time":"2025-05-08T06:47:15Z","timestamp":1746686835000},"page":"506","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization"],"prefix":"10.3390","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2644-2873","authenticated-orcid":false,"given":"Qi","family":"Wang","sequence":"first","affiliation":[{"name":"The Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada"},{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianghan","family":"Zhu","sequence":"additional","affiliation":[{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Can","family":"He","sequence":"additional","affiliation":[{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shihang","family":"Wang","sequence":"additional","affiliation":[{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9349-2718","authenticated-orcid":false,"given":"Xingbo","family":"Wang","sequence":"additional","affiliation":[{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuan","family":"Ren","sequence":"additional","affiliation":[{"name":"The Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China"},{"name":"The Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4359-3550","authenticated-orcid":false,"given":"Terry Tao","family":"Ye","sequence":"additional","affiliation":[{"name":"School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, China"},{"name":"Institute of Nanoscience and Applications, Southern University of Science and Technology, Shenzhen 518055, China"},{"name":"Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing 518055, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Podili, A., Zhang, C., and Prasanna, V. (2017, January 10\u201312). Fast and efficient implementation of Convolutional Neural Networks on FPGA. Proceedings of the 2017 IEEE 28th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Seattle, WA, USA.","DOI":"10.1109\/ASAP.2017.7995253"},{"key":"ref_2","unstructured":"Mathieu, M., Henaff, M., and LeCun, Y. (2013). Fast training of convolutional networks through ffts. arXiv."},{"key":"ref_3","unstructured":"Vasilache, N., Johnson, J., Mathieu, M., Chintala, S., Piantino, S., and LeCun, Y. (2014). Fast convolutional nets with fbfft: A GPU performance evaluation. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Winograd, S. (1980). Arithmetic Complexity of Computations, SIAM.","DOI":"10.1137\/1.9781611970364"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lavin, A., and Gray, S. (2016, January 27\u201330). Fast algorithms for convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.435"},{"key":"ref_6","first-page":"4479","article-title":"Fast fourier convolution","volume":"33","author":"Chi","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_7","unstructured":"Meng, L., and Brothers, J. (2019). Efficient winograd convolution via integer arithmetic. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lu, S., Chu, J., and Liu, X.T. (2022, January 19\u201323). Im2win: Memory efficient convolution on SIMD architectures. Proceedings of the 2022 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA.","DOI":"10.1109\/HPEC55821.2022.9926408"},{"key":"ref_9","unstructured":"Lu, S., Chu, J., Guo, L., and Liu, X.T. (September, January 28). Im2win: An Efficient Convolution Paradigm on GPU. Proceedings of the European Conference on Parallel Processing, Limassol, Cyprus."},{"key":"ref_10","first-page":"293","article-title":"Multiplication of many-digital numbers by automatic computers","volume":"Volume 145","author":"Karatsuba","year":"1962","journal-title":"Proceedings of the Doklady Akademii Nauk"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1199","DOI":"10.1109\/TCSI.2021.3126797","article-title":"Optimized Interpolation of Four-Term Karatsuba Multiplication and a Method of Avoiding Negative Multiplicands","volume":"69","author":"Gu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_12","first-page":"1091","article-title":"Subquadratic Space-Complexity Digit-Serial Multipliers Over GF(2m) Using Generalized (a, b)-Way Karatsuba Algorithm","volume":"62","author":"Lee","year":"2015","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Heideman, M.T. (1988). Convolution and polynomial multiplication. Multiplicative Complexity, Convolution, and the DFT, Springer.","DOI":"10.1007\/978-1-4612-3912-3"},{"key":"ref_14","unstructured":"Ghidirimschi, N. (2021). Convolution Algorithms for Integer Data Types. [Ph.D. Thesis, University of Groningen]."},{"key":"ref_15","unstructured":"Glorot, X., Bordes, A., and Bengio, Y. (July, January 28). Domain adaptation for large-scale sentiment classification: A deep learning approach. Proceedings of the ICML, Bellevue, WA, USA."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1354","DOI":"10.1109\/TVLSI.2018.2815603","article-title":"Optimizing the convolution operation to accelerate deep neural networks on FPGA","volume":"26","author":"Ma","year":"2018","journal-title":"IEEE Trans. Very Large Scale Integr. (VLSI) Syst."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1137\/0209021","article-title":"On multiplication of polynomials modulo a polynomial","volume":"9","author":"Winograd","year":"1980","journal-title":"SIAM J. Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1109\/TC.2005.49","article-title":"Five, six, and seven-term Karatsuba-like formulae","volume":"54","author":"Montgomery","year":"2005","journal-title":"IEEE Trans. Comput."},{"key":"ref_19","unstructured":"Dyka, Z., and Langend\u00f6rfer, P. (2005, January 7\u201311). Area Efficient Hardware Implementation of Elliptic Curve Cryptography by Iteratively Applying Karatsuba\u2019s Method. Proceedings of the 2005 Design, Automation and Test in Europe Conference and Exposition (DATE 2005), Munich, Germany."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/5\/506\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:29:26Z","timestamp":1760030966000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/5\/506"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,8]]},"references-count":19,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,5]]}},"alternative-id":["e27050506"],"URL":"https:\/\/doi.org\/10.3390\/e27050506","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2025,5,8]]}}}