{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T07:20:51Z","timestamp":1777706451927,"version":"3.51.4"},"reference-count":21,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems: Applications in Engineering and Technology"],"published-print":{"date-parts":[[2026,2]]},"abstract":"<jats:p>Today, it is the amount of data that defines the existence of mankind. Scientists respond to the large amount of required calculations by developing hardware in several directions. One of them is to increase the number of arithmetic elements. Another direction is to create new architectures that represent new algorithms for processing numerical data. We have chosen the second direction by developing a new systolic core architecture, which implies an improvement in efficiency, i.e. performing the same task with the same number of arithmetic elements but reducing the latency. Measurements are made in terms of computational capacity and the number of arithmetic elements involved in the operations. The results of the tests are compared with data from a number of selected articles. Today, we have achieved 3.2GFlops with only two modules. 
In the future, we plan to integrate up to four of our cores in a system with its own memory and management processor and at a higher operating frequency.<\/jats:p>","DOI":"10.3233\/jifs-219361","type":"journal-article","created":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T13:58:36Z","timestamp":1712066316000},"page":"432-445","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Systolic Tensor Core for Arithmetics calculations"],"prefix":"10.1177","volume":"50","author":[{"given":"Mario Alfredo","family":"Ibarra Carrillo","sequence":"first","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Laboratorio de Rob\u00f3tica y Mecatr\u00f3nica, M\u00e9xico, Ciudad de M\u00e9xico, M\u00e9xico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jes\u00fas Yalj\u00e1","family":"Montiel P\u00e9rez","sequence":"additional","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Laboratorio de Rob\u00f3tica y Mecatr\u00f3nica, M\u00e9xico, Ciudad de M\u00e9xico, M\u00e9xico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Her\u00f3n","family":"Molina Lozano","sequence":"additional","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Laboratorio de Microtecnolog\u00eda y Sistemas Embebidos, M\u00e9xico, Ciudad de M\u00e9xico, M\u00e9xico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2024,4,2]]},"reference":[{"key":"e_1_3_1_2_1","unstructured":"IEEE Computer Society IEEE Standard for Floating-Point Arithmetic. 
Microprocessor Standards Committee IEEE STd 754TM 2008."},{"key":"e_1_3_1_3_1","unstructured":"Wang ShiboKanwar Pankaj BFloat16: The secret to high performance on Cloud TPUs [on line]:AI & MACHINE LEARNINGGOOGLE CLOUD-TPUS August 23; 2019 [Consulting date March 1; 2022].Available in https:\/\/cloud.google.com\/blog\/products\/ai-machinelearning\/bfloat16-the-secret-to-high-performance-oncloud-tpus"},{"key":"e_1_3_1_4_1","doi-asserted-by":"crossref","unstructured":"Choquette Jack et al. 3.2 The A100 datacenter GPU and Ampere architecture On 2021 IEEE International Solid-State Circuits Conference (ISSCC). IEEE 2021. p. 48-50.","DOI":"10.1109\/ISSCC42613.2021.9365803"},{"key":"e_1_3_1_5_1","doi-asserted-by":"crossref","unstructured":"Fuketa Hiroshi et al. Image-classifier deep convolutional neural network training by 9-bit dedicated hardware to realize validation accuracy and energy efficiency superior to the half precision floating-point format On 2018 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE 2018 1\u20135.","DOI":"10.1109\/ISCAS.2018.8350953"},{"key":"e_1_3_1_6_1","doi-asserted-by":"crossref","unstructured":"Rao PrakashR. et al. Implementation of the standard floating-point MAC using IEEE 754 floating-point adder in: 2018 Second International Conference on Computing Methodologies and Communication (ICCMC). IEEE 2018 717\u2013722.","DOI":"10.1109\/ICCMC.2018.8487626"},{"key":"e_1_3_1_7_1","doi-asserted-by":"crossref","unstructured":"MounikaK.RamanaP.V. FPGA Implementation of Power Efficient Floating Point Fused Multiply-Add Unit. En 2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT). IEEE 2021 258\u2013263.","DOI":"10.1109\/CSNT51715.2021.9509678"},{"key":"e_1_3_1_8_1","doi-asserted-by":"crossref","unstructured":"SuneshN.V.SathishkumarP. Design and implementation of fast floating point multiplier unit. 
En 2015 International Conference on VLSI Systems Architecture Technology and Applications (VLSI-SATA). IEEE 2015 1\u20135.","DOI":"10.1109\/VLSI-SATA.2015.7050478"},{"key":"e_1_3_1_9_1","unstructured":"Liang JianTessier RussellMencer Oskar Floatingpoint unit generation and evaluation for FPGAs in: 11th Annual IEEE Symposium on Field-Programmable Custom ComputingMachines 2003.FCCM2003. IEEE 2003 185\u2013194."},{"key":"e_1_3_1_10_1","unstructured":"MATLAB Multiplier and adder components version 7.10.0 (R2010a) The MathWorks Inc."},{"key":"e_1_3_1_11_1","unstructured":"Ryan Fay et al. Asynthesizable vhdl floating-point package. https:\/\/github.com\/xesscorp\/Floating_Point_Library-JHU 2023. [Online; accessed 11 December 2023]."},{"key":"e_1_3_1_12_1","doi-asserted-by":"crossref","unstructured":"Mousouliotis PanagiotisG.Petrou LoukasP. Squeeze-Jet: high-level synthesis accelerator design for deep convolutional neural networks in: International Symposium on Applied Reconfigurable Computing. Springer Cham 2018 55\u201366.","DOI":"10.1007\/978-3-319-78890-6_5"},{"issue":"1","key":"e_1_3_1_13_1","first-page":"35","article-title":"Angel-eye: A complete design flow for mappingcnn onto embedded fpga","volume":"37","author":"Guo Kaiyuan","year":"2017","unstructured":"Guo Kaiyuan, et al. Angel-eye: A complete design flow for mappingcnn onto embedded fpga, IEEE Transactions on Computer-AidedDesign of Integrated Circuits and Systems37(1) (2017)35\u201347.","journal-title":"IEEE Transactions on Computer-AidedDesign of Integrated Circuits and Systems"},{"key":"e_1_3_1_14_1","doi-asserted-by":"crossref","unstructured":"Gokhale Vinayak et al. 
A 240 g-ops\/s mobile coprocessor for deep neural networks in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2014 682\u2013687.","DOI":"10.1109\/CVPRW.2014.106"},{"key":"e_1_3_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.1982.1653825"},{"key":"e_1_3_1_16_1","doi-asserted-by":"crossref","unstructured":"Wei Xuechao et al. Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs in: Proceedings of the 54th Annual Design Automation Conference 2017 2017 1\u20136.","DOI":"10.1145\/3061639.3062207"},{"key":"e_1_3_1_17_1","doi-asserted-by":"crossref","unstructured":"Sano KentaroIizuka TakanoriYamamoto Satoru Systolic architecture for computational fluid dynamics on FPGAs in: 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2007). IEEE 2007 107\u2013116.","DOI":"10.1109\/FCCM.2007.20"},{"key":"e_1_3_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.363476"},{"issue":"4","key":"e_1_3_1_19_1","first-page":"70","article-title":"Optimizing parallel reduction in CUDA","volume":"2","author":"Harris Mark","year":"2007","unstructured":"Harris Mark, et al. Optimizing parallel reduction in CUDA, Nvidia Developer Technology2(4) (2007)70\u2013.","journal-title":"Nvidia Developer Technology"},{"key":"e_1_3_1_20_1","doi-asserted-by":"crossref","unstructured":"Garg RahulHendren Laurie A portable and highperformance general matrix-multiply (GEMM) library for GPUs and single-chip CPU\/GPU systems in: 2014 22nd Euromicro International Conference on Parallel Distributed and Network-Based Processing. IEEE 2014 672\u2013680.","DOI":"10.1109\/PDP.2014.40"},{"key":"e_1_3_1_21_1","doi-asserted-by":"crossref","unstructured":"Sudrajat Muhammad Rifqi DaffaAdiono TrioSyafalni Infall GEMM-Based Quantized Neural Network FPGA Accelerator Design. En 2019 International Symposium on Electronics and Smart Devices (ISESD). 
IEEE 2019 1\u20135.","DOI":"10.1109\/ISESD.2019.8909538"},{"key":"e_1_3_1_22_1","doi-asserted-by":"crossref","unstructured":"Bomar BruceW.Madisetti VijayK.Williams DouglasB. Finite wordlength effects The Digital Signal Processing Handbook (1998) 3\u20131.","DOI":"10.1201\/9781420046076-c3"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems: Applications in Engineering and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-219361","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-219361","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-219361","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:46:52Z","timestamp":1777456012000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-219361"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,2]]},"references-count":21,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,2]]}},"alternative-id":["10.3233\/JIFS-219361"],"URL":"https:\/\/doi.org\/10.3233\/jifs-219361","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,2]]}}}