{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,19]],"date-time":"2026-06-19T16:28:24Z","timestamp":1781886504051,"version":"3.54.5"},"reference-count":19,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2022,6,7]],"date-time":"2022-06-07T00:00:00Z","timestamp":1654560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Convolution Neural Networks (CNNs) are gaining ground in deep learning and Artificial Intelligence (AI) domains, and they can benefit from rapid prototyping in order to produce efficient and low-power hardware designs. The inference process of a Deep Neural Network (DNN) is considered a computationally intensive process that requires hardware accelerators to operate in real-world scenarios due to the low latency requirements of real-time applications. As a result, High-Level Synthesis (HLS) tools are gaining popularity since they provide attractive ways to reduce design time complexity directly in register transfer level (RTL). In this paper, we implement a MobileNetV2 model using a state-of-the-art HLS tool in order to conduct a design space exploration and to provide insights on complex hardware designs which are tailored for DNN inference. Our goal is to combine design methodologies with sparsification techniques to produce hardware accelerators that achieve comparable error metrics within the same order of magnitude with the corresponding state-of-the-art systems while also significantly reducing the inference latency and resource utilization. Toward this end, we apply sparse matrix techniques on a MobileNetV2 model for efficient data representation, and we evaluate our designs in two different weight pruning approaches. Experimental results are evaluated with respect to the CIFAR-10 data set using several different design methodologies in order to fully explore their effects on the performance of the model under examination.<\/jats:p>","DOI":"10.3390\/s22124318","type":"journal-article","created":{"date-parts":[[2022,6,13]],"date-time":"2022-06-13T02:01:44Z","timestamp":1655085704000},"page":"4318","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Design Space Exploration of a Sparse MobileNetV2 Using High-Level Synthesis and Sparse Matrix Techniques on FPGAs"],"prefix":"10.3390","volume":"22","author":[{"given":"Antonios","family":"Tragoudaras","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9099-8238","authenticated-orcid":false,"given":"Pavlos","family":"Stoikos","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Konstantinos","family":"Fanaras","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0730-0076","authenticated-orcid":false,"given":"Athanasios","family":"Tziouvaras","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2867-9604","authenticated-orcid":false,"given":"George","family":"Floros","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Georgios","family":"Dimitriou","sequence":"additional","affiliation":[{"name":"Department of Informatics and Telecommunications, University of Thessaly, 35 100 Lamia, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kostas","family":"Kolomvatsos","sequence":"additional","affiliation":[{"name":"Department of Informatics and Telecommunications, University of Thessaly, 35 100 Lamia, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Georgios","family":"Stamoulis","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Thessaly, 383 34 Volos, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2022,6,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1145\/2656207","article-title":"Enhancing Design Space Exploration by Extending CPU\/GPU Specifications onto FPGAs","volume":"14","author":"Owaida","year":"2015","journal-title":"ACM Trans. Embed. Comput. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liang, T., Glossner, J., Wang, L., Shi, S., and Zhang, X. (2021). Pruning and Quantization for Deep Neural Network Acceleration: A Survey. arXiv.","DOI":"10.1016\/j.neucom.2021.07.045"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1109\/MM.2018.032271057","article-title":"Motivation for and Evaluation of the First Tensor Processing Unit","volume":"38","author":"Jouppi","year":"2018","journal-title":"IEEE Micro"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Setiawan, E., and Adiono, T. (2018, January 12\u201315). Implementation of Systolic Co-processor for Deep Neural Network Inference based on SoC. Proceedings of the 2018 International SoC Design Conference (ISOCC), Daegu, Korea.","DOI":"10.1109\/ISOCC.2018.8649920"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Jha, N.K., Ravishankar, S., Mittal, S., Kaushik, A., Mandal, D., and Chandra, M. (2020, January 6\u20138). DRACO: Co-Optimizing Hardware Utilization, and Performance of DNNs on Systolic Accelerator. Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Limassol, Cyprus.","DOI":"10.1109\/ISVLSI49217.2020.00088"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1109\/MM.2019.2930057","article-title":"ERIDANUS: Efficiently Running Inference of DNNs Using Systolic Arrays","volume":"39","author":"Asgari","year":"2019","journal-title":"IEEE Micro"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TCAD.2020.3031240","article-title":"Enhancing the Utilization of Processing Elements in Spatial Deep Neural Network Accelerators","volume":"40","author":"Asadikouhanjani","year":"2021","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1109\/JETCAS.2019.2910232","article-title":"Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices","volume":"9","author":"Chen","year":"2019","journal-title":"IEEE J. Emerg. Sel. Top. Circuits Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1145\/3131289","article-title":"PLACID: A Platform for FPGA-Based Accelerator Creation for DCNNs","volume":"13","author":"Motamedi","year":"2017","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_10","first-page":"1217","article-title":"A Resource-Limited Hardware Accelerator for Convolutional Neural Networks in Embedded Vision Applications","volume":"64","author":"Moini","year":"2017","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1109\/LES.2017.2743247","article-title":"Tactics to Directly Map CNN Graphs on Embedded FPGAs","volume":"9","author":"Abdelouahab","year":"2017","journal-title":"IEEE Embed. Syst. Lett."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Umuroglu, Y., Fraser, N.J., Gambardella, G., Blott, M., Leong, P., Jahre, M., and Vissers, K. (2017, January 22\u201324). FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA.","DOI":"10.1145\/3020078.3021744"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhao, R., Song, W., Zhang, W., Xing, T., Lin, J., Srivastava, M., Gupta, R., and Zhang, Z. (2017, January 22\u201324). Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs. Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA.","DOI":"10.1145\/3020078.3021741"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Rahman, A., Oh, S., Lee, J., and Choi, K. (2017, January 27\u201331). Design space exploration of FPGA accelerators for convolutional neural networks. Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, Switzerland.","DOI":"10.23919\/DATE.2017.7927162"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Motamedi, M., Gysel, P., Akella, V., and Ghiasi, S. (2016, January 25\u201328). Design space exploration of FPGA-based Deep Convolutional Neural Networks. Proceedings of the 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), Macao, China.","DOI":"10.1109\/ASPDAC.2016.7428073"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L. (2018, January 18\u201323). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_17","unstructured":"Luo, C., He, X., Zhan, J., Wang, L., Gao, W., and Dai, J. (2020). Comparison and Benchmarking of AI Models and Frameworks on Mobile Devices. arXiv."},{"key":"ref_18","unstructured":"Zhang, T., Zhang, K., Ye, S., Li, J., Tang, J., Wen, W., Lin, X., Fardad, M., and Wang, Y. (2018). Adam-admm: A unified, systematic framework of structured weight pruning for dnns. arXiv."},{"key":"ref_19","unstructured":"Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images, University of Toronto. Technical Report TR-2009."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/12\/4318\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:25:25Z","timestamp":1760138725000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/12\/4318"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,7]]},"references-count":19,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2022,6]]}},"alternative-id":["s22124318"],"URL":"https:\/\/doi.org\/10.3390\/s22124318","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,7]]}}}