{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T03:05:15Z","timestamp":1760151915801,"version":"build-2065373602"},"reference-count":29,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T00:00:00Z","timestamp":1669766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia (FCT)","doi-asserted-by":"publisher","award":["UIDB\/50021\/2020","IPL\/2022\/eS2ST_ISEL"],"award-info":[{"award-number":["UIDB\/50021\/2020","IPL\/2022\/eS2ST_ISEL"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Instituto Polit\u00e9cnico de Lisboa","award":["UIDB\/50021\/2020","IPL\/2022\/eS2ST_ISEL"],"award-info":[{"award-number":["UIDB\/50021\/2020","IPL\/2022\/eS2ST_ISEL"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Electronics"],"abstract":"<jats:p>Binary convolutional neural networks (BCNN) have shown good accuracy for small to medium neural network models. Their extreme quantization of weights and activations reduces off-chip data transfer and greatly reduces the computational complexity of convolutions. Further reduction in the complexity of a BCNN model for fast execution can be achieved with model size reduction at the cost of network accuracy. In this paper, a multi-model inference technique is proposed to reduce the execution time of the binarized inference process without accuracy reduction. The technique considers a cascade of neural network models with different computation\/accuracy ratios. A parameterizable binarized neural network with different trade-offs between complexity and accuracy is used to obtain multiple network models. 
We also propose a hardware accelerator to run multi-model inference in embedded systems. The multi-model inference accelerator is demonstrated on low-density Zynq-7010 and Zynq-7020 FPGA devices, classifying images from the CIFAR-10 dataset. The proposed accelerator improves the frame rate per LUT by 7.2\u00d7 over previous solutions on a Zynq-7020 FPGA with similar accuracy. This shows the effectiveness of the multi-model inference technique and the efficiency of the proposed hardware accelerator.<\/jats:p>","DOI":"10.3390\/electronics11233966","type":"journal-article","created":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T04:32:53Z","timestamp":1669782773000},"page":"3966","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Multi-Model Inference Accelerator for Binary Convolutional Neural Networks"],"prefix":"10.3390","volume":"11","author":[{"given":"Andr\u00e9 L.","family":"de Sousa","sequence":"first","affiliation":[{"name":"INESC-ID, Instituto Superior de Engenharia de Lisboa, Instituto Polit\u00e9cnico de Lisboa, 1959-007 Lisbon, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8556-4507","authenticated-orcid":false,"given":"M\u00e1rio P.","family":"V\u00e9stias","sequence":"additional","affiliation":[{"name":"INESC-ID, Instituto Superior T\u00e9cnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3621-8322","authenticated-orcid":false,"given":"Hor\u00e1cio C.","family":"Neto","sequence":"additional","affiliation":[{"name":"INESC-ID, Instituto Superior de Engenharia de Lisboa, Instituto Polit\u00e9cnico de Lisboa, 1959-007 Lisbon, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Sainath, T.N., Mohamed, A., Kingsbury, B., and Ramabhadran, B. (2013, January 26\u201331). 
Deep convolutional neural networks for LVCSR. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, USA.","DOI":"10.1109\/ICASSP.2013.6639347"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Do, T., Duong, M., Dang, Q., and Le, M. (2018, January 23\u201324). Real-Time Self-Driving Car Navigation Using Deep Neural Network. Proceedings of the 2018 4th International Conference on Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, Vietnam.","DOI":"10.1109\/GTSD.2018.8595590"},{"key":"ref_3","unstructured":"Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013). Maxout networks. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3212","DOI":"10.1109\/TNNLS.2018.2876865","article-title":"Object Detection with Deep Learning: A Review","volume":"30","author":"Zhao","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_5","unstructured":"Courbariaux, M., and Bengio, Y. (2016). BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or \u22121. arXiv."},{"key":"ref_6","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images, University of Toronto. Technical Report."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Simons, T., and Lee, D.J. (2019). A Review of Binarized Neural Networks. Electronics, 8.","DOI":"10.3390\/electronics8060661"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"107281","DOI":"10.1016\/j.patcog.2020.107281","article-title":"Binary neural networks: A survey","volume":"105","author":"Qin","year":"2020","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Umuroglu, Y., Fraser, N.J., Gambardella, G., Blott, M., Leong, P.H.W., Jahre, M., and Vissers, K.A. (2016). FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. 
arXiv.","DOI":"10.1145\/3020078.3021744"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Blott, M., Preu\u00dfer, T.B., Fraser, N.J., Gambardella, G., O\u2019brien, K., Umuroglu, Y., Leeser, M., and Vissers, K. (2018). FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks. ACM Trans. Reconfigurable Technol. Syst., 11.","DOI":"10.1145\/3242897"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Fraser, N.J., Umuroglu, Y., Gambardella, G., Blott, M., Leong, P., Jahre, M., and Vissers, K. (2017, January 25). Scaling Binarized Neural Networks on Reconfigurable Logic. Proceedings of the PARMA-DITAM\u201917, Stockholm, Sweden.","DOI":"10.1145\/3029580.3029586"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Nakahara, H., Fujii, T., and Sato, S. (2017, January 4\u20138). A fully connected layer elimination for a binarized convolutional neural network on an FPGA. Proceedings of the 2017 27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium.","DOI":"10.23919\/FPL.2017.8056771"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhao, R., Song, W., Zhang, W., Xing, T., Lin, J.H., Srivastava, M., Gupta, R., and Zhang, Z. (2017, January 22\u201324). Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs. Proceedings of the 2017 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA\u201917, Monterey, CA, USA.","DOI":"10.1145\/3020078.3021741"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Guo, P., Ma, H., Chen, R., Li, P., Xie, S., and Wang, D. (2018, January 26\u201330). FBNA: A Fully Binarized Neural Network Accelerator. 
Proceedings of the 2018 28th International Conference on Field Programmable Logic and Applications (FPL), Dublin, Ireland.","DOI":"10.1109\/FPL.2018.00016"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Fu, C., Zhu, S., Su, H., Lee, C.E., and Zhao, J. (2019, January 24\u201326). Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA. Proceedings of the 2019 ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA\u201919, Seaside, CA, USA.","DOI":"10.1145\/3289602.3293990"},{"key":"ref_16","first-page":"451","article-title":"A Resource-Efficient Inference Accelerator for Binary Convolutional Neural Networks","volume":"68","author":"Kim","year":"2021","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_17","unstructured":"Xie, X., Jones, M.W., and Tam, G.K.L. (2015). Real-Time Pedestrian Detection with Deep Network Cascades. Proceedings of the British Machine Vision Conference (BMVC), BMVA Press."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Diba, A., Sharma, V., Pazandeh, A., Pirsiavash, H., and Van Gool, L. (2017, January 21\u201326). Weakly Supervised Cascaded Convolutional Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.545"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kong, S., Park, J., Lee, S., and Jang, S. (2019, January 15\u201318). Lightweight Traffic Sign Recognition Algorithm based on Cascaded CNN. Proceedings of the 2019 19th International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea.","DOI":"10.23919\/ICCAS47443.2019.8971735"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1023\/B:VISI.0000013087.49260.fb","article-title":"Robust Real-Time Face Detection","volume":"57","author":"Viola","year":"2004","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kouris, A., Venieris, S.I., and Bouganis, C. (2018, January 26\u201330). CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks. Proceedings of the 2018 28th International Conference on Field Programmable Logic and Applications (FPL), Dublin, Ireland.","DOI":"10.1109\/FPL.2018.00034"},{"key":"ref_22","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv."},{"key":"ref_23","unstructured":"Abdelouahab, K., Pelcat, M., S\u00e9rot, J., and Berry, F. (2018). Accelerating CNN inference on FPGAs: A Survey. arXiv."},{"key":"ref_24","unstructured":"Courbariaux, M., Bengio, Y., and David, J. (2015). BinaryConnect: Training Deep Neural Networks with binary weights during propagations. arXiv."},{"key":"ref_25","unstructured":"Viola, P., and Jones, M. (2001, January 7\u201314). Robust real-time face detection. Proceedings of the Eighth IEEE International Conference on Computer Vision, ICCV 2001, Vancouver, BC, Canada."},{"key":"ref_26","unstructured":"Tornetta, G.N. (2021). Entropy methods for the confidence assessment of probabilistic classification models. arXiv."},{"key":"ref_27","unstructured":"Pappalardo, A. (2022, October 10). Xilinx\/Brevitas. Available online: https:\/\/doi.org\/10.5281\/zenodo.3333552."},{"key":"ref_28","first-page":"1","article-title":"A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks","volume":"14","author":"Li","year":"2018","journal-title":"J. Emerg. Technol. Comput. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition. 
arXiv.","DOI":"10.1109\/CVPR.2016.90"}],"container-title":["Electronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-9292\/11\/23\/3966\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:30:09Z","timestamp":1760146209000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-9292\/11\/23\/3966"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,30]]},"references-count":29,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["electronics11233966"],"URL":"https:\/\/doi.org\/10.3390\/electronics11233966","relation":{},"ISSN":["2079-9292"],"issn-type":[{"type":"electronic","value":"2079-9292"}],"subject":[],"published":{"date-parts":[[2022,11,30]]}}}