{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T03:00:01Z","timestamp":1760151601520,"version":"build-2065373602"},"reference-count":28,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project n\u00ba 037902]","award":["Funding Reference: POCI-01-0247-FEDER-037902]"],"award-info":[{"award-number":["Funding Reference: POCI-01-0247-FEDER-037902]"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Due to a point cloud\u2019s sparse nature, a sparse convolution block design is necessary to deal with its particularities. Mechanisms adopted in computer vision have recently explored the advantages of data processing in more energy-efficient hardware, such as the FPGA, as a response to the need to run these algorithms on resource-constrained edge devices. However, implementing it in hardware has not been properly explored, resulting in a small number of studies aimed at analyzing the potential of sparse convolutions and their efficiency on resource-constrained hardware platforms. This article presents the design of a customizable hardware block for the voting convolution. We carried out an in-depth analysis to determine under which conditions the use of the voting scheme is justified instead of dense convolutions. The proposed hardware design achieves an energy consumption about 8.7 times lower than similar works in the literature by ignoring unnecessary arithmetic operations with null weights and leveraging data dependency. Access to data memory was also reduced to the minimum necessary, leading to improvements of around 55% in processing time. To evaluate both the performance and applicability of the proposed solution, the voting convolution was integrated into the well-known PointPillars model, where it achieves improvements between 23.05% and 80.44% without a significant effect on detection performance.<\/jats:p>","DOI":"10.3390\/s22082943","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T22:48:45Z","timestamp":1649803725000},"page":"2943","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Efficient Hardware Design and Implementation of the Voting Scheme-Based Convolution"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1723-421X","authenticated-orcid":false,"given":"Pedro","family":"Pereira","sequence":"first","affiliation":[{"name":"Algoritmi Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4772-8659","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Silva","sequence":"additional","affiliation":[{"name":"Algoritmi Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7075-3364","authenticated-orcid":false,"given":"Ant\u00f3nio","family":"Silva","sequence":"additional","affiliation":[{"name":"Algoritmi Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9736-5812","authenticated-orcid":false,"given":"Duarte","family":"Fernandes","sequence":"additional","affiliation":[{"name":"Algoritmi Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"},{"name":"Associa\u00e7\u00e3o Laborat\u00f3rio Colaborativo em Transforma\u00e7\u00e3o Digital\u2014DTx Colab, 4800-058 Guimar\u00e3es, Portugal"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9929-8705","authenticated-orcid":false,"given":"Rui","family":"Machado","sequence":"additional","affiliation":[{"name":"Algoritmi Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"},{"name":"Associa\u00e7\u00e3o Laborat\u00f3rio Colaborativo em Transforma\u00e7\u00e3o Digital\u2014DTx Colab, 4800-058 Guimar\u00e3es, Portugal"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4338","DOI":"10.1109\/TPAMI.2020.3005434","article-title":"Deep learning for 3d point clouds: A survey","volume":"43","author":"Guo","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1478","DOI":"10.1109\/TNS.2020.2983662","article-title":"Understanding the Impact of Quantization, Accuracy, and Radiation on the Reliability of Convolutional Neural Networks on FPGAs","volume":"67","author":"Libano","year":"2020","journal-title":"IEEE Trans. Nucl. Sci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.inffus.2020.11.002","article-title":"Point-cloud based 3D object detection and classification methods for self-driving applications: A survey and taxonomy","volume":"68","author":"Fernandes","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_5","first-page":"10","article-title":"Voting for voting in online point cloud object detection","volume":"1","author":"Wang","year":"2015","journal-title":"Robot. Sci. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Engelcke, M., Rao, D., Wang, D., Tong, C., and Posner, I. (June, January 29). Vote3deep: Fast object detection in 3d point clouds using efficient convolutional neural networks. Proceedings of the 2017 IEEE International Conference On Robotics And Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989161"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Graham, B., and Maaten, L. (2017). Submanifold sparse convolutional networks. arXiv.","DOI":"10.1109\/CVPR.2018.00961"},{"key":"ref_8","unstructured":"Abdelouahab, K., Pelcat, M., Serot, J., and Berry, F. (2018). Accelerating CNN inference on FPGAs: A survey. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1007\/s00521-018-3761-1","article-title":"A survey of FPGA-based accelerators for convolutional neural networks","volume":"32","author":"Mittal","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Rahman, A., Oh, S., Lee, J., and Choi, K. (2017, January 27\u201331). Design space exploration of FPGA accelerators for convolutional neural networks. Proceedings of the Design, Automation & Test In Europe Conference & Exhibition (DATE), Lausanne, Switzerland.","DOI":"10.23919\/DATE.2017.7927162"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1109\/JETCAS.2019.2910232","article-title":"Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices","volume":"9","author":"Chen","year":"2019","journal-title":"IEEE J. Emerg. Sel. Top. Circuits Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Qiu, J., Wang, J., Yao, S., Guo, K., Li, B., Zhou, E., Yu, J., Tang, T., Xu, N., and Song, S. (2016, January 21\u201323). Going Deeper with Embedded FPGA Platform for Convolutional Neural Network. Proceedings of the 2016 ACM\/SIGDA International Symposium On Field-Programmable Gate Arrays, Monterey, CA, USA.","DOI":"10.1145\/2847263.2847265"},{"key":"ref_13","unstructured":"(2021, January 20). Xilinx Adaptable and Real-Time AI Inference Acceleration. Available online: https:\/\/www.xilinx.com\/products\/design-tools\/vitis\/vitis-ai.html."},{"key":"ref_14","unstructured":"(2021, July 01). CDL A Deep Learning Platform Optimized for Implementation of FPGAs. Available online: https:\/\/coredeeplearning.ai\/,."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference On Computer Vision And Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_17","unstructured":"Abdelouahab, K., Pelcat, M., Serot, J., Bourrasset, C., Quinton, J., and Berry, F. (2017). Hardware Automated Dataflow Deployment of CNNs. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1145\/3007787.3001177","article-title":"Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks","volume":"44","author":"Chen","year":"2016","journal-title":"ACM SIGARCH Comput. Archit. News"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3007787.3001138","article-title":"Cnvlutin: Ineffectual-neuron-free deep neural network computing","volume":"44","author":"Albericio","year":"2016","journal-title":"ACM SIGARCH Comput. Archit. News"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhang, S., Du, Z., Zhang, L., Lan, H., Liu, S., Li, L., Guo, Q., Chen, T., and Chen, Y. (2016, January 15\u201319). Cambricon-X: An accelerator for sparse neural networks. Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium On Microarchitecture (MICRO), Taipei, Taiwan.","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"ref_21","unstructured":"Han, S., Kang, J., Mao, H., Hu, Y., Li, X., Li, Y., Xie, D., Luo, H., Yao, S., and Wang, Y. (2017, January 22\u201324). Ese: Efficient speech recognition engine with sparse lstm on fpga. Proceedings of the 2017 ACM\/SIGDA International Symposium On Field-Programmable Gate Arrays, Monterey, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Lu, L., Xie, J., Huang, R., Zhang, J., Lin, W., and Liang, Y. (May, January 28). An efficient hardware accelerator for sparse convolutional neural networks on FPGAs. Proceedings of the 2019 IEEE 27th Annual International Symposium On Field-Programmable Custom Computing Machines (FCCM), San Diego, CA, USA.","DOI":"10.1109\/FCCM.2019.00013"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lang, A., Vora, S., Caesar, H., Zhou, L., Yang, J., and Beijbom, O. (2019, January 15\u201320). Pointpillars: Fast encoders for object detection from point clouds. Proceedings of the IEEE\/CVF Conference On Computer Vision And Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01298"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yan, Y., Mao, Y., and Li, B. (2018). Second: Sparsely embedded convolutional detection. Sensors, 18.","DOI":"10.3390\/s18103337"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Silva, J., Pereira, P., Machado, R., N\u00e9voa, R., Melo-Pinto, P., and Fernandes, D. (2022). Customizable FPGA-based Hardware Accelerator For Standard Convolution Processes Empowered with Quantization Applied to LiDAR Data. Sensors, 22.","DOI":"10.3390\/s22062184"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4196","DOI":"10.1109\/TCSI.2018.2840092","article-title":"Energy-Efficient Convolution Architecture Based on Rescheduled Dataflow","volume":"65","author":"Jo","year":"2018","journal-title":"IEEE Trans. Circuits Syst. I Regul. Pap."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1109\/JSSC.2016.2616357","article-title":"Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks","volume":"52","author":"Chen","year":"2017","journal-title":"IEEE J. Solid-State Circuits"},{"key":"ref_28","unstructured":"(2021, January 15). Versatran01 Kittiarchives. GitHub Repository. Available online: https:\/\/gist.github.com\/versatran01\/19bbb78c42e0cafb1807625bbb99bd85."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/2943\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:52:24Z","timestamp":1760136744000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/2943"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,12]]},"references-count":28,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,4]]}},"alternative-id":["s22082943"],"URL":"https:\/\/doi.org\/10.3390\/s22082943","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,4,12]]}}}