{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T13:53:52Z","timestamp":1770818032237,"version":"3.50.1"},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2021,2,13]],"date-time":"2021-02-13T00:00:00Z","timestamp":1613174400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,2,13]],"date-time":"2021-02-13T00:00:00Z","timestamp":1613174400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100009033","name":"Center of Innovation Program","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100009033","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003382","name":"Core Research for Evolutional Science and Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003382","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003051","name":"New Energy and Industrial Technology Development Organization","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003051","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Sign Process Syst"],"published-print":{"date-parts":[[2021,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Convolutional neural networks (CNNs) exhibit state-of-the-art performance while performing computer-vision tasks. CNNs require high-speed, low-power, and high-accuracy hardware for various scenarios, such as edge environments. 
However, the number of weights is so large that embedded systems cannot store them owing to their limited on-chip memory. A different method is used to minimize the input image size, for real-time processing, but it causes a considerable drop in accuracy. Although pruned sparse CNNs and special accelerators are proposed, the requirement of random access incurs a large number of wide multiplexers for a high degree of parallelism, which becomes more complicated and unsuitable for FPGA implementation. To address this problem, we propose <jats:italic>filter-wise pruning with distillation<\/jats:italic> and block RAM (BRAM)-based zero-weight skipping accelerator. It eliminates weights such that each filter has the same number of nonzero weights, performing retraining with distillation, while retaining comparable accuracy. Further, filter-wise pruning enables our accelerator to exploit <jats:italic>inter-filter parallelism<\/jats:italic>, where a processing block for a layer executes filters concurrently, with a straightforward architecture. We also propose an <jats:italic>overlapped tiling algorithm<\/jats:italic>, where tiles are extracted with overlap to prevent both accuracy degradation and high utilization of BRAMs storing high-resolution images. Our evaluation using semantic-segmentation tasks showed a 1.8 times speedup and 18.0 times increase in power efficiency of our FPGA design compared with a desktop GPU. Additionally, compared with the conventional FPGA implementation, the speedup and accuracy improvement were 1.09 times and 6.6 points, respectively. 
Therefore, our approach is useful for FPGA implementation and exhibits considerable accuracy for applications in embedded systems.<\/jats:p>","DOI":"10.1007\/s11265-021-01642-6","type":"journal-article","created":{"date-parts":[[2021,2,14]],"date-time":"2021-02-14T09:25:44Z","timestamp":1613294744000},"page":"499-512","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["FPGA-Based Inter-layer Pipelined Accelerators for Filter-Wise Weight-Balanced Sparse Fully Convolutional Networks with Overlapped Tiling"],"prefix":"10.1007","volume":"93","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4627-0957","authenticated-orcid":false,"given":"Masayuki","family":"Shimoda","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Youki","family":"Sada","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hiroki","family":"Nakahara","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,2,13]]},"reference":[{"key":"1642_CR1","doi-asserted-by":"publisher","unstructured":"Albericio, J., Judd, P., Hetherington, T., Aamodt, T., Jerger, N. E., & Moshovos, A. (2016). Cnvlutin: ineffectual-neuron-free deep neural network computing. In 2016 ACM\/IEEE 43rd annual international symposium on computer architecture (ISCA). https:\/\/doi.org\/10.1109\/ISCA.2016.11 (pp. 1\u201313).","DOI":"10.1109\/ISCA.2016.11"},{"key":"1642_CR2","unstructured":"Alvarez, J. M., & Salzmann, M. (2017). Compression-aware training of deep networks. In Advances in neural information processing systems (pp. 856\u2013867)."},{"issue":"12","key":"1642_CR3","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan, V., Kendall, A., & Cipolla, R. 
(2017). Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2481\u20132495. https:\/\/doi.org\/10.1109\/TPAMI.2016.2644615.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1642_CR4","unstructured":"Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L. D., Monfort, M., Muller, U., Zhang, J., & et al. (2016). End to end learning for self-driving cars. arXiv:1604.07316."},{"key":"1642_CR5","doi-asserted-by":"publisher","unstructured":"Cao, S., Zhang, C., Yao, Z., Xiao, W., Nie, L., Zhan, D., Liu, Y., Wu, M., & Zhang, L. (2019). Efficient and effective sparse lstm on fpga with bank-balanced sparsity. In Proceedings of the 2019 ACM\/SIGDA international symposium on field-programmable gate arrays, FPGA \u201919. https:\/\/doi.org\/10.1145\/3289602.3293898 (pp. 63\u201372). New York: Association for Computing Machinery.","DOI":"10.1145\/3289602.3293898"},{"key":"1642_CR6","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017). Xception: deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1251\u2013 1258).","DOI":"10.1109\/CVPR.2017.195"},{"key":"1642_CR7","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., & Schiele, B. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3213\u20133223).","DOI":"10.1109\/CVPR.2016.350"},{"key":"1642_CR8","unstructured":"Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., & Bengio, Y. (2016). Binarized neural networks: training deep neural networks with weights and activations constrained to +\u20091 or \u2212\u20091. 
arXiv:1602.02830."},{"key":"1642_CR9","doi-asserted-by":"publisher","unstructured":"Deng, C., Liao, S., Xie, Y., Parhi, K. K., Qian, X., & Yuan, B. (2018). Permdnn: efficient compressed dnn architecture with permuted diagonal matrices. In 2018 51st Annual IEEE\/ACM international symposium on microarchitecture (MICRO). https:\/\/doi.org\/10.1109\/MICRO.2018.00024 (pp. 189\u2013202).","DOI":"10.1109\/MICRO.2018.00024"},{"key":"1642_CR10","doi-asserted-by":"publisher","unstructured":"Fan, H., Liu, S., Ferianc, M., Ng, H., Que, Z., Liu, S., Niu, X., & Luk, W. (2018). A real-time object detection accelerator with compressed ssdlite on fpga. In 2018 International conference on field-programmable technology (FPT). https:\/\/doi.org\/10.1109\/FPT.2018.00014 (pp. 14\u201321).","DOI":"10.1109\/FPT.2018.00014"},{"key":"1642_CR11","doi-asserted-by":"crossref","unstructured":"Fang, S., Tian, L., Wang, J., Liang, S., Xie, D., Chen, Z., Sui, L., Yu, Q., Sun, X., Yao, S., Shan, Y., & Wang, Y. (2018). Real-time object detection and semantic segmentation hardware system with deep learning networks. In International conference on field programmable technology, ICFPT 2018, Okinawa, Japan, December 10\u201314, 2018, p. (to be appear).","DOI":"10.1109\/FPT.2018.00081"},{"key":"1642_CR12","unstructured":"Gray, S., Radford, A., & Kingma, D. P. (2017). Gpu kernels for block-sparse weights. arXiv:1711.09224, 3."},{"key":"1642_CR13","unstructured":"Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. In Advances in neural information processing systems (pp. 1135\u20131143)."},{"key":"1642_CR14","doi-asserted-by":"crossref","unstructured":"Han, S., Liu, X., Mao, H., Pu, J., Pedram, A., Horowitz, M. A., & Dally, W. J. (2016). Eie: efficient inference engine on compressed deep neural network. In 2016 ACM\/IEEE 43rd annual international symposium on computer architecture (ISCA) (pp. 243\u2013254). 
IEEE.","DOI":"10.1109\/ISCA.2016.30"},{"key":"1642_CR15","doi-asserted-by":"publisher","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on computer vision and pattern recognition (CVPR). https:\/\/doi.org\/10.1109\/CVPR.2016.90 (pp. 770\u2013778).","DOI":"10.1109\/CVPR.2016.90"},{"key":"1642_CR16","doi-asserted-by":"crossref","unstructured":"He, Y., Zhang, X., & Sun, J. (2017). Channel pruning for accelerating very deep neural networks. In Proceedings of the IEEE international conference on computer vision (pp. 1389\u20131397).","DOI":"10.1109\/ICCV.2017.155"},{"key":"1642_CR17","doi-asserted-by":"crossref","unstructured":"He, Y., Liu, P., Wang, Z., Hu, Z., & Yang, Y. (2019). Filter pruning via geometric median for deep convolutional neural networks acceleration. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4340\u20134349).","DOI":"10.1109\/CVPR.2019.00447"},{"key":"1642_CR18","unstructured":"Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv:1503.02531."},{"key":"1642_CR19","unstructured":"Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H. (2017). Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861."},{"key":"1642_CR20","unstructured":"Ioffe, S., & Szegedy, C. (2015). Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167."},{"key":"1642_CR21","doi-asserted-by":"crossref","unstructured":"Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., & Kalenichenko, D. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 
2704\u20132713).","DOI":"10.1109\/CVPR.2018.00286"},{"key":"1642_CR22","doi-asserted-by":"publisher","unstructured":"Kang, H. (2019). Real-time object detection on 640x480 image with vgg16+ssd. In 2019 International conference on field-programmable technology (ICFPT). https:\/\/doi.org\/10.1109\/ICFPT47387.2019.00082 (pp. 419\u2013422).","DOI":"10.1109\/ICFPT47387.2019.00082"},{"issue":"7","key":"1642_CR23","first-page":"2093","volume":"30","author":"HJ Kang","year":"2019","unstructured":"Kang, H. J. (2019). Accelerator-aware pruning for convolutional neural networks. IEEE Transactions on Circuits and Systems for Video Technology, 30(7), 2093\u20132103. IEEE.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"1642_CR24","unstructured":"Krishnamoorthi, R. (2018). Quantizing deep convolutional networks for efficient inference: a whitepaper. arXiv:1806.08342."},{"key":"1642_CR25","unstructured":"Krizhevsky, A., Sutskever, I., & Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. In Proceedings of the 25th international conference on neural information processing systems, NIPS\u201912. http:\/\/dl.acm.org\/citation.cfm?id=2999134.2999257, (Vol. 1 pp. 1097\u20131105). Curran Associates Inc."},{"issue":"5","key":"1642_CR26","doi-asserted-by":"publisher","first-page":"1218","DOI":"10.1109\/TVLSI.2019.2897052","volume":"27","author":"B Lai","year":"2019","unstructured":"Lai, B., Pan, J., & Lin, C. (2019). Enhancing utilization of simd-like accelerator for sparse convolutional neural networks. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 27(5), 1218\u20131222. 
https:\/\/doi.org\/10.1109\/TVLSI.2019.2897052.","journal-title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems"},{"issue":"7553","key":"1642_CR27","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y Lecun","year":"2015","unstructured":"Lecun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436\u2013444. https:\/\/doi.org\/10.1038\/nature14539.","journal-title":"Nature"},{"key":"1642_CR28","doi-asserted-by":"publisher","unstructured":"Li, J., Yan, G., Lu, W., Jiang, S., Gong, S., Wu, J., & Li, X. (2018). Ccr: a concise convolution rule for sparse neural network accelerators. In 2018 Design, automation test in europe conference exhibition (DATE). https:\/\/doi.org\/10.23919\/DATE.2018.8342001 (pp. 189\u2013194).","DOI":"10.23919\/DATE.2018.8342001"},{"key":"1642_CR29","doi-asserted-by":"crossref","unstructured":"Lin, C.Y., & Lai, B.C. (2018). Supporting compressed-sparse activations and weights on simd-like accelerator for sparse convolutional neural networks. In Proceedings of the 23rd Asia and South Pacific design automation conference, ASPDAC \u201918. http:\/\/dl.acm.org\/citation.cfm?id=3201607.3201630 (pp. 105\u2013110). Piscataway: IEEE Press.","DOI":"10.1109\/ASPDAC.2018.8297290"},{"key":"1642_CR30","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S. E., Fu, C. Y., & Berg, A. C. (2016). Ssd: single shot multibox detector. European conference on computer vision.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"1642_CR31","doi-asserted-by":"crossref","unstructured":"Luo, J. H., Wu, J., & Lin, W. (2017). Thinet: a filter level pruning method for deep neural network compression. In Proceedings of the IEEE international conference on computer vision (pp. 5058\u20135066).","DOI":"10.1109\/ICCV.2017.541"},{"key":"1642_CR32","doi-asserted-by":"publisher","unstructured":"Lyu, Y., Bai, L., & Huang, X. (2018). 
Real-time road segmentation using lidar data processing on an fpga. In 2018 IEEE international symposium on circuits and systems (ISCAS). https:\/\/doi.org\/10.1109\/ISCAS.2018.8351244(pp. 1\u20135).","DOI":"10.1109\/ISCAS.2018.8351244"},{"key":"1642_CR33","doi-asserted-by":"crossref","unstructured":"Mao, H., Han, S., Pool, J., Li, W., Liu, X., Wang, Y., & Dally, W. J. (2017). Exploring the regularity of sparse structure in convolutional neural networks. arXiv:1705.08922.","DOI":"10.1109\/CVPRW.2017.241"},{"key":"1642_CR34","unstructured":"Molchanov, D., Ashukha, A., & Vetrov, D. (2017). Variational dropout sparsifies deep neural networks. arXiv:1701.05369."},{"key":"1642_CR35","unstructured":"Narang, S., Undersander, E., & Diamos, G. (2017). Block-sparse recurrent neural networks. arXiv:1711.02782."},{"key":"1642_CR36","unstructured":"NVIDIA. (2020). TensorRT. https:\/\/developer.nvidia.com\/tensorrt."},{"key":"1642_CR37","doi-asserted-by":"publisher","unstructured":"Parashar, A., Rhu, M., Mukkara, A., Puglielli, A., Venkatesan, R., Khailany, B., Emer, J., Keckler, S. W., & Dally, W. J. (2017). Scnn: an accelerator for compressed-sparse convolutional neural networks. In 2017 ACM\/IEEE 44th annual international symposium on computer architecture (ISCA). https:\/\/doi.org\/10.1145\/3079856.3080254 (pp. 27\u201340).","DOI":"10.1145\/3079856.3080254"},{"key":"1642_CR38","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., & Chintala, S. (2019). Pytorch: an imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d\u2019 Alch\u00e9-Buc, E. Fox, & R. Garnett (Eds.) Advances in neural information processing systems (Vol. 32, pp. 8024\u20138035). 
http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf. Curran Associates, Inc."},{"issue":"1","key":"1642_CR39","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1109\/TITS.2017.2750080","volume":"19","author":"E Romera","year":"2017","unstructured":"Romera, E., Alvarez, J. M., Bergasa, L. M., & Arroyo, R. (2017). Erfnet: efficient residual factorized convnet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 19(1), 263\u2013272.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"1642_CR40","doi-asserted-by":"crossref","unstructured":"Saqib, M., Daud Khan, S., Sharma, N., & Blumenstein, M. (2017). A study on detecting drones using deep convolutional neural networks. In 2017 14th IEEE international conference on advanced video and signal based surveillance (AVSS) (pp. 1\u20135).","DOI":"10.1109\/AVSS.2017.8078541"},{"issue":"4","key":"1642_CR41","doi-asserted-by":"publisher","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","volume":"39","author":"E Shelhamer","year":"2017","unstructured":"Shelhamer, E., Long, J., & Darrell, T. (2017). Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 640\u2013651. https:\/\/doi.org\/10.1109\/TPAMI.2016.2572683.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1642_CR42","doi-asserted-by":"publisher","unstructured":"Shimoda, M., Sada, Y., & Nakahara, H. (2019). Filter-wise pruning approach to FPGA implementation of fully convolutional network for semantic segmentation. 
In Applied reconfigurable computing - 15th international symposium, ARC 2019, Darmstadt, Germany, April 9\u201311, 2019, Proceedings. https:\/\/doi.org\/10.1007\/978-3-030-17227-5_26 (pp. 371\u2013386).","DOI":"10.1007\/978-3-030-17227-5_26"},{"key":"1642_CR43","unstructured":"Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556."},{"key":"1642_CR44","unstructured":"Tokui, S., Oono, K., Hido, S., & Clayton, J. (2015). Chainer: a next-generation open source framework for deep learning. In Proceedings of workshop on machine learning systems (LearningSys) in the twenty-ninth annual conference on neural information processing systems (NIPS). http:\/\/learningsys.org\/papers\/LearningSys_2015_paper_33.pdf."},{"key":"1642_CR45","doi-asserted-by":"publisher","unstructured":"Wang, J., Yuan, Z., Liu, R., Yang, H., & Liu, Y. (2019). An n-way group association architecture and sparse data group association load balancing algorithm for sparse cnn accelerators. In Proceedings of the 24th Asia and South Pacific design automation conference, ASPDAC \u201919. https:\/\/doi.org\/10.1145\/3287624.3287626. http:\/\/doi.acm.org\/10.1145\/3287624.3287626 (pp. 329\u2013334). New York: ACM.","DOI":"10.1145\/3287624.3287626"},{"key":"1642_CR46","unstructured":"Wong, H.T.H. A superscalar out-of-order x86 soft processor for fpga. Ph.D. thesis."},{"key":"1642_CR47","doi-asserted-by":"crossref","unstructured":"Wu, B., Wan, A., Yue, X., Jin, P., Zhao, S., Golmant, N., Gholaminejad, A., Gonzalez, J., & Keutzer, K. (2018). Shift: a zero flop, zero parameter alternative to spatial convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 9127\u20139135).","DOI":"10.1109\/CVPR.2018.00951"},{"key":"1642_CR48","doi-asserted-by":"crossref","unstructured":"Xiang, Y., Schmidt, T., Narayanan, V., & Fox, D. (2017). 
Posecnn: a convolutional neural network for 6d object pose estimation in cluttered scenes. arXiv:1711.00199.","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"1642_CR49","doi-asserted-by":"crossref","unstructured":"Yang, Y., Huang, Q., Wu, B., Zhang, T., Ma, L., Gambardella, G., Blott, M., Lavagno, L., Vissers, K., Wawrzynek, J., & et al. (2019). Synetgy: algorithm-hardware co-design for convnet accelerators on embedded fpgas. In Proceedings of the 2019 ACM\/SIGDA international symposium on field-programmable gate arrays (pp. 23\u201332).","DOI":"10.1145\/3289602.3293902"},{"key":"1642_CR50","doi-asserted-by":"publisher","unstructured":"Yu, J., Lukefahr, A., Palframan, D., Dasika, G., Das, R., & Mahlke, S. (2017). Scalpel: customizing dnn pruning to the underlying hardware parallelism. In Proceedings of the 44th annual international symposium on computer architecture, ISCA \u201917. (pp. 548\u2013560). New York: ACM https:\/\/doi.org\/10.1145\/3079856.3080215.","DOI":"10.1145\/3079856.3080215"},{"key":"1642_CR51","doi-asserted-by":"publisher","unstructured":"Yuan, Z., Yue, J., Yang, H., Wang, Z., Li, J., Yang, Y., Guo, Q., Li, X., Chang, M., Yang, H., & Liu, Y. (2018). Sticker: a 0.41-62.1 tops\/w 8bit neural network processor with multi-sparsity compatible convolution arrays and online tuning acceleration for fully connected layers. In 2018 IEEE symposium on VLSI circuits. https:\/\/doi.org\/10.1109\/VLSIC.2018.8502404 (pp. 33\u201334).","DOI":"10.1109\/VLSIC.2018.8502404"},{"key":"1642_CR52","doi-asserted-by":"publisher","unstructured":"Zhang, S., Du, Z., Zhang, L., Lan, H., Liu, S., Li, L., Guo, Q., Chen, T., & Chen, Y. (2016). Cambricon-x: an accelerator for sparse neural networks. In 2016 49th Annual IEEE\/ACM international symposium on microarchitecture (MICRO). https:\/\/doi.org\/10.1109\/MICRO.2016.7783723 (pp. 
1\u201312).","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"1642_CR53","doi-asserted-by":"publisher","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid scene parsing network. In CVPR. https:\/\/doi.org\/10.1109\/CVPR.2017.660 (pp. 6230\u20136239).","DOI":"10.1109\/CVPR.2017.660"},{"key":"1642_CR54","doi-asserted-by":"publisher","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid scene parsing network. In 2017 IEEE conference on computer vision and pattern recognition (CVPR). https:\/\/doi.org\/10.1109\/CVPR.2017.660 (pp. 6230\u20136239).","DOI":"10.1109\/CVPR.2017.660"},{"key":"1642_CR55","doi-asserted-by":"crossref","unstructured":"Zhao, H., Qi, X., Shen, X., Shi, J., & Jia, J. (2018). ICNet for real-time semantic segmentation on high-resolution images. In ECCV.","DOI":"10.1007\/978-3-030-01219-9_25"},{"key":"1642_CR56","doi-asserted-by":"publisher","unstructured":"Zhou, X., Du, Z., Guo, Q., Liu, S., Liu, C., Wang, C., Zhou, X., Li, L., Chen, T., & Chen, Y. (2018). Cambricon-s: addressing irregularity in sparse neural networks through a cooperative software\/hardware approach. In 2018 51st Annual IEEE\/ACM international symposium on microarchitecture (MICRO). https:\/\/doi.org\/10.1109\/MICRO.2018.00011 (pp. 15\u201328).","DOI":"10.1109\/MICRO.2018.00011"},{"key":"1642_CR57","unstructured":"Zhu, M., & Gupta, S. (2017). To prune, or not to prune: exploring the efficacy of pruning for model compression. 
CoRR arXiv:1710.01878."}],"container-title":["Journal of Signal Processing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01642-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11265-021-01642-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11265-021-01642-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,19]],"date-time":"2021-05-19T05:05:38Z","timestamp":1621400738000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11265-021-01642-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,13]]},"references-count":57,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,5]]}},"alternative-id":["1642"],"URL":"https:\/\/doi.org\/10.1007\/s11265-021-01642-6","relation":{},"ISSN":["1939-8018","1939-8115"],"issn-type":[{"value":"1939-8018","type":"print"},{"value":"1939-8115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,13]]},"assertion":[{"value":"11 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 December 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 January 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 February 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}