{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:44:56Z","timestamp":1772725496810,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["2112562, 1955246, 1937435"],"award-info":[{"award-number":["2112562, 1955246, 1937435"]}]},{"name":"ARO","award":["W911NF-19-2-0107"],"award-info":[{"award-number":["W911NF-19-2-0107"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3527419","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"522-535","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":29,"title":["Cascading structured pruning"],"prefix":"10.1145","author":[{"given":"Edward","family":"Hanson","sequence":"first","affiliation":[{"name":"Duke University"}]},{"given":"Shiyu","family":"Li","sequence":"additional","affiliation":[{"name":"Duke University"}]},{"given":"Hai 'Helen'","family":"Li","sequence":"additional","affiliation":[{"name":"Duke University"}]},{"given":"Yiran","family":"Chen","sequence":"additional","affiliation":[{"name":"Duke University"}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"382","volume-title":"MICRO 2017","author":"Albericio J.","year":"2017","unstructured":"J. Albericio , A. Delmas , P. Judd , S. Sharify , G. O'Leary , R. Genov , and A. Moshovos , \" Bit-pragmatic deep neural network computing,\" in Proceedings of the 50th Annual IEEE\/ACM International Symposium on Microarchitecture , MICRO 2017 , Cambridge, MA, USA, October 14--18 , 2017 . ACM, 2017, pp. 382 -- 394 . J. Albericio, A. Delmas, P. Judd, S. Sharify, G. O'Leary, R. Genov, and A. Moshovos, \"Bit-pragmatic deep neural network computing,\" in Proceedings of the 50th Annual IEEE\/ACM International Symposium on Microarchitecture, MICRO 2017, Cambridge, MA, USA, October 14--18, 2017. ACM, 2017, pp. 382--394."},{"key":"e_1_3_2_1_2_1","first-page":"1","volume-title":"Cnvlutin: Ineffectual-neuron-free deep neural network computing,\" in 2016 ACM\/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)","author":"Albericio J.","year":"2016","unstructured":"J. Albericio , P. Judd , T. Hetherington , T. Aamodt , N. E. Jerger , and A. Moshovos , \" Cnvlutin: Ineffectual-neuron-free deep neural network computing,\" in 2016 ACM\/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA) , 2016 , pp. 1 -- 13 . J. Albericio, P. Judd, T. Hetherington, T. Aamodt, N. E. Jerger, and A. Moshovos, \"Cnvlutin: Ineffectual-neuron-free deep neural network computing,\" in 2016 ACM\/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 1--13."},{"key":"e_1_3_2_1_3_1","first-page":"269","article-title":"DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning,\" in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS '14. New York, NY","author":"Chen T.","year":"2014","unstructured":"T. Chen , Z. Du , N. Sun , J. Wang , C. Wu , Y. Chen , and O. Temam , \" DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning,\" in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS '14. New York, NY , USA: Association for Computing Machinery , 2014 , p. 269 -- 284 . T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, \"DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning,\" in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS '14. New York, NY, USA: Association for Computing Machinery, 2014, p. 269--284.","journal-title":"USA: Association for Computing Machinery"},{"key":"e_1_3_2_1_4_1","volume-title":"transformers. zip: Compressing transformers with pruning and quantization,\" Technical report","author":"Cheong R.","year":"2019","unstructured":"R. Cheong and R. Daniel , \" transformers. zip: Compressing transformers with pruning and quantization,\" Technical report , Stanford University , 2019 . R. Cheong and R. Daniel, \"transformers. zip: Compressing transformers with pruning and quantization,\" Technical report, Stanford University, 2019."},{"key":"e_1_3_2_1_5_1","volume-title":"PermDNN: Efficient compressed DNN architecture with permuted diagonal matrices,\" in Proceedings of the 51st Annual IEEE\/ACM International Symposium on Microarchitecture, ser. MICRO-51","author":"Deng C.","year":"2018","unstructured":"C. Deng , S. Liao , Y. Xie , K. K. Parhi , X. Qian , and B. Yuan , \" PermDNN: Efficient compressed DNN architecture with permuted diagonal matrices,\" in Proceedings of the 51st Annual IEEE\/ACM International Symposium on Microarchitecture, ser. MICRO-51 . IEEE Press , 2018 , p. 189--202. C. Deng, S. Liao, Y. Xie, K. K. Parhi, X. Qian, and B. Yuan, \"PermDNN: Efficient compressed DNN architecture with permuted diagonal matrices,\" in Proceedings of the 51st Annual IEEE\/ACM International Symposium on Microarchitecture, ser. MICRO-51. IEEE Press, 2018, p. 189--202."},{"key":"e_1_3_2_1_6_1","first-page":"264","volume-title":"TIE: Energy-efficient tensor train-based inference engine for deep neural network,\" in 2019 ACM\/IEEE 46th Annual International Symposium on Computer Architecture (ISCA)","author":"Deng C.","year":"2019","unstructured":"C. Deng , F. Sun , X. Qian , J. Lin , Z. Wang , and B. Yuan , \" TIE: Energy-efficient tensor train-based inference engine for deep neural network,\" in 2019 ACM\/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) , 2019 , pp. 264 -- 277 . C. Deng, F. Sun, X. Qian, J. Lin, Z. Wang, and B. Yuan, \"TIE: Energy-efficient tensor train-based inference engine for deep neural network,\" in 2019 ACM\/IEEE 46th Annual International Symposium on Computer Architecture (ISCA), 2019, pp. 264--277."},{"key":"e_1_3_2_1_7_1","first-page":"248","volume-title":"ImageNet: A large-scale hierarchical image database,\" in 2009 IEEE Conference on Computer Vision and Pattern Recognition","author":"Deng J.","year":"2009","unstructured":"J. Deng , W. Dong , R. Socher , L.-J. Li , K. Li , and L. Fei-Fei , \" ImageNet: A large-scale hierarchical image database,\" in 2009 IEEE Conference on Computer Vision and Pattern Recognition , 2009 , pp. 248 -- 255 . J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, \"ImageNet: A large-scale hierarchical image database,\" in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248--255."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3124552"},{"key":"e_1_3_2_1_9_1","first-page":"1607","volume-title":"Proceedings of Machine Learning Research","volume":"97","author":"Ding X.","year":"2019","unstructured":"X. Ding , G. Ding , Y. Guo , J. Han , and C. Yan , \" Approximated oracle filter pruning for destructive CNN width optimization,\" in Proceedings of the 36th International Conference on Machine Learning, ser . Proceedings of Machine Learning Research , vol. 97 . PMLR, 09--15 Jun 2019 , pp. 1607 -- 1616 . X. Ding, G. Ding, Y. Guo, J. Han, and C. Yan, \"Approximated oracle filter pruning for destructive CNN width optimization,\" in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 97. PMLR, 09--15 Jun 2019, pp. 1607--1616."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358291"},{"key":"e_1_3_2_1_11_1","first-page":"243","volume-title":"ISCA 2016","author":"Han S.","year":"2016","unstructured":"S. Han , X. Liu , H. Mao , J. Pu , A. Pedram , M. A. Horowitz , and W. J. Dally , \" EIE: efficient inference engine on compressed deep neural network,\" in 43rd ACM\/IEEE Annual International Symposium on Computer Architecture , ISCA 2016 , Seoul, South Korea, June 18--22 , 2016 . IEEE Computer Society, 2016, pp. 243 -- 254 . S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, \"EIE: efficient inference engine on compressed deep neural network,\" in 43rd ACM\/IEEE Annual International Symposium on Computer Architecture, ISCA 2016, Seoul, South Korea, June 18--22, 2016. IEEE Computer Society, 2016, pp. 243--254."},{"key":"e_1_3_2_1_12_1","volume-title":"ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings","author":"Han S.","year":"2016","unstructured":"S. Han , H. Mao , and W. J. Dally , \" Deep Compression: Compressing deep neural network with pruning, trained quantization and huffman coding,\" in 4th International Conference on Learning Representations , ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings , 2016 . S. Han, H. Mao, and W. J. Dally, \"Deep Compression: Compressing deep neural network with pruning, trained quantization and huffman coding,\" in 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings, 2016."},{"key":"e_1_3_2_1_13_1","first-page":"770","article-title":"Deep residual learning for image recognition","author":"He K.","year":"2016","unstructured":"K. He , X. Zhang , S. Ren , and J. Sun , \" Deep residual learning for image recognition ,\" in Proceedings of the IEEE conference on computer vision and pattern recognition , 2016 , pp. 770 -- 778 . K. He, X. Zhang, S. Ren, and J. Sun, \"Deep residual learning for image recognition,\" in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770--778.","journal-title":"Proceedings of the IEEE conference on computer vision and pattern recognition"},{"key":"e_1_3_2_1_14_1","first-page":"2234","volume-title":"IJCAI'18","author":"He Y.","year":"2018","unstructured":"Y. He , G. Kang , X. Dong , Y. Fu , and Y. Yang , \" Soft filter pruning for accelerating deep convolutional neural networks,\" in Proceedings of the 27th International Joint Conference on Artificial Intelligence, ser . IJCAI'18 . AAAI Press , 2018 , p. 2234 -- 2240 . Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang, \"Soft filter pruning for accelerating deep convolutional neural networks,\" in Proceedings of the 27th International Joint Conference on Artificial Intelligence, ser. IJCAI'18. AAAI Press, 2018, p. 2234--2240."},{"key":"e_1_3_2_1_15_1","first-page":"4340","article-title":"Filter pruning via geometric median for deep convolutional neural networks acceleration","author":"He Y.","year":"2019","unstructured":"Y. He , P. Liu , Z. Wang , Z. Hu , and Y. Yang , \" Filter pruning via geometric median for deep convolutional neural networks acceleration ,\" in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , 2019 , pp. 4340 -- 4349 . Y. He, P. Liu, Z. Wang, Z. Hu, and Y. Yang, \"Filter pruning via geometric median for deep convolutional neural networks acceleration,\" in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 4340--4349.","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358275"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2019.2911674"},{"key":"e_1_3_2_1_18_1","volume-title":"Hinton et al., \"Learning multiple layers of features from tiny images","author":"Krizhevsky A.","year":"2009","unstructured":"A. Krizhevsky , G. Hinton et al., \"Learning multiple layers of features from tiny images ,\" 2009 . A. Krizhevsky, G. Hinton et al., \"Learning multiple layers of features from tiny images,\" 2009."},{"key":"e_1_3_2_1_19_1","first-page":"1097","volume-title":"NIPS'12","author":"Krizhevsky A.","year":"2012","unstructured":"A. Krizhevsky , I. Sutskever , and G. E. Hinton , \" ImageNet classification with deep convolutional neural networks,\" in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, ser . NIPS'12 . Red Hook, NY, USA: Curran Associates Inc. , 2012 , p. 1097 -- 1105 . A. Krizhevsky, I. Sutskever, and G. E. Hinton, \"ImageNet classification with deep convolutional neural networks,\" in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, ser. NIPS'12. Red Hook, NY, USA: Curran Associates Inc., 2012, p. 1097--1105."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304028"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358252"},{"key":"e_1_3_2_1_22_1","first-page":"2554","volume-title":"Fast ConvNets using group-wise brain damage,\" in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Lebedev V.","year":"2016","unstructured":"V. Lebedev and V. Lempitsky , \" Fast ConvNets using group-wise brain damage,\" in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) . Los Alamitos, CA, USA : IEEE Computer Society , jun 2016 , pp. 2554 -- 2564 . V. Lebedev and V. Lempitsky, \"Fast ConvNets using group-wise brain damage,\" in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA, USA: IEEE Computer Society, jun 2016, pp. 2554--2564."},{"key":"e_1_3_2_1_23_1","volume-title":"ICLR 2017, Toulon, France, April 24--26, 2017, Conference Track Proceedings. Open- Review.net","author":"Li H.","year":"2017","unstructured":"H. Li , A. Kadav , I. Durdanovic , H. Samet , and H. P. Graf , \" Pruning filters for efficient convnets,\" in 5th International Conference on Learning Representations , ICLR 2017, Toulon, France, April 24--26, 2017, Conference Track Proceedings. Open- Review.net , 2017 . H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, \"Pruning filters for efficient convnets,\" in 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24--26, 2017, Conference Track Proceedings. Open- Review.net, 2017."},{"key":"e_1_3_2_1_24_1","volume-title":"Sanger: A co-design framework for enabling sparse attention using reconfigurable architecture,\" MICRO-54:  54th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Lu L.","year":"2021","unstructured":"L. Lu , Y. Jin , H. Bi , Z. Luo , P. Li , T. Wang , and Y. Liang , \" Sanger: A co-design framework for enabling sparse attention using reconfigurable architecture,\" MICRO-54: 54th Annual IEEE\/ACM International Symposium on Microarchitecture , 2021 . L. Lu, Y. Jin, H. Bi, Z. Luo, P. Li, T. Wang, and Y. Liang, \"Sanger: A co-design framework for enabling sparse attention using reconfigurable architecture,\" MICRO-54: 54th Annual IEEE\/ACM International Symposium on Microarchitecture, 2021."},{"key":"e_1_3_2_1_25_1","first-page":"1927","volume-title":"Exploring the granularity of sparsity in convolutional neural networks,\" in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","author":"Mao H.","year":"2017","unstructured":"H. Mao , S. Han , J. Pool , W. Li , X. Liu , Y. Wang , and W. J. Dally , \" Exploring the granularity of sparsity in convolutional neural networks,\" in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) , 2017 , pp. 1927 -- 1934 . H. Mao, S. Han, J. Pool, W. Li, X. Liu, Y. Wang, and W. J. Dally, \"Exploring the granularity of sparsity in convolutional neural networks,\" in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 1927--1934."},{"key":"e_1_3_2_1_26_1","first-page":"311","volume-title":"ACL '02","author":"Papineni K.","year":"2002","unstructured":"K. Papineni , S. Roukos , T. Ward , and W.-J. Zhu , \"BLEU : A method for automatic evaluation of machine translation,\" in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ser . ACL '02 . USA: Association for Computational Linguistics , 2002 , p. 311 -- 318 . K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, \"BLEU: A method for automatic evaluation of machine translation,\" in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ser. ACL '02. USA: Association for Computational Linguistics, 2002, p. 311--318."},{"key":"e_1_3_2_1_27_1","first-page":"304","volume-title":"ISPASS 2019","author":"Parashar A.","year":"2019","unstructured":"A. Parashar , P. Raina , Y. S. Shao , Y. Chen , V. A. Ying , A. Mukkara , R. Venkatesan , B. Khailany , S. W. Keckler , and J. S. Emer , \" Timeloop: A systematic approach to DNN accelerator evaluation,\" in IEEE International Symposium on Performance Analysis of Systems and Software , ISPASS 2019 , Madison, WI, USA, March 24--26 , 2019 . IEEE, 2019, pp. 304 -- 315 . A. Parashar, P. Raina, Y. S. Shao, Y. Chen, V. A. Ying, A. Mukkara, R. Venkatesan, B. Khailany, S. W. Keckler, and J. S. Emer, \"Timeloop: A systematic approach to DNN accelerator evaluation,\" in IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2019, Madison, WI, USA, March 24--26, 2019. IEEE, 2019, pp. 304--315."},{"key":"e_1_3_2_1_28_1","first-page":"27","volume-title":"USA: Association for Computing Machinery","author":"Parashar A.","year":"2017","unstructured":"A. Parashar , M. Rhu , A. Mukkara , A. Puglielli , R. Venkatesan , B. Khailany , J. Emer , S. W. Keckler , and W. J. Dally , \" SCNN: An accelerator for compressed-sparse convolutional neural networks,\" vol. 45, no. 2. New York, NY , USA: Association for Computing Machinery , Jun. 2017 , p. 27 -- 40 . A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S. W. Keckler, and W. J. Dally, \"SCNN: An accelerator for compressed-sparse convolutional neural networks,\" vol. 45, no. 2. New York, NY, USA: Association for Computing Machinery, Jun. 2017, p. 27--40."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304076"},{"key":"e_1_3_2_1_30_1","first-page":"02883","article-title":"SCALE-Sim: Systolic CNN accelerator","volume":"1811","author":"Samajdar A.","year":"2018","unstructured":"A. Samajdar , Y. Zhu , P. N. Whatmough , M. Mattina , and T. Krishna , \" SCALE-Sim: Systolic CNN accelerator ,\" CoRR , vol. abs\/ 1811 . 02883 , 2018 . A. Samajdar, Y. Zhu, P. N. Whatmough, M. Mattina, and T. Krishna, \"SCALE-Sim: Systolic CNN accelerator,\" CoRR, vol. abs\/1811.02883, 2018.","journal-title":"CoRR"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322255"},{"key":"e_1_3_2_1_32_1","volume-title":"Very deep convolutional networks for large-scale image recognition,\" arXiv preprint arXiv:1409.1556","author":"Simonyan K.","year":"2014","unstructured":"K. Simonyan and A. Zisserman , \" Very deep convolutional networks for large-scale image recognition,\" arXiv preprint arXiv:1409.1556 , 2014 . K. Simonyan and A. Zisserman, \"Very deep convolutional networks for large-scale image recognition,\" arXiv preprint arXiv:1409.1556, 2014."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2017.2761740"},{"key":"e_1_3_2_1_34_1","first-page":"00567","article-title":"Rethinking the inception architecture for computer vision","volume":"1512","author":"Szegedy C.","year":"2015","unstructured":"C. Szegedy , V. Vanhoucke , S. Ioffe , J. Shlens , and Z. Wojna , \" Rethinking the inception architecture for computer vision ,\" CoRR , vol. abs\/ 1512 . 00567 , 2015 . C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, \"Rethinking the inception architecture for computer vision,\" CoRR, vol. abs\/1512.00567, 2015.","journal-title":"CoRR"},{"key":"e_1_3_2_1_35_1","first-page":"5998","volume-title":"USA","author":"Vaswani A.","year":"2017","unstructured":"A. Vaswani , N. Shazeer , N. Parmar , J. Uszkoreit , L. Jones , A. N. Gomez , L. Kaiser , and I. Polosukhin , \" Attention is all you need,\" in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA , USA , 2017 , pp. 5998 -- 6008 . A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, \"Attention is all you need,\" in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA, 2017, pp. 5998--6008."},{"key":"e_1_3_2_1_36_1","first-page":"2074","volume-title":"Barcelona","author":"Wen W.","year":"2016","unstructured":"W. Wen , C. Wu , Y. Wang , Y. Chen , and H. Li , \" Learning structured sparsity in deep neural networks,\" in Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5--10, 2016 , Barcelona , Spain , 2016 , pp. 2074 -- 2082 . W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li, \"Learning structured sparsity in deep neural networks,\" in Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5--10, 2016, Barcelona, Spain, 2016, pp. 2074--2082."},{"key":"e_1_3_2_1_37_1","first-page":"548","volume-title":"ISCA 2017","author":"Yu J.","year":"2017","unstructured":"J. Yu , A. Lukefahr , D. J. Palframan , G. S. Dasika , R. Das , and S. A. Mahlke , \" Scalpel: Customizing DNN pruning to the underlying hardware parallelism,\" in Proceedings of the 44th Annual International Symposium on Computer Architecture , ISCA 2017 , Toronto, ON, Canada, June 24--28 , 2017 . ACM, 2017, pp. 548 -- 560 . J. Yu, A. Lukefahr, D. J. Palframan, G. S. Dasika, R. Das, and S. A. Mahlke, \"Scalpel: Customizing DNN pruning to the underlying hardware parallelism,\" in Proceedings of the 44th Annual International Symposium on Computer Architecture, ISCA 2017, Toronto, ON, Canada, June 24--28, 2017. ACM, 2017, pp. 548--560."},{"key":"e_1_3_2_1_38_1","first-page":"1","volume-title":"MICRO 2016","author":"Zhang S.","year":"2016","unstructured":"S. Zhang , Z. Du , L. Zhang , H. Lan , S. Liu , L. Li , Q. Guo , T. Chen , and Y. Chen , \" Cambricon-x: An accelerator for sparse neural networks,\" in 49th Annual IEEE\/ACM International Symposium on Microarchitecture , MICRO 2016 , Taipei, Taiwan, October 15--19 , 2016 . IEEE Computer Society, 2016, pp. 20: 1 -- 20 :12. S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu, L. Li, Q. Guo, T. Chen, and Y. Chen, \"Cambricon-x: An accelerator for sparse neural networks,\" in 49th Annual IEEE\/ACM International Symposium on Microarchitecture, MICRO 2016, Taipei, Taiwan, October 15--19, 2016. IEEE Computer Society, 2016, pp. 20:1--20:12."},{"key":"e_1_3_2_1_39_1","first-page":"1","volume-title":"DAC 2020","author":"Zhao X.","year":"2020","unstructured":"X. Zhao , Y. Wang , C. Liu , C. Shi , K. Tu , and L. Zhang , \" BitPruner: Network pruning for bit-serial accelerators,\" in 57th ACM\/IEEE Design Automation Conference , DAC 2020 , San Francisco, CA, USA, July 20--24 , 2020 . IEEE, 2020, pp. 1 -- 6 . X. Zhao, Y. Wang, C. Liu, C. Shi, K. Tu, and L. Zhang, \"BitPruner: Network pruning for bit-serial accelerators,\" in 57th ACM\/IEEE Design Automation Conference, DAC 2020, San Francisco, CA, USA, July 20--24, 2020. IEEE, 2020, pp. 1--6."},{"key":"e_1_3_2_1_40_1","first-page":"954","volume-title":"ISCA 2020","author":"Zhao Y.","year":"2020","unstructured":"Y. Zhao , X. Chen , Y. Wang , C. Li , H. You , Y. Fu , Y. Xie , Z. Wang , and Y. Lin , \" Smartexchange: Trading higher-cost memory storage\/access for lower-cost computation,\" in 47th ACM\/IEEE Annual International Symposium on Computer Architecture , ISCA 2020 , Valencia, Spain, May 30 - June 3, 2020 . IEEE, 2020, pp. 954 -- 967 . Y. Zhao, X. Chen, Y. Wang, C. Li, H. You, Y. Fu, Y. Xie, Z. Wang, and Y. Lin, \"Smartexchange: Trading higher-cost memory storage\/access for lower-cost computation,\" in 47th ACM\/IEEE Annual International Symposium on Computer Architecture, ISCA 2020, Valencia, Spain, May 30 - June 3, 2020. IEEE, 2020, pp. 954--967."},{"key":"e_1_3_2_1_41_1","first-page":"15","volume-title":"MICRO 2018","author":"Zhou X.","year":"2018","unstructured":"X. Zhou , Z. Du , Q. Guo , S. Liu , C. Liu , C. Wang , X. Zhou , L. Li , T. Chen , and Y. Chen , \" Cambricon-s: Addressing irregularity in sparse neural networks through A cooperative software\/hardware approach,\" in 51st Annual IEEE\/ACM International Symposium on Microarchitecture , MICRO 2018 , Fukuoka, Japan, October 20--24 , 2018 . IEEE Computer Society, 2018, pp. 15 -- 28 . X. Zhou, Z. Du, Q. Guo, S. Liu, C. Liu, C. Wang, X. Zhou, L. Li, T. Chen, and Y. Chen, \"Cambricon-s: Addressing irregularity in sparse neural networks through A cooperative software\/hardware approach,\" in 51st Annual IEEE\/ACM International Symposium on Microarchitecture, MICRO 2018, Fukuoka, Japan, October 20--24, 2018. IEEE Computer Society, 2018, pp. 15--28."}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCAA IEEE CS technical committee on architectural acoustics"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527419","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527419","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527419","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:28Z","timestamp":1750188628000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527419"}},"subtitle":["enabling high data reuse for sparse DNN accelerators"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":41,"alternative-id":["10.1145\/3470496.3527419","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3527419","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}