{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T09:53:11Z","timestamp":1761126791357,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T00:00:00Z","timestamp":1628467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,9]]},"DOI":"10.1145\/3472456.3472513","type":"proceedings-article","created":{"date-parts":[[2021,10,5]],"date-time":"2021-10-05T18:39:57Z","timestamp":1633459197000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["BitX: Empower Versatile Inference with Hardware Runtime Pruning"],"prefix":"10.1145","author":[{"given":"Hongyan","family":"Li","sequence":"first","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hang","family":"Lu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiawen","family":"Huang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenxu","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mingzhe","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Liang","family":"Chang","sequence":"additional","affiliation":[{"name":"University of Electronic Science and Technology of China, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaowei","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,5]]},"reference":[{"unstructured":"[\n  1\n  ]  pytorch 1.6. https:\/\/pytorch.org\/  [1] pytorch 1.6. https:\/\/pytorch.org\/","key":"e_1_3_2_1_1_1"},{"key":"e_1_3_2_1_2_1","volume-title":"Proceedings of the International Symposium on Microarchitecture (MICRO).","author":"Albericio Jorge","year":"2016","unstructured":"Jorge Albericio , Patrick Judd , Alberto Delmas , Sayeh Sharify , and Andreas Moshovos . 2016 . Bit-pragmatic Deep Neural Network Computing . In Proceedings of the International Symposium on Microarchitecture (MICRO). Jorge Albericio, Patrick Judd, Alberto Delmas, Sayeh Sharify, and Andreas Moshovos. 2016. Bit-pragmatic Deep Neural Network Computing. In Proceedings of the International Symposium on Microarchitecture (MICRO)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_3_1","DOI":"10.1145\/3007787.3001138"},{"unstructured":"Sajid Anwar Kyuyeon Hwang and Wonyong Sung. 2015. 
Structured Pruning of Deep Convolutional Neural Networks. ACM Journal on Emerging Technologies in Computing Systems (JETC).  Sajid Anwar Kyuyeon Hwang and Wonyong Sung. 2015. Structured Pruning of Deep Convolutional Neural Networks. ACM Journal on Emerging Technologies in Computing Systems (JETC).","key":"e_1_3_2_1_4_1"},{"unstructured":"Tom\u00a0B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel\u00a0M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. arXiv:2005.14165.  Tom\u00a0B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel\u00a0M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. arXiv:2005.14165.","key":"e_1_3_2_1_5_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_6_1","DOI":"10.1109\/ICCV.2017.89"},{"key":"e_1_3_2_1_7_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805.","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805."},{"key":"e_1_3_2_1_8_1","series-title":"SIAM Journal on Computing (SICOMP)","volume-title":"Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition","author":"Drineas Petros","unstructured":"Petros Drineas , Ravi Kannan , and Michael\u00a0 W Mahoney . 2006. Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition . SIAM Journal on Computing (SICOMP) . Petros Drineas, Ravi Kannan, and Michael\u00a0W Mahoney. 2006. Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition. SIAM Journal on Computing (SICOMP)."},{"key":"e_1_3_2_1_9_1","volume-title":"Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA).","author":"Han Song","year":"2016","unstructured":"Song Han , Junlong Kang , Huizi Mao , Yiming Hu , Xin Li , Yubin Li , Dongliang Xie , Hong Luo , Song Yao , Yu Wang , Huazhong Yang , and William\u00a0 J. Dally . 2016 . ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA . In Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA). Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, Huazhong Yang, and William\u00a0J. Dally. 2016. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA. In Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_10_1","DOI":"10.1109\/CVPR.2016.90"},{"unstructured":"Hengyuan Hu Rui Peng Yu-Wing Tai and Chi-Keung Tang. 2016. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv:1607.03250.  Hengyuan Hu Rui Peng Yu-Wing Tai and Chi-Keung Tang. 2016. 
Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv:1607.03250.","key":"e_1_3_2_1_11_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_12_1","DOI":"10.1109\/CVPR.2017.243"},{"unstructured":"[\n  13\n  ]  IEEE 754.https:\/\/standards.ieee.org\/standard\/754-2019.html  [13] IEEE 754.https:\/\/standards.ieee.org\/standard\/754-2019.html","key":"e_1_3_2_1_13_1"},{"unstructured":"[\n  14\n  ]  CoCo Dataset.https:\/\/cocodataset.org\/  [14] CoCo Dataset.https:\/\/cocodataset.org\/","key":"e_1_3_2_1_14_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_15_1","DOI":"10.1109\/CVPR.2009.5206848"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_16_1","DOI":"10.1109\/LCA.2016.2597140"},{"doi-asserted-by":"crossref","unstructured":"Matthias Jung Christian Weis and Norbert Wehn. 2015. DRAMSys: A Flexible DRAM Subsystem Design Space Exploration Framework. IPSJ Transactions on System LSI Design Methodology (T-SLDM).  Matthias Jung Christian Weis and Norbert Wehn. 2015. DRAMSys: A Flexible DRAM Subsystem Design Space Exploration Framework. IPSJ Transactions on System LSI Design Methodology (T-SLDM).","key":"e_1_3_2_1_17_1","DOI":"10.2197\/ipsjtsldm.8.63"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Krizhevsky Alex","year":"2009","unstructured":"Alex Krizhevsky and Geoff Hinton . 2009 . Learning multiple layers of features from tiny images . In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR). Alex Krizhevsky and Geoff Hinton. 2009. Learning multiple layers of features from tiny images. 
In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of the Architectural Support for Programming Languages and Operating Systems (ASPLOS).","author":"Lascorz Alberto Delmas","year":"2019","unstructured":"Alberto Delmas Lascorz , Patrick Judd , Dylan Malone Stuart , Zissis Poulos , and Mostafa Mahmoud . 2019 . Bit-Tactical: A Software\/Hardware Approach to Exploiting Value and Bit Sparsity in Neural Networks . In Proceedings of the Architectural Support for Programming Languages and Operating Systems (ASPLOS). Alberto Delmas Lascorz, Patrick Judd, Dylan Malone Stuart, Zissis Poulos, and Mostafa Mahmoud. 2019. Bit-Tactical: A Software\/Hardware Approach to Exploiting Value and Bit Sparsity in Neural Networks. In Proceedings of the Architectural Support for Programming Languages and Operating Systems (ASPLOS)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_20_1","DOI":"10.1109\/ISSCC.2018.8310262"},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR).","author":"Li Hao","year":"2017","unstructured":"Hao Li , Asim Kadav , Igor Durdanovic , Hanan Samet , and Hans\u00a0Peter Graf . 2017 . Pruning filters for efficient convnets . In Proceedings of the International Conference on Learning Representations (ICLR). Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans\u00a0Peter Graf. 2017. Pruning filters for efficient convnets. In Proceedings of the International Conference on Learning Representations (ICLR)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_22_1","DOI":"10.1109\/ICCV.2017.298"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_23_1","DOI":"10.1145\/3240765.3240855"},{"volume-title":"Architecting Effectual Computation for Machine Learning Accelerators","author":"Lu Hang","unstructured":"Hang Lu , Mingzhe Zhang , Yinhe Han , Qi Wang , Huawei Li , and Xiaowei Li. 2020. 
Architecting Effectual Computation for Machine Learning Accelerators . IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) . Hang Lu, Mingzhe Zhang, Yinhe Han, Qi Wang, Huawei Li, and Xiaowei Li. 2020. Architecting Effectual Computation for Machine Learning Accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD).","key":"e_1_3_2_1_24_1"},{"unstructured":"Jian-Hao Luo and Jianxin Wu. 2017. An entropy-based pruning method for CNN compression. arXiv:1706.05791.  Jian-Hao Luo and Jianxin Wu. 2017. An entropy-based pruning method for CNN compression. arXiv:1706.05791.","key":"e_1_3_2_1_25_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_26_1","DOI":"10.1109\/ICCV.2017.541"},{"key":"e_1_3_2_1_27_1","volume":"201","author":"Parashar Angshuman","unstructured":"Angshuman Parashar , Minsoo Rhu , Anurag Mukkara , Antonio Puglielli , Rangharajan Venkatesan , Brucek Khailany , Joel Emer , Stephen W. Keckler , and William J. Dally. 201 7. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proceedings of the International Symposium on Computer Architecture (ISCA). Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. 2017. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proceedings of the International Symposium on Computer Architecture (ISCA).","journal-title":"William J. Dally."},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Redmon Joseph","year":"2018","unstructured":"Joseph Redmon and Ali Farhadi . 2018 . YOLOv3: An Incremental Improvement . In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR). Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An Incremental Improvement. 
In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)."},{"unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.  Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.","key":"e_1_3_2_1_29_1"},{"key":"e_1_3_2_1_30_1","volume":"201","author":"Han Song","unstructured":"Song Han , Xingyu Liu , Huizi Mao , Jing Pu , and William J. Dally. 201 6. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the International Symposium on Computer Architecture (ISCA). Song Han, Xingyu Liu, Huizi Mao, Jing Pu, and William J. Dally. 2016. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the International Symposium on Computer Architecture (ISCA).","journal-title":"William J. Dally."},{"doi-asserted-by":"crossref","unstructured":"Du Tran Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri. 2014. Learning Spatiotemporal Features with 3D Convolutional Networks. arXiv:1412.0767.  Du Tran Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri. 2014. Learning Spatiotemporal Features with 3D Convolutional Networks. arXiv:1412.0767.","key":"e_1_3_2_1_31_1","DOI":"10.1109\/ICCV.2015.510"},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Wang Xiaolong","year":"2014","unstructured":"Xiaolong Wang , Ross Girshick , Abhinav Gupta , and Kaiming He . 2014 . Non-local Neural Networks . In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR). Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. 2014. Non-local Neural Networks. 
In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_1_33_1","volume-title":"Proceedings of the Conference and Workshop on Neural Information Processing Systems (NeurIPS).","author":"Wen Wei","year":"2016","unstructured":"Wei Wen , Chunpeng Wu , Yandan Wang , Yiran Chen , and Hai Li . 2016 . Learning Structured Sparsity in Deep Neural Networks . In Proceedings of the Conference and Workshop on Neural Information Processing Systems (NeurIPS). Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. 2016. Learning Structured Sparsity in Deep Neural Networks. In Proceedings of the Conference and Workshop on Neural Information Processing Systems (NeurIPS)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_34_1","DOI":"10.1109\/MICRO.2018.00011"}],"event":{"acronym":"ICPP 2021","name":"ICPP 2021: 50th International Conference on Parallel Processing","location":"Lemont IL USA"},"container-title":["50th International Conference on Parallel Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472513","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472456.3472513","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:12Z","timestamp":1750193292000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472456.3472513"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,9]]},"references-count":34,"alternative-id":["10.1145\/3472456.3472513","10.1145\/3472456"],"URL":"https:\/\/doi.org\/10.1145\/3472456.3472513","relation":{},"subject":[],"published":{"date-parts":[[2021,8,9]]},"assertion":[{"value":"2021-10-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}