{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T03:28:07Z","timestamp":1762918087353,"version":"3.41.0"},"reference-count":80,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2020,1,31]],"date-time":"2020-01-31T00:00:00Z","timestamp":1580428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2020,1,31]]},"abstract":"<jats:p>As the adoption of Neural Networks continues to proliferate different classes of applications and systems, edge devices have been left behind. Their strict energy and storage limitations make them unable to cope with the sizes of common network models. While many compression methods such as precision reduction and sparsity have been proposed to alleviate this, they don\u2019t go quite far enough. To push size reduction to its absolute limits, we combine binarization with sparsity in Pruned-Permuted-Packed XNOR Networks (3PXNet), which can be efficiently implemented on even the smallest of embedded microcontrollers. 3PXNets can reduce model sizes by up to 38X and reduce runtime by up to 3X compared with already compact conventional binarized implementations with less than 3% accuracy reduction. We have created the first software implementation of sparse-binarized Neural Networks, released as open source library targeting edge devices. 
Our library is complete with training methodology and model generating scripts, making it easy and fast to deploy.<\/jats:p>","DOI":"10.1145\/3371157","type":"journal-article","created":{"date-parts":[[2020,2,7]],"date-time":"2020-02-07T07:03:59Z","timestamp":1581059039000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["3PXNet"],"prefix":"10.1145","volume":"19","author":[{"given":"Wojciech","family":"Romaszkan","sequence":"first","affiliation":[{"name":"University of California Los Angeles, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1078-6743","authenticated-orcid":false,"given":"Tianmu","family":"Li","sequence":"additional","affiliation":[{"name":"University of California Los Angeles, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Puneet","family":"Gupta","sequence":"additional","affiliation":[{"name":"University of California Los Angeles, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,2,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2017.7966166"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2017.2682138"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3005348"},{"key":"e_1_2_1_4_1","volume-title":"Compact deep convolutional neural networks with coarse pruning. CoRR abs\/1610.09639","author":"Anwar Sajid","year":"2016","unstructured":"Sajid Anwar and Wonyong Sung . 2016. Compact deep convolutional neural networks with coarse pruning. CoRR abs\/1610.09639 ( 2016 ). http:\/\/arxiv.org\/abs\/1610.09639 Sajid Anwar and Wonyong Sung. 2016. Compact deep convolutional neural networks with coarse pruning. CoRR abs\/1610.09639 (2016). http:\/\/arxiv.org\/abs\/1610.09639"},{"key":"e_1_2_1_5_1","unstructured":"ARM. 2018. ARM Compute Library. 
Retrieved from https:\/\/github.com\/ARM-software\/ComputeLibrary.  ARM. 2018. ARM Compute Library. Retrieved from https:\/\/github.com\/ARM-software\/ComputeLibrary."},{"key":"e_1_2_1_6_1","unstructured":"ARM. 2018. NEON. Retrieved from https:\/\/github.com\/ARM-software\/CMSIS_5.  ARM. 2018. NEON. Retrieved from https:\/\/github.com\/ARM-software\/CMSIS_5."},{"key":"e_1_2_1_7_1","unstructured":"ARM. 2019. ARM CMSIS-NN. Retrieved from https:\/\/developer.arm.com\/architectures\/instruction-sets\/simd-isas\/neon.  ARM. 2019. ARM CMSIS-NN. Retrieved from https:\/\/developer.arm.com\/architectures\/instruction-sets\/simd-isas\/neon."},{"key":"e_1_2_1_8_1","unstructured":"ARM Limited. 2017. ARMv6-M Architecture Reference Manual. 1 138 pages. DOI:https:\/\/doi.org\/ARM DDI 0419D  ARM Limited. 2017. ARMv6-M Architecture Reference Manual. 1 138 pages. DOI:https:\/\/doi.org\/ARM DDI 0419D"},{"key":"e_1_2_1_9_1","unstructured":"ARM Limited. 2018. ARMv7-M Architecture Reference Manual. 1 138 pages. DOI:https:\/\/doi.org\/ARM DDI 0403E  ARM Limited. 2018. ARMv7-M Architecture Reference Manual. 1 138 pages. DOI:https:\/\/doi.org\/ARM DDI 0403E"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CoolChips.2018.8373076"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2018.2869150"},{"key":"e_1_2_1_12_1","volume-title":"High performance convolutional neural networks for document processing. (Oct","author":"Chellapilla Kumar","year":"2006","unstructured":"Kumar Chellapilla , Sidd Puri , and Patrice Simard . 2006. High performance convolutional neural networks for document processing. (Oct . 2006 ). Retrieved from https:\/\/hal.inria.fr\/inria-00112631. Kumar Chellapilla, Sidd Puri, and Patrice Simard. 2006. High performance convolutional neural networks for document processing. (Oct. 2006). 
Retrieved from https:\/\/hal.inria.fr\/inria-00112631."},{"key":"e_1_2_1_13_1","unstructured":"Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arxiv:cs.DC\/1512.01274  Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arxiv:cs.DC\/1512.01274"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_2_1_15_1","volume-title":"A survey of model compression and acceleration for deep neural networks. CoRR abs\/1710.0","author":"Cheng Yu","year":"2017","unstructured":"Yu Cheng , Duo Wang , Pan Zhou , and Tao Zhang . 2017. A survey of model compression and acceleration for deep neural networks. CoRR abs\/1710.0 ( 2017 ), 1--10. DOI:https:\/\/doi.org\/10.1109\/ICRA.2016.7487304 arxiv:1710.09282 10.1109\/ICRA.2016.7487304 Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. 2017. A survey of model compression and acceleration for deep neural networks. CoRR abs\/1710.0 (2017), 1--10. DOI:https:\/\/doi.org\/10.1109\/ICRA.2016.7487304 arxiv:1710.09282"},{"key":"e_1_2_1_16_1","volume-title":"Training binary multilayer neural networks for image classification using expectation backpropagation. CoRR abs\/1503.03562","author":"Cheng Zhiyong","year":"2015","unstructured":"Zhiyong Cheng , Daniel Soudry , Zexi Mao , and Zhen-zhong Lan. 2015. Training binary multilayer neural networks for image classification using expectation backpropagation. CoRR abs\/1503.03562 ( 2015 ). arxiv:1503.03562 http:\/\/arxiv.org\/abs\/1503.03562 Zhiyong Cheng, Daniel Soudry, Zexi Mao, and Zhen-zhong Lan. 2015. Training binary multilayer neural networks for image classification using expectation backpropagation. CoRR abs\/1503.03562 (2015). 
arxiv:1503.03562 http:\/\/arxiv.org\/abs\/1503.03562"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488873"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2857019"},{"key":"e_1_2_1_19_1","volume-title":"BinaryNet: Training deep neural networks with weights and activations constrained to +1 or \u22121. CoRR abs\/1602.02830","author":"Courbariaux Matthieu","year":"2016","unstructured":"Matthieu Courbariaux and Yoshua Bengio . 2016. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or \u22121. CoRR abs\/1602.02830 ( 2016 ). http:\/\/arxiv.org\/abs\/1602.02830 Matthieu Courbariaux and Yoshua Bengio. 2016. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or \u22121. CoRR abs\/1602.02830 (2016). http:\/\/arxiv.org\/abs\/1602.02830"},{"key":"e_1_2_1_20_1","first-page":"598","article-title":"Optimal brain damage","volume":"2","author":"Cun Yann Le","year":"1990","unstructured":"Yann Le Cun , John S Denker , and Sara a Solla . 1990 . Optimal brain damage . Advances in Neural Information Processing Systems 2 , 1 (1990), 598 -- 605 . DOI:https:\/\/doi.org\/10.1.1.32.7223 arxiv:arXiv:1011.1669v3 Yann Le Cun, John S Denker, and Sara a Solla. 1990. Optimal brain damage. Advances in Neural Information Processing Systems 2, 1 (1990), 598--605. DOI:https:\/\/doi.org\/10.1.1.32.7223 arxiv:arXiv:1011.1669v3","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2018.01.010"},{"key":"e_1_2_1_22_1","volume-title":"Leong","author":"Faraone Julian","year":"2017","unstructured":"Julian Faraone , Nicholas Fraser , Giulio Gambardella , Michaela Blott , and Philip H. W . Leong . 2017 . Compressing low precision deep neural networks using sparsity-induced regularization in ternary networks. 
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10635 LNCS ( 2017), 393--404. DOI:https:\/\/doi.org\/10.1007\/978-3-319-70096-0_41 arxiv:1709.06262 10.1007\/978-3-319-70096-0_41 Julian Faraone, Nicholas Fraser, Giulio Gambardella, Michaela Blott, and Philip H. W. Leong. 2017. Compressing low precision deep neural networks using sparsity-induced regularization in ternary networks. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10635 LNCS (2017), 393--404. DOI:https:\/\/doi.org\/10.1007\/978-3-319-70096-0_41 arxiv:1709.06262"},{"key":"#cr-split#-e_1_2_1_23_1.1","doi-asserted-by":"crossref","unstructured":"Nicholas J. Fraser Yaman Umuroglu Giulio Gambardella Michaela Blott Philip Leong Magnus Jahre and Kees Vissers. 2017. Scaling binarized neural networks on reconfigurable logic. In Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM'17). ACM New York NY 25--30. DOI:https:\/\/doi.org\/10.1145\/3029580.3029586 10.1145\/3029580.3029586","DOI":"10.1145\/3029580.3029586"},{"key":"#cr-split#-e_1_2_1_23_1.2","doi-asserted-by":"crossref","unstructured":"Nicholas J. Fraser Yaman Umuroglu Giulio Gambardella Michaela Blott Philip Leong Magnus Jahre and Kees Vissers. 2017. Scaling binarized neural networks on reconfigurable logic. In Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM'17). ACM New York NY 25--30. 
DOI:https:\/\/doi.org\/10.1145\/3029580.3029586","DOI":"10.1145\/3029580.3029586"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML 2017)","author":"Gupta Chirag","year":"2017","unstructured":"Chirag Gupta , Arun Sai Suggala , Ankit Goyal , Harsha Vardhan Simhadri , Bhargavi Paranjape , Ashish Kumar , Saurabh Goyal , Raghavendra Udupa , Manik Varma , and Prateek Jain . 2017 . ProtoNN: Compressed and accurate kNN for resource-scarce devices . In Proceedings of the 34th International Conference on Machine Learning (ICML 2017) 70 (2017), 1331--1340. http:\/\/proceedings.mlr.press\/v70\/gupta17a.html. Chirag Gupta, Arun Sai Suggala, Ankit Goyal, Harsha Vardhan Simhadri, Bhargavi Paranjape, Ashish Kumar, Saurabh Goyal, Raghavendra Udupa, Manik Varma, and Prateek Jain. 2017. ProtoNN: Compressed and accurate kNN for resource-scarce devices. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017) 70 (2017), 1331--1340. http:\/\/proceedings.mlr.press\/v70\/gupta17a.html."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 32nd International Conference on International Conference on Machine Learning -","volume":"37","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta , Ankur Agrawal , Kailash Gopalakrishnan , and Pritish Narayanan . 2015 . Deep learning with limited numerical precision . In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37 (ICML\u201915). JMLR.org, 1737--1746. http:\/\/dl.acm.org\/citation.cfm?id=3045118.3045303 Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37 (ICML\u201915). JMLR.org, 1737--1746. 
http:\/\/dl.acm.org\/citation.cfm?id=3045118.3045303"},{"key":"e_1_2_1_26_1","volume-title":"Dally","author":"Han Song","year":"2015","unstructured":"Song Han , Huizi Mao , and William J . Dally . 2015 . Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. (2015), 1--14. DOI:https:\/\/doi.org\/abs\/1510.00149\/1510.00149 arxiv:1510.00149 Song Han, Huizi Mao, and William J. Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. (2015), 1--14. DOI:https:\/\/doi.org\/abs\/1510.00149\/1510.00149 arxiv:1510.00149"},{"volume-title":"Advances in Neural Information Processing Systems 28","author":"Han Song","key":"e_1_2_1_27_1","unstructured":"Song Han , Jeff Pool , John Tran , and William Dally . 2015. Learning both weights and connections for efficient neural network . In Advances in Neural Information Processing Systems 28 , C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Eds.). Curran Associates, Inc. , 1135--1143. http:\/\/papers.nips.cc\/paper\/5784-learning-both-weights-and-connections-for-efficient-neural-network.pdf. Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems 28, C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Eds.). Curran Associates, Inc., 1135--1143. http:\/\/papers.nips.cc\/paper\/5784-learning-both-weights-and-connections-for-efficient-neural-network.pdf."},{"key":"e_1_2_1_28_1","volume-title":"Wolff","author":"Hassibi Babak","year":"1993","unstructured":"Babak Hassibi , David G. Stork , and Gregory J . Wolff . 1993 . Optimal brain surgeon and general network pruning. 293--299 pages. DOI:https:\/\/doi.org\/10.1109\/ICNN.1993.298572 10.1109\/ICNN.1993.298572 Babak Hassibi, David G. Stork, and Gregory J. Wolff. 1993. Optimal brain surgeon and general network pruning. 
293--299 pages. DOI:https:\/\/doi.org\/10.1109\/ICNN.1993.298572"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2019.00102"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11548-018-1797-4"},{"key":"e_1_2_1_32_1","volume-title":"MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR abs\/1704.04861","author":"Howard Andrew G.","year":"2017","unstructured":"Andrew G. Howard , Menglong Zhu , Bo Chen , Dmitry Kalenichenko , Weijun Wang , Tobias Weyand , Marco Andreetto , and Hartwig Adam . 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR abs\/1704.04861 ( 2017 ). arxiv:1704.04861 http:\/\/arxiv.org\/abs\/1704.04861 Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR abs\/1704.04861 (2017). arxiv:1704.04861 http:\/\/arxiv.org\/abs\/1704.04861"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2018.00034"},{"volume-title":"Advances in Neural Information Processing Systems 29","author":"Hubara Itay","key":"e_1_2_1_34_1","unstructured":"Itay Hubara , Matthieu Courbariaux , Daniel Soudry , Ran El-Yaniv , and Yoshua Bengio . 2016. Binarized neural networks . In Advances in Neural Information Processing Systems 29 , D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc. , 4107--4115. http:\/\/papers.nips.cc\/paper\/6573-binarized-neural-networks.pdf. Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 4107--4115. 
http:\/\/papers.nips.cc\/paper\/6573-binarized-neural-networks.pdf."},{"key":"e_1_2_1_35_1","volume-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360","author":"Iandola Forrest N.","year":"2016","unstructured":"Forrest N. Iandola , Matthew W. Moskewicz , Khalid Ashraf , Song Han , William J. Dally , and Kurt Keutzer . 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360 ( 2016 ). arxiv:1602.07360 http:\/\/arxiv.org\/abs\/1602.07360 Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360 (2016). arxiv:1602.07360 http:\/\/arxiv.org\/abs\/1602.07360"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISLPED.2017.8009163"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2016.2597140"},{"key":"e_1_2_1_39_1","unstructured":"Patrick Judd Alberto Delmas Sayeh Sharify and Andreas Moshovos. 2017. Cnvlutin2: Ineffectual-activation-and-weight-free deep neural network computing. (2017) 1--6. arxiv:1705.00125 http:\/\/arxiv.org\/abs\/1705.00125  Patrick Judd Alberto Delmas Sayeh Sharify and Andreas Moshovos. 2017. Cnvlutin2: Ineffectual-activation-and-weight-free deep neural network computing. (2017) 1--6. arxiv:1705.00125 http:\/\/arxiv.org\/abs\/1705.00125"},{"key":"e_1_2_1_40_1","volume-title":"Bitwise neural networks. 37","author":"Kim Minje","year":"2016","unstructured":"Minje Kim and Paris Smaragdis . 2016. Bitwise neural networks. 37 ( 2016 ). arxiv:1601.06071 http:\/\/arxiv.org\/abs\/1601.06071 Minje Kim and Paris Smaragdis. 2016. Bitwise neural networks. 37 (2016). 
arxiv:1601.06071 http:\/\/arxiv.org\/abs\/1601.06071"},{"key":"e_1_2_1_41_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba . 2014 . Adam : A Method for Stochastic Optimization. http:\/\/arxiv.org\/abs\/1412.6980 cite arxiv:1412.6980 Comment : Published as a conference paper at the 3rd International Conference for Learning Representations, San Diego , 2015. Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. http:\/\/arxiv.org\/abs\/1412.6980 cite arxiv:1412.6980 Comment: Published as a conference paper at the 3rd International Conference for Learning Representations, San Diego, 2015."},{"key":"e_1_2_1_42_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E . Hinton . 2012 . ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems ( 2012), 1--9. DOI:https:\/\/doi.org\/10.1016\/j.protcy.2014.09.007 arxiv:1102.0183 10.1016\/j.protcy.2014.09.007 Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (2012), 1--9. DOI:https:\/\/doi.org\/10.1016\/j.protcy.2014.09.007 arxiv:1102.0183"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML 2017)","author":"Kumar Ashish","year":"2017","unstructured":"Ashish Kumar , Saurabh Goyal , and Manik Varma . 2017 . Resource-efficient machine learning in 2KB RAM for the Internet of Things . In Proceedings of the 34th International Conference on Machine Learning (ICML 2017) 70 (2017), 1935--1944. https:\/\/www.microsoft.com\/en-us\/research\/publication\/resource-efficient-machine-learning-2-kb-ram-internet-things\/. Ashish Kumar, Saurabh Goyal, and Manik Varma. 2017. 
Resource-efficient machine learning in 2KB RAM for the Internet of Things. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017) 70 (2017), 1935--1944. https:\/\/www.microsoft.com\/en-us\/research\/publication\/resource-efficient-machine-learning-2-kb-ram-internet-things\/."},{"key":"e_1_2_1_44_1","unstructured":"Abhisek Kundu Kunal Banerjee Naveen Mellempudi Dheevatsa Mudigere Dipankar Das Bharat Kaul and Pradeep Dubey. 2017. Ternary residual networks. (2017) 1--16. arxiv:1707.04679 http:\/\/arxiv.org\/abs\/1707.04679  Abhisek Kundu Kunal Banerjee Naveen Mellempudi Dheevatsa Mudigere Dipankar Das Bharat Kaul and Pradeep Dubey. 2017. Ternary residual networks. (2017) 1--16. arxiv:1707.04679 http:\/\/arxiv.org\/abs\/1707.04679"},{"key":"e_1_2_1_45_1","unstructured":"Liangzhen Lai Naveen Suda and Vikas Chandra. 2018. CMSIS-NN: Efficient neural network kernels for ARM Cortex-M CPUs. (2018) 1--10. arxiv:1801.06601 http:\/\/arxiv.org\/abs\/1801.06601  Liangzhen Lai Naveen Suda and Vikas Chandra. 2018. CMSIS-NN: Efficient neural network kernels for ARM Cortex-M CPUs. (2018) 1--10. arxiv:1801.06601 http:\/\/arxiv.org\/abs\/1801.06601"},{"key":"#cr-split#-e_1_2_1_46_1.1","doi-asserted-by":"crossref","unstructured":"Vadim Lebedev and Victor Lempitsky. 2015. Fast ConvNets using group-wise brain damage. (2015). DOI:https:\/\/doi.org\/10.1109\/CVPR.2016.280 arxiv:1506.02515 10.1109\/CVPR.2016.280","DOI":"10.1109\/CVPR.2016.280"},{"key":"#cr-split#-e_1_2_1_46_1.2","doi-asserted-by":"crossref","unstructured":"Vadim Lebedev and Victor Lempitsky. 2015. Fast ConvNets using group-wise brain damage. (2015). DOI:https:\/\/doi.org\/10.1109\/CVPR.2016.280 arxiv:1506.02515","DOI":"10.1109\/CVPR.2016.280"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_48_1","volume-title":"Ternary weight networks. Nips","author":"Li Fengfu","year":"2016","unstructured":"Fengfu Li , Bo Zhang , and Bin Liu . 2016. 
Ternary weight networks. Nips ( 2016 ). arxiv:1605.04711 http:\/\/arxiv.org\/abs\/1605.04711 Fengfu Li, Bo Zhang, and Bin Liu. 2016. Ternary weight networks. Nips (2016). arxiv:1605.04711 http:\/\/arxiv.org\/abs\/1605.04711"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021786"},{"key":"e_1_2_1_50_1","unstructured":"Yixing Li and Fengbo Ren. 2018. Build a compact binary neural network through bit-level sensitivity and data pruning. (2018). arxiv:1802.00904 http:\/\/arxiv.org\/abs\/1802.00904  Yixing Li and Fengbo Ren. 2018. Build a compact binary neural network through bit-level sensitivity and data pruning. (2018). arxiv:1802.00904 http:\/\/arxiv.org\/abs\/1802.00904"},{"key":"e_1_2_1_51_1","volume-title":"A GPU-outperforming FPGA accelerator architecture for binary convolutional neural networks. CoRR","author":"Li Yixing","year":"2017","unstructured":"Yixing Li , Kai Xu , and Hao Yu. 2017. A GPU-outperforming FPGA accelerator architecture for binary convolutional neural networks. CoRR ( 2017 ). arxiv:1702.06392 Yixing Li, Kai Xu, and Hao Yu. 2017. A GPU-outperforming FPGA accelerator architecture for binary convolutional neural networks. CoRR (2017). arxiv:1702.06392"},{"key":"e_1_2_1_52_1","unstructured":"Ling Liang Lei Deng Yueling Zeng Xing Hu Yu Ji Xin Ma Guoqi Li and Yuan Xie. 2018. Crossbar-aware neural network pruning. (2018) 1--13. arxiv:1807.10816 http:\/\/arxiv.org\/abs\/1807.10816  Ling Liang Lei Deng Yueling Zeng Xing Hu Yu Ji Xin Ma Guoqi Li and Yuan Xie. 2018. Crossbar-aware neural network pruning. (2018) 1--13. arxiv:1807.10816 http:\/\/arxiv.org\/abs\/1807.10816"},{"key":"e_1_2_1_53_1","first-page":"1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2017-July","author":"Lin Jeng Hau","year":"2017","unstructured":"Jeng Hau Lin , Tianwei Xing , Ritchie Zhao , Zhiru Zhang , Mani Srivastava , Zhuowen Tu , and Rajesh K. Gupta . 2017. 
Binarized convolutional neural networks with separable filters for efficient hardware acceleration . In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2017-July , 1 ( 2017 ), 344--352. DOI:https:\/\/doi.org\/10.1109\/CVPRW.2017.48 arxiv:1707.04693 10.1109\/CVPRW.2017.48 Jeng Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, and Rajesh K. Gupta. 2017. Binarized convolutional neural networks with separable filters for efficient hardware acceleration. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2017-July, 1 (2017), 344--352. DOI:https:\/\/doi.org\/10.1109\/CVPRW.2017.48 arxiv:1707.04693"},{"key":"e_1_2_1_54_1","volume-title":"Towards accurate binary convolutional neural network. 3","author":"Lin Xiaofan","year":"2017","unstructured":"Xiaofan Lin , Cong Zhao , and Wei Pan . 2017. Towards accurate binary convolutional neural network. 3 ( 2017 ), 1--14. arxiv:1711.11294 http:\/\/arxiv.org\/abs\/1711.11294 Xiaofan Lin, Cong Zhao, and Wei Pan. 2017. Towards accurate binary convolutional neural network. 3 (2017), 1--14. arxiv:1711.11294 http:\/\/arxiv.org\/abs\/1711.11294"},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Liu Baoyuan","year":"2015","unstructured":"Baoyuan Liu , Min Wang , Hassan Foroosh , Marshall Tappen , and Marianna Pensky . 2015 . Sparse convolutional neural networks . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Baoyuan Liu, Min Wang, Hassan Foroosh, Marshall Tappen, and Marianna Pensky. 2015. Sparse convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_1_56_1","unstructured":"Bradley McDanel Surat Teerapittayanon and H. T. Kung. 2017. Embedded binarized neural networks. (2017) 1--6. 
arxiv:1709.02260 http:\/\/arxiv.org\/abs\/1709.02260  Bradley McDanel Surat Teerapittayanon and H. T. Kung. 2017. Embedded binarized neural networks. (2017) 1--6. arxiv:1709.02260 http:\/\/arxiv.org\/abs\/1709.02260"},{"key":"e_1_2_1_57_1","unstructured":"Naveen Mellempudi Abhisek Kundu Dheevatsa Mudigere Dipankar Das Bharat Kaul and Pradeep Dubey. 2017. Ternary neural networks with fine-grained quantization. (2017). arxiv:1705.01462 http:\/\/arxiv.org\/abs\/1705.01462  Naveen Mellempudi Abhisek Kundu Dheevatsa Mudigere Dipankar Das Bharat Kaul and Pradeep Dubey. 2017. Ternary neural networks with fine-grained quantization. (2017). arxiv:1705.01462 http:\/\/arxiv.org\/abs\/1705.01462"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/bxx046"},{"key":"e_1_2_1_59_1","volume-title":"Dally","author":"Parashar Angshuman","year":"2017","unstructured":"Angshuman Parashar , Minsoo Rhu , Anurag Mukkara , Antonio Puglielli , Rangharajan Venkatesan , Brucek Khailany , Joel Emer , Stephen W. Keckler , and William J . Dally . 2017 . SCNN : An accelerator for compressed-sparse convolutional neural networks. (2017). DOI:https:\/\/doi.org\/10.1145\/3079856.3080254 arxiv:1708.04485 10.1145\/3079856.3080254 Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. 2017. SCNN: An accelerator for compressed-sparse convolutional neural networks. (2017). DOI:https:\/\/doi.org\/10.1145\/3079856.3080254 arxiv:1708.04485"},{"key":"e_1_2_1_60_1","unstructured":"Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS-W.  Adam Paszke Sam Gross Soumith Chintala Gregory Chanan Edward Yang Zachary DeVito Zeming Lin Alban Desmaison Luca Antiga and Adam Lerer. 2017. Automatic differentiation in PyTorch. 
In NIPS-W."},{"key":"e_1_2_1_61_1","volume-title":"Espresso: Efficient forward propagation for BCNNs.","author":"Pedersoli Fabrizio","year":"2018","unstructured":"Fabrizio Pedersoli , George Tzanetakis , and Andrea Tagliasacchi . 2018 . Espresso: Efficient forward propagation for BCNNs. (2018), 1--10. arxiv:arXiv:1705.07175v2 Fabrizio Pedersoli, George Tzanetakis, and Andrea Tagliasacchi. 2018. Espresso: Efficient forward propagation for BCNNs. (2018), 1--10. arxiv:arXiv:1705.07175v2"},{"key":"e_1_2_1_62_1","volume-title":"XNOR-Net: ImageNet classification using binary convolutional neural networks. arXiv preprint","author":"Rastegari Mohammad","year":"2016","unstructured":"Mohammad Rastegari , Vicente Ordonez , Joseph Redmon , and Ali Farhadi . 2016. XNOR-Net: ImageNet classification using binary convolutional neural networks. arXiv preprint ( 2016 ), 1--17. DOI:https:\/\/doi.org\/10.1007\/978-3-319-46493-0 arxiv:1603.05279 10.1007\/978-3-319-46493-0 Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. XNOR-Net: ImageNet classification using binary convolutional neural networks. arXiv preprint (2016), 1--17. DOI:https:\/\/doi.org\/10.1007\/978-3-319-46493-0 arxiv:1603.05279"},{"key":"e_1_2_1_63_1","volume-title":"BRein memory: A single-chip binary\/ternary reconfigurable in-memory deep neural network. 53, 4","author":"Sato Shimpei","year":"2018","unstructured":"Shimpei Sato , Hiroki Nakahara , and Shinya Takamaeda-yamazaki. 2018. BRein memory: A single-chip binary\/ternary reconfigurable in-memory deep neural network. 53, 4 ( 2018 ), 983--994. Shimpei Sato, Hiroki Nakahara, and Shinya Takamaeda-yamazaki. 2018. BRein memory: A single-chip binary\/ternary reconfigurable in-memory deep neural network. 53, 4 (2018), 983--994."},{"key":"e_1_2_1_64_1","volume-title":"Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ), 1--14. DOI:https:\/\/doi.org\/10.1016\/j.infsof.2008.09.005 arxiv:1409.1556 10.1016\/j.infsof.2008.09.005 Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014), 1--14. DOI:https:\/\/doi.org\/10.1016\/j.infsof.2008.09.005 arxiv:1409.1556"},{"key":"e_1_2_1_65_1","unstructured":"Ranko Sredojevic Shaoyi Cheng Lazar Supic Rawan Naous and Vladimir Stojanovic. 2017. Structured deep neural network pruning via matrix pivoting. (2017) 1--16. arxiv:1712.01084 http:\/\/arxiv.org\/abs\/1712.01084  Ranko Sredojevic Shaoyi Cheng Lazar Supic Rawan Naous and Vladimir Stojanovic. 2017. Structured deep neural network pruning via matrix pivoting. (2017) 1--16. arxiv:1712.01084 http:\/\/arxiv.org\/abs\/1712.01084"},{"key":"e_1_2_1_66_1","unstructured":"STMicroelectronics. 2018. STM32 Nucleo-144 boards. https:\/\/www.st.com\/resource\/en\/data_brief\/nucleo-f746zg.pdf.  STMicroelectronics. 2018. STM32 Nucleo-144 boards. https:\/\/www.st.com\/resource\/en\/data_brief\/nucleo-f746zg.pdf."},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021744"},{"key":"e_1_2_1_68_1","volume-title":"Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop, NIPS","author":"Vanhoucke Vincent","year":"2011","unstructured":"Vincent Vanhoucke , Andrew Senior , and Mark Z. Mao . 2011. Improving the speed of neural networks on CPUs . In Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011 . Vincent Vanhoucke, Andrew Senior, and Mark Z. Mao. 2011. Improving the speed of neural networks on CPUs. 
In Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011."},{"key":"e_1_2_1_69_1","unstructured":"Huan Wang, Qiming Zhang, Yuehai Wang, and Roland Hu. 2018. Structured deep neural network pruning by varying regularization parameters. (2018). arxiv:1804.09461 http:\/\/arxiv.org\/abs\/1804.09461"},{"key":"e_1_2_1_70_1","volume-title":"Speech commands: A dataset for limited-vocabulary speech recognition. CoRR abs\/1804.03209","author":"Warden Pete","year":"2018","unstructured":"Pete Warden. 2018. Speech commands: A dataset for limited-vocabulary speech recognition. CoRR abs\/1804.03209 (2018). arxiv:1804.03209 http:\/\/arxiv.org\/abs\/1804.03209"},{"key":"e_1_2_1_71_1","doi-asserted-by":"crossref","unstructured":"Haojin Yang, Martin Fritzsche, Christian Bartz, and Christoph Meinel. 2017. BMXNet: An open-source binary neural network implementation based on MXNet. (2017). 
arxiv:arXiv:1705.09864","DOI":"10.1145\/3123266.3129393"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3218603.3218615"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISMVL.2018.00038"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080215"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021741"},{"key":"e_1_2_1_77_1","volume-title":"DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. 1, 1","author":"Zhou Shuchang","year":"2016","unstructured":"Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. 2016. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. 1, 1 (2016), 1--13. arxiv:1606.06160 http:\/\/arxiv.org\/abs\/1606.06160"},{"key":"e_1_2_1_78_1","volume-title":"Trained ternary quantization","author":"Zhu Chenzhuo","year":"2016","unstructured":"Chenzhuo Zhu, Song Han, Huizi Mao, and William J. Dally. 2016. Trained ternary quantization. (2016), 1--10. 
arxiv:1612.01064 http:\/\/arxiv.org\/abs\/1612.01064"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3371157","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3371157","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:05:44Z","timestamp":1750273544000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3371157"}},"subtitle":["Pruned-Permuted-Packed XNOR Networks for Edge Machine Learning"],"short-title":[],"issued":{"date-parts":[[2020,1,31]]},"references-count":80,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1,31]]}},"alternative-id":["10.1145\/3371157"],"URL":"https:\/\/doi.org\/10.1145\/3371157","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2020,1,31]]},"assertion":[{"value":"2019-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-02-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}