{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,20]],"date-time":"2025-11-20T18:53:10Z","timestamp":1763664790555,"version":"3.41.0"},"reference-count":202,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2022,9,30]],"date-time":"2022-09-30T00:00:00Z","timestamp":1664496000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Electronic Components and Systems for European Leadership Joint Undertaking ANDANTE","award":["876925"],"award-info":[{"award-number":["876925"]}]},{"name":"Swiss National Science Foundation (SNSF) BRIDGE","award":["40B2-0_181010"],"award-info":[{"award-number":["40B2-0_181010"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2022,9,30]]},"abstract":"<jats:p>Implementing embedded neural network processing at the edge requires efficient hardware acceleration that combines high computational throughput with low power consumption. Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly being adapted to support the improved functionalities. Hardware designers can refer to a myriad of accelerator implementations in the literature to evaluate and compare hardware design choices. However, the sheer number of publications and their diverse optimization directions hinder an effective assessment. Existing surveys provide an overview of these works but are often limited to system-level and benchmark-specific performance metrics, making it difficult to quantitatively compare the individual effects of each utilized optimization technique. 
This complicates the evaluation of optimizations for new accelerator designs, slowing down research progress.<\/jats:p>\n          <jats:p>\n            In contrast to previous surveys, this work provides a\n            <jats:italic>quantitative<\/jats:italic>\n            overview of neural network accelerator optimization approaches that have been used in recent works and reports their\n            <jats:italic>individual<\/jats:italic>\n            effects on edge processing performance. The list of optimizations and their quantitative effects are presented as a construction kit, allowing designers to assess the design choices for each building block individually. Reported optimizations range from up to 10,000\u00d7 memory savings to 33\u00d7 energy reductions, providing chip designers with an overview of design choices for implementing efficient low power neural network accelerators.\n          <\/jats:p>","DOI":"10.1145\/3520127","type":"journal-article","created":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T13:42:43Z","timestamp":1646660563000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["A Construction Kit for Efficient Low Power Neural Network Accelerator Designs"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8846-4413","authenticated-orcid":false,"given":"Petar","family":"Jokic","sequence":"first","affiliation":[{"name":"CSEM, Switzerland and ETH Zurich, Zurich, ZH, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Erfan","family":"Azarkhish","sequence":"additional","affiliation":[{"name":"CSEM, Zurich, ZH, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrea","family":"Bonetti","sequence":"additional","affiliation":[{"name":"CSEM, Zurich, ZH, 
Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marc","family":"Pons","sequence":"additional","affiliation":[{"name":"CSEM, Zurich, ZH, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephane","family":"Emery","sequence":"additional","affiliation":[{"name":"CSEM, Zurich, ZH, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luca","family":"Benini","sequence":"additional","affiliation":[{"name":"ETH Zurich, Zurich, Switzerland and University of Bologna, Bologna, BO, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,10,8]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2876865"},{"key":"e_1_3_1_3_2","article-title":"Hello Edge: Keyword spotting on microcontrollers","author":"Zhang Y.","year":"2017","unstructured":"Y. Zhang, N. Suda, L. Lai, and V. Chandra. 2017. Hello Edge: Keyword spotting on microcontrollers. CoRR, vol. abs\/1711.07128, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_4_2","volume-title":"International Conference on Neural Information Processing Systems - Volume 1","author":"Krizhevsky A.","year":"2012","unstructured":"A. Krizhevsky, I. Sutskever, and G. E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In International Conference on Neural Information Processing Systems - Volume 1, USA, 2012."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_6_2","unstructured":"G. Huang Z. Liu and K. Q. Weinberger. 2016. Densely connected convolutional networks. CoRR vol. 
abs\/1608.06993 2016."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/MAHC.2010.28"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2013.01.010"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2016.2579198"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2019.2918951"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2015.2476786"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/2637166.2637230"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2016.7418003"},{"key":"e_1_3_1_14_2","unstructured":"Google Google Clips specifications 2017. [Online]. Available: https:\/\/support.google.com\/googleclips\/answer\/7545447?hl=en. [Accessed 21 05 2019]."},{"key":"e_1_3_1_15_2","unstructured":"Xiaomi Xiaomo AI door bell overview 2019. [Online]. Available: https:\/\/www.xiaomitoday.com\/2019\/10\/29\/xiaomi-xiaomo-mijia-ai-face-identifcation-1080p-door-bell\/. [Accessed 09 02 2021]."},{"key":"e_1_3_1_16_2","unstructured":"Orcam Orcam MyEye 2 specifications 2020. [Online]. Available: https:\/\/www.orcam.com\/en\/myeye2\/specification [Accessed 09 02 2021]."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2911899"},{"key":"e_1_3_1_18_2","doi-asserted-by":"crossref","unstructured":"S. Oh et al. 2019. IoT2 \u2014 The Internet of Tiny Things: Realizing mm-scale sensors through 3D die stacking. In Design Automation Test in Europe Conference Exhibition 2019.","DOI":"10.23919\/DATE.2019.8715201"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/SIITME50350.2020.9292253"},{"key":"e_1_3_1_20_2","volume-title":"Symposium on VLSI Circuits","author":"Giraldo J. S. P.","year":"2019","unstructured":"J. S. P. Giraldo, S. Lauwereins, K. Badami, H. V. Hamme, and M. Verhelst. 2019. 18\u03bcW SoC for near-microphone keyword spotting and speaker verification. 
In Symposium on VLSI Circuits, 2019."},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2021.3058217"},{"key":"e_1_3_1_22_2","unstructured":"Z. Jia B. Tillman M. Maggioni and D. P. Scarpazza. 2019. Dissecting the graphcore IPU architecture via microbenchmarking. CoRR vol. abs\/1912.03413 2019."},{"key":"e_1_3_1_23_2","unstructured":"K. Guo et al. 2020. Neural network accelerator comparison. 2020. [Online]. Available: http:\/\/nicsefc.ee.tsinghua.edu.cn\/projects\/neural-network-accelerator\/. [Accessed 22 03 2021]."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2019.8916327"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC43674.2020.9286149"},{"key":"e_1_3_1_26_2","unstructured":"C. D. Schuman et al. 2017. A survey of neuromorphic computing and neural networks in hardware. CoRR vol. abs\/1705.06963 2017."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-01766-7_2"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2017.2761740"},{"key":"e_1_3_1_29_2","article-title":"Hardware for machine learning: Challenges and opportunities","author":"Sze V.","year":"2017","unstructured":"V. Sze, Y. Chen, J. Emer, A. Suleiman, and Z. Zhang. 2017. Hardware for machine learning: Challenges and opportunities. In IEEE Custom Integrated Circuits Conference, 2017.","journal-title":"IEEE Custom Integrated Circuits Conference"},{"key":"e_1_3_1_30_2","article-title":"Efficient hardware implementations of deep neural networks: A survey","author":"Bodiwala S.","year":"2020","unstructured":"S. Bodiwala and N. Nanavati. 2020. Efficient hardware implementations of deep neural networks: A survey. 
In International Conference on Inventive Systems and Control, 2020.","journal-title":"International Conference on Inventive Systems and Control"},{"key":"e_1_3_1_31_2","article-title":"A survey of FPGA-based accelerators for convolutional neural networks","author":"Mittal S.","year":"2018","unstructured":"S. Mittal. 2018. A survey of FPGA-based accelerators for convolutional neural networks. Neural Computing and Applications, 2018.","journal-title":"Neural Computing and Applications"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3039858"},{"key":"e_1_3_1_33_2","article-title":"AI enabling technologies: A survey","author":"Gadepally V.","year":"2019","unstructured":"V. Gadepally et al. 2019. AI enabling technologies: A survey. CoRR, vol. abs\/1905.03592, 2019.","journal-title":"CoRR"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1002\/adma.201902761"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2020.2996625"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eng.2020.01.007"},{"key":"e_1_3_1_37_2","article-title":"Edge intelligence: Architectures, challenges, and applications","author":"Xu D.","year":"2020","unstructured":"D. Xu et al. 2020. Edge intelligence: Architectures, challenges, and applications. CoRR, vol. abs\/2003.12172, 2020.","journal-title":"CoRR"},{"key":"e_1_3_1_38_2","article-title":"ShiDianNao: Shifting vision processing closer to the sensor","author":"Du Z.","year":"2015","unstructured":"Z. Du et al. 2015. ShiDianNao: Shifting vision processing closer to the sensor. In Int. Symposium on Computer Architecture, 2015.","journal-title":"Int. 
Symposium on Computer Architecture"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.30"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2017.7870353"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2016.2616357"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2017.2682138"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2018.2865489"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2910232"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2958684"},{"key":"e_1_3_1_46_2","article-title":"SamurAI: A 1.7MOPS-36GOPS adaptive versatile IoT node with 15,000\u00d7 peak-to-idle power reduction, 207ns wake-up time and 1.3TOPS\/W ML efficiency","author":"Miro-Panades I.","year":"2020","unstructured":"I. Miro-Panades et al. 2020. SamurAI: A 1.7MOPS-36GOPS adaptive versatile IoT node with 15,000\u00d7 peak-to-idle power reduction, 207ns wake-up time and 1.3TOPS\/W ML efficiency. In IEEE Symposium on VLSI Circuits, 2020.","journal-title":"IEEE Symposium on VLSI Circuits"},{"key":"e_1_3_1_47_2","article-title":"An ultra-low-power analog-digital hybrid CNN face recognition processor integrated with a CIS for always-on mobile devices","author":"Kim J.-H.","year":"2019","unstructured":"J.-H. Kim, C. Kim, K. Kim, and H.-J. Yoo. 2019. An ultra-low-power analog-digital hybrid CNN face recognition processor integrated with a CIS for always-on mobile devices. In IEEE Int. Symposium on Circuits and Systems, 2019.","journal-title":"IEEE Int. Symposium on Circuits and Systems"},{"key":"e_1_3_1_48_2","article-title":"Hardware-oriented approximation of convolutional neural networks","author":"Gysel P.","year":"2016","unstructured":"P. Gysel, M. Motamedi, and S. Ghiasi. 2016. Hardware-oriented approximation of convolutional neural networks. CoRR, vol. 
abs\/1604.03168, 2016.","journal-title":"CoRR"},{"key":"e_1_3_1_49_2","article-title":"A survey of quantization methods for efficient neural network inference","author":"Gholami A.","year":"2021","unstructured":"A. Gholami et al. 2021. A survey of quantization methods for efficient neural network inference. CoRR, 2021.","journal-title":"CoRR"},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"A. Ignatov et al. 2019. AI benchmark: All about deep learning on smartphones in 2019. CoRR vol. abs\/1910.06663 2019.","DOI":"10.1109\/ICCVW.2019.00447"},{"key":"e_1_3_1_51_2","article-title":"MLPerf inference benchmark","author":"Reddi V. J.","year":"2019","unstructured":"V. J. Reddi et al. 2019. MLPerf inference benchmark. CoRR, vol. abs\/1911.02549, 2019.","journal-title":"CoRR"},{"key":"e_1_3_1_52_2","article-title":"Benchmarking TinyML systems: Challenges and direction","volume":"3","author":"Banbury C. R.","year":"2020","unstructured":"C. R. Banbury et al. 2020. Benchmarking TinyML systems: Challenges and direction. CoRR, vol. abs\/2003.04821, 3 2020.","journal-title":"CoRR"},{"key":"e_1_3_1_53_2","unstructured":"EEMBC Exploring CoreMark \u2013 A Benchmark Maximizing Simplicity and Efficacy 2009. [Online] Available: https:\/\/www.eembc.org\/techlit\/articles\/coremark-whitepaper.pdf. [Accessed 03 02 2021]."},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/85.238389"},{"key":"e_1_3_1_56_2","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/978-3-642-58589-0_4","article-title":"Multiple-issue processors","author":"Silc J.","year":"1999","unstructured":"J. Silc, B. Robic, and T. Ungerer. 1999. Multiple-issue processors. 
In Processor Architecture: From Dataflow to Superscalar and Beyond, Berlin: Springer Berlin, 1999, 123\u2013219.","journal-title":"Processor Architecture: From Dataflow to Superscalar and Beyond"},{"key":"e_1_3_1_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080246"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2020.2979965"},{"key":"e_1_3_1_60_2","unstructured":"A. Samajdar Y. Zhu P. N. Whatmough M. Mattina and T. Krishna. 2018. SCALE-Sim: Systolic CNN Accelerator. CoRR vol. abs\/1811.02883 2018."},{"key":"e_1_3_1_61_2","article-title":"Design considerations for efficient deep neural networks on processing-in-memory accelerators","author":"Yang T.-J.","year":"2019","unstructured":"T.-J. Yang and V. Sze. 2019. Design considerations for efficient deep neural networks on processing-in-memory accelerators. In IEEE International Electron Devices Meeting, 2019.","journal-title":"IEEE International Electron Devices Meeting"},{"key":"e_1_3_1_62_2","article-title":"Reflections on the Memory Wall","volume-title":"Proceedings of the 1st Conference on Computing Frontiers","author":"McKee S. A.","year":"2004","unstructured":"S. A. McKee. 2004. Reflections on the Memory Wall. In Proceedings of the 1st Conference on Computing Frontiers, New York, NY, USA, 2004."},{"key":"e_1_3_1_63_2","article-title":"Compute-in-Memory with emerging nonvolatile-memories: Challenges and prospects","author":"Yu S.","year":"2020","unstructured":"S. Yu, X. Sun, X. Peng, and S. Huang. 2020. Compute-in-Memory with emerging nonvolatile-memories: Challenges and prospects. In IEEE Custom Integrated Circuits Conference, 2020.","journal-title":"IEEE Custom Integrated Circuits Conference"},{"key":"e_1_3_1_64_2","article-title":"24.5 A Twin-8T SRAM computation-in-memory macro for multiple-bit CNN-based machine learning","author":"Si X.","year":"2019","unstructured":"X. Si et al. 2019. 
24.5 A Twin-8T SRAM computation-in-memory macro for multiple-bit CNN-based machine learning. In IEEE International Solid- State Circuits Conference, 2019.","journal-title":"IEEE International Solid- State Circuits Conference"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2016.2642198"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2018.2829522"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC42613.2021.9365984"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2018.2790840"},{"key":"e_1_3_1_69_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41928-019-0270-x"},{"key":"e_1_3_1_70_2","first-page":"1","article-title":"Robust processing-in-memory with multi-bit ReRAM using Hessian-driven mixed-precision computation","author":"Dash S.","year":"2021","unstructured":"S. Dash, Y. Luo, A. Lu, S. Yu, and S. Mukhopadhyay. 2021. Robust processing-in-memory with multi-bit ReRAM using Hessian-driven mixed-precision computation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1\u20131, 2021.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_1_71_2","article-title":"24.1 A 1Mb multibit ReRAM computing-in-memory macro with 14.6ns parallel MAC computing time for CNN based AI edge processors","author":"Xue C.","year":"2019","unstructured":"C. Xue et al. 2019. 24.1 A 1Mb multibit ReRAM computing-in-memory macro with 14.6ns parallel MAC computing time for CNN based AI edge processors. In IEEE Int. Solid- State Circuits Conference, 2019.","journal-title":"IEEE Int. 
Solid- State Circuits Conference"},{"key":"e_1_3_1_72_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2019.2943047"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2020.3042411"},{"key":"e_1_3_1_74_2","doi-asserted-by":"publisher","DOI":"10.1109\/IEDM19573.2019.8993491"},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3011265"},{"key":"e_1_3_1_76_2","unstructured":"Reprinted from Electronics"},{"key":"e_1_3_1_77_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2017.31"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1109\/N-SSC.2007.4785534"},{"key":"e_1_3_1_79_2","first-page":"1593","volume-title":"Encyclopedia of Parallel Computing, D. Padua, Ed.","author":"Bose P.","year":"2011","unstructured":"P. Bose. 2011. Power Wall. In Encyclopedia of Parallel Computing, D. Padua, Ed., Boston, MA: Springer US, 2011, 1593\u20131608."},{"key":"e_1_3_1_80_2","volume-title":"IEEE International Solid-State Circuits Conference","author":"Horowitz M.","year":"2014","unstructured":"M. Horowitz. 2014. 1.1 Computing's energy problem (and what we can do about it). In IEEE International Solid-State Circuits Conference, 2014."},{"key":"e_1_3_1_81_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2012.2221233"},{"key":"e_1_3_1_82_2","article-title":"QUEST: A 7.49TOPS multi-purpose log-quantized DNN inference engine stacked on 96MB 3D SRAM using inductive-coupling technology in 40nm CMOS","author":"Ueyoshi K.","year":"2018","unstructured":"K. Ueyoshi et al. 2018. QUEST: A 7.49TOPS multi-purpose log-quantized DNN inference engine stacked on 96MB 3D SRAM using inductive-coupling technology in 40nm CMOS. In IEEE Int. Solid-State Circuits Conference, 2018.","journal-title":"IEEE Int. Solid-State Circuits Conference"},{"key":"e_1_3_1_83_2","unstructured":"IRDS. 2021. International Roadmap for Devices and Systems: 2020 Update 2020. [Online] Available: https:\/\/irds.ieee.org\/editions\/2020. 
[Accessed 11 05 2021]."},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/VLSIT.2012.6242496"},{"key":"e_1_3_1_85_2","volume-title":"European Solid-State Circuits Conference","author":"Patel H. N.","year":"2016","unstructured":"H. N. Patel et al. 2016. A 55nm ultra low leakage deeply depleted Channel technology optimized for energy minimization in subthreshold SRAM and logic. In European Solid-State Circuits Conference, 2016."},{"key":"e_1_3_1_86_2","volume-title":"IEEE Int. Electron Dev. Meeting","author":"Natarajan S.","year":"2014","unstructured":"S. Natarajan et al. 2014. A 14nm logic technology featuring 2nd-generation FinFET, air-gapped interconnects, self-aligned double patterning and a 0.0588 \u03bcm2 SRAM cell size. In IEEE Int. Electron Dev. Meeting, 2014."},{"key":"e_1_3_1_87_2","article-title":"FDSOI vs FinFET: Differentiating device features for ultra low power IoT applications","author":"Weber O.","year":"2017","unstructured":"O. Weber. 2017. FDSOI vs FinFET: Differentiating device features for ultra low power IoT applications. In IEEE Int. Conf. on IC Design and Technology, 2017.","journal-title":"IEEE Int. Conf. on IC Design and Technology"},{"key":"e_1_3_1_88_2","article-title":"22nm FDSOI technology for emerging mobile, Internet-of-Things, and RF applications","author":"Carter R.","year":"2016","unstructured":"R. Carter et al. 2016. 22nm FDSOI technology for emerging mobile, Internet-of-Things, and RF applications. In IEEE Int. Electron Dev. Meet., 2016.","journal-title":"IEEE Int. Electron Dev. Meet."},{"key":"e_1_3_1_89_2","doi-asserted-by":"publisher","DOI":"10.1149\/1.3117397"},{"key":"e_1_3_1_90_2","article-title":"Ultra low-power standard cell design using planar bulk CMOS in subthreshold operation","author":"Pons M.","year":"2013","unstructured":"M. Pons et al. 2013. Ultra low-power standard cell design using planar bulk CMOS in subthreshold operation. 
In International Workshop on Power and Timing Modeling, Optimization and Simulation, 2013.","journal-title":"International Workshop on Power and Timing Modeling, Optimization and Simulation"},{"key":"e_1_3_1_91_2","article-title":"A 1kb single-side read 6T sub-threshold SRAM in 180 nm with 530 Hz frequency 3.1 nA total current and 2.4 nA leakage at 0.27 V","author":"Pons M.","year":"2015","unstructured":"M. Pons et al. 2015. A 1kb single-side read 6T sub-threshold SRAM in 180 nm with 530 Hz frequency 3.1 nA total current and 2.4 nA leakage at 0.27 V. In IEEE SOI-3D-Subthr. Microel. Tech. Unified Conf., 2015.","journal-title":"IEEE SOI-3D-Subthr. Microel. Tech. Unified Conf."},{"key":"e_1_3_1_92_2","article-title":"Sub-threshold latch-based icyflex2 32-bit processor with wide supply range operation","author":"Pons M.","year":"2016","unstructured":"M. Pons et al. 2016. Sub-threshold latch-based icyflex2 32-bit processor with wide supply range operation. In Europ. Solid-State Circuits Conf., 2016.","journal-title":"Europ. Solid-State Circuits Conf."},{"key":"e_1_3_1_93_2","article-title":"A 0.5 V 2.5 \u03bcW\/MHz microcontroller with analog-assisted adaptive body bias PVT compensation with 3.13nW\/kB SRAM retention in 55nm deeply-depleted Channel CMOS","author":"Pons M.","year":"2019","unstructured":"M. Pons et al. 2019. A 0.5 V 2.5 \u03bcW\/MHz microcontroller with analog-assisted adaptive body bias PVT compensation with 3.13nW\/kB SRAM retention in 55nm deeply-depleted Channel CMOS. In IEEE Custom Integrated Circuits Conference, 2019.","journal-title":"IEEE Custom Integrated Circuits Conference"},{"key":"e_1_3_1_94_2","doi-asserted-by":"publisher","DOI":"10.1109\/S3S.2017.8309246"},{"key":"e_1_3_1_95_2","doi-asserted-by":"publisher","DOI":"10.5555\/1267638.1267640"},{"key":"e_1_3_1_96_2","first-page":"203","volume-title":"Technologies for Wireless Computing","author":"Burd T. D.","year":"1996","unstructured":"T. D. Burd and R. W. Brodersen. 1996. 
Processor design for portable systems. In Technologies for Wireless Computing, USA, Kluwer Academic Publishers, 1996, 203\u2013221."},{"key":"e_1_3_1_97_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISLPED.2015.7273520"},{"key":"e_1_3_1_98_2","article-title":"A systematic approach to blocking convolutional neural networks","author":"Yang X.","year":"2016","unstructured":"X. Yang et al. 2016. A systematic approach to blocking convolutional neural networks. CoRR, vol. abs\/1606.04209, 2016.","journal-title":"CoRR"},{"key":"e_1_3_1_99_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sse.2016.07.006"},{"key":"e_1_3_1_100_2","doi-asserted-by":"publisher","DOI":"10.3390\/electronics9091414"},{"key":"e_1_3_1_101_2","article-title":"A 40nm low-power logic compatible phase change memory technology","author":"Wu J. Y.","year":"2018","unstructured":"J. Y. Wu et al. 2018. A 40nm low-power logic compatible phase change memory technology. In IEEE Int. Electron Devices Meeting, 2018.","journal-title":"IEEE Int. Electron Devices Meeting"},{"key":"e_1_3_1_102_2","article-title":"1Gbit high density embedded STT-MRAM in 28nm FDSOI technology","author":"Lee K.","year":"2019","unstructured":"K. Lee et al. 2019. 1Gbit high density embedded STT-MRAM in 28nm FDSOI technology. In IEEE Int. Electron Devices Meeting, 2019.","journal-title":"IEEE Int. Electron Devices Meeting"},{"key":"e_1_3_1_103_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2019.8662444"},{"key":"e_1_3_1_104_2","article-title":"Demonstration of highly manufacturable STT-MRAM embedded in 28nm logic","author":"Song Y. J.","year":"2018","unstructured":"Y. J. Song et al. 2018. Demonstration of highly manufacturable STT-MRAM embedded in 28nm logic. In IEEE Int. Electron Devices Meeting, 2018.","journal-title":"IEEE Int. Electron Devices Meeting"},{"key":"e_1_3_1_105_2","volume-title":"IEEE Int. Solid-State Circuits Conference","author":"Liu T.","year":"2013","unstructured":"T. Liu et al. 2013. 
A 130.7mm2 2-layer 32Gb ReRAM memory device in 24nm technology. In IEEE Int. Solid-State Circuits Conference, 2013."},{"key":"e_1_3_1_106_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2018.8310392"},{"key":"e_1_3_1_107_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2019.8662393"},{"key":"e_1_3_1_108_2","article-title":"24.2 A 14nm-FinFET 1Mb embedded 1T1R RRAM with a 0.022\u03bc m2 cell size using self-adaptive delayed termination and multi-cell reference","author":"Yang J.","year":"2021","unstructured":"J. Yang et al. 2021. 24.2 A 14nm-FinFET 1Mb embedded 1T1R RRAM with a 0.022\u03bc m2 cell size using self-adaptive delayed termination and multi-cell reference. In IEEE Int. Solid- State Circuits Conf., 2021.","journal-title":"IEEE Int. Solid- State Circuits Conf."},{"key":"e_1_3_1_109_2","article-title":"FeFET: A versatile CMOS compatible device with game-changing potential","author":"Beyer S.","year":"2020","unstructured":"S. Beyer et al. 2020. FeFET: A versatile CMOS compatible device with game-changing potential. In IEEE Int. Memory Workshop, 2020.","journal-title":"IEEE Int. Memory Workshop"},{"key":"e_1_3_1_110_2","doi-asserted-by":"publisher","DOI":"10.1109\/JXCDC.2019.2930284"},{"key":"e_1_3_1_111_2","article-title":"14.7 A 288\u03bcW programmable deep-learning processor with 270KB on-chip weight storage using non-uniform memory hierarchy for mobile intelligence","author":"Bang S.","year":"2017","unstructured":"S. Bang et al. 2017. 14.7 A 288\u03bcW programmable deep-learning processor with 270KB on-chip weight storage using non-uniform memory hierarchy for mobile intelligence. In IEEE Int. Solid-State Circuits Conf., 2017.","journal-title":"IEEE Int. 
Solid-State Circuits Conf."},{"key":"e_1_3_1_112_2","doi-asserted-by":"publisher","DOI":"10.1109\/A-SSCC48613.2020.9336116"},{"key":"e_1_3_1_113_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2017.2767705"},{"issue":"5","key":"e_1_3_1_114_2","article-title":"Power, area, and performance optimization of standard cell memory arrays through controlled placement","volume":"21","author":"Teman A.","year":"2016","unstructured":"A. Teman, D. Rossi, P. Meinerzhagen, L. Benini, and A. Burg. 2016. Power, area, and performance optimization of standard cell memory arrays through controlled placement. ACM Trans. Des. Autom. Electron. Syst. 21, 5 (2016).","journal-title":"ACM Trans. Des. Autom. Electron. Syst"},{"key":"e_1_3_1_115_2","article-title":"XNOR Neural Engine: A hardware accelerator IP for 21.6 fJ\/op binary neural network inference","volume":"7","author":"Conti F.","year":"2018","unstructured":"F. Conti, P. Davide Schiavone, and L. Benini. 2018. XNOR Neural Engine: A hardware accelerator IP for 21.6 fJ\/op binary neural network inference. ArXiv e-prints, 7 2018.","journal-title":"ArXiv e-prints"},{"key":"e_1_3_1_116_2","doi-asserted-by":"publisher","DOI":"10.1109\/VLSIT.2012.6242474"},{"key":"e_1_3_1_117_2","doi-asserted-by":"publisher","DOI":"10.1145\/3093337.3037702"},{"key":"e_1_3_1_118_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2017.2747087"},{"key":"e_1_3_1_119_2","article-title":"A 14.3pW sub-threshold 2T gain-cell eDRAM for ultra-low power IoT applications in 28nm FD-SOI","author":"Giterman R.","year":"2018","unstructured":"R. Giterman, A. Teman, and A. Fish. 2018. A 14.3pW sub-threshold 2T gain-cell eDRAM for ultra-low power IoT applications in 28nm FD-SOI. In IEEE SOI-3D-Subthreshold Microel. Tech. Unified Conf., 2018.","journal-title":"IEEE SOI-3D-Subthreshold Microel. Tech. 
Unified Conf."},{"key":"e_1_3_1_120_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.622505"},{"key":"e_1_3_1_121_2","unstructured":"Cypress SONOS flash technology 2019. [Online]. Available: https:\/\/www.cypress.com\/file\/123341\/download. [Accessed 21 03 2021]."},{"key":"e_1_3_1_122_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2905361"},{"key":"e_1_3_1_123_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2020.3046125"},{"key":"e_1_3_1_124_2","article-title":"Fused-layer CNN accelerators","author":"Alwani M.","year":"2016","unstructured":"M. Alwani, H. Chen, M. Ferdman, and P. Milder. 2016. Fused-layer CNN accelerators. In IEEE\/ACM Int. Symp. on Microarchitecture, 2016.","journal-title":"IEEE\/ACM Int. Symp. on Microarchitecture"},{"key":"e_1_3_1_125_2","first-page":"1","article-title":"CUTIE: Beyond PetaOp\/s\/W Ternary DNN inference acceleration with better-than-binary energy efficiency","author":"Scherer M.","year":"2021","unstructured":"M. Scherer, G. Rutishauser, L. Cavigelli, and L. Benini. 2021. CUTIE: Beyond PetaOp\/s\/W Ternary DNN inference acceleration with better-than-binary energy efficiency. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1\u20131, 2021.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_1_126_2","first-page":"2019","article-title":"Optimally scheduling CNN convolutions for efficient memory access","author":"Stoutchinin A.","year":"2019","unstructured":"A. Stoutchinin, F. Conti, and L. Benini. 2019. Optimally scheduling CNN convolutions for efficient memory access. CoRR, vol. 
abs\/1902.01492, 2019.","journal-title":"CoRR"},{"key":"e_1_3_1_127_2","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2018.8573527"},{"key":"e_1_3_1_128_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.3012215"},{"key":"e_1_3_1_129_2","article-title":"CMSIS-NN: Efficient neural network kernels for Arm Cortex-M CPUs","author":"Lai L.","year":"2018","unstructured":"L. Lai, N. Suda, and V. Chandra. 2018. CMSIS-NN: Efficient neural network kernels for Arm Cortex-M CPUs. CoRR, vol. abs\/1801.06601, 2018.","journal-title":"CoRR"},{"key":"e_1_3_1_130_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.3012652"},{"key":"e_1_3_1_131_2","article-title":"ASP Vision: Optically computing the first layer of convolutional neural networks using angle sensitive pixels","author":"Chen H. G.","year":"2016","unstructured":"H. G. Chen et al. 2016. ASP Vision: Optically computing the first layer of convolutional neural networks using angle sensitive pixels. CoRR, vol. abs\/1605.03621, 2016.","journal-title":"CoRR"},{"key":"e_1_3_1_132_2","article-title":"Efficient neural vision systems based on convolutional image acquisition","author":"Pad P.","year":"2020","unstructured":"P. Pad et al. 2020. Efficient neural vision systems based on convolutional image acquisition. In Conf. on Comp. Vision and Pattern Recog., 2020.","journal-title":"Conf. on Comp. Vision and Pattern Recog."},{"key":"e_1_3_1_133_2","first-page":"825","article-title":"Realizing low-energy classification systems by implementing matrix multiplication directly within an ADC","volume":"9","author":"Wang Z.","year":"2015","unstructured":"Z. Wang, J. Zhang, and N. Verma. 2015. Realizing low-energy classification systems by implementing matrix multiplication directly within an ADC. IEEE Trans. Biomed. Circuits Syst. 9 (2015), 825\u2013837.","journal-title":"IEEE Trans. Biomed. 
Circuits Syst."},{"key":"e_1_3_1_134_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001164"},{"key":"e_1_3_1_135_2","article-title":"Distributed deep neural networks over the cloud, the edge and end devices","author":"Teerapittayanon S.","year":"2017","unstructured":"S. Teerapittayanon, B. McDanel, and H. T. Kung. 2017. Distributed deep neural networks over the cloud, the edge and end devices. In IEEE Int. Conference on Distributed Computing Systems, 2017.","journal-title":"IEEE Int. Conference on Distributed Computing Systems"},{"key":"e_1_3_1_136_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2858384"},{"key":"e_1_3_1_137_2","article-title":"BranchyNet: Fast inference via early exiting from deep neural networks","author":"Teerapittayanon S.","year":"2017","unstructured":"S. Teerapittayanon, B. McDanel, and H. T. Kung. 2017. BranchyNet: Fast inference via early exiting from deep neural networks. CoRR, vol. abs\/1709.01686, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_138_2","doi-asserted-by":"publisher","DOI":"10.5555\/2971808.2971918"},{"key":"e_1_3_1_139_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.3012320"},{"key":"e_1_3_1_140_2","article-title":"Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution","author":"Liu L.","year":"2017","unstructured":"L. Liu and J. Deng. 2017. Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution. CoRR, vol. abs\/1701.00299, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_141_2","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744904"},{"key":"e_1_3_1_142_2","first-page":"1","article-title":"Improving memory utilization in convolutional neural network accelerators","author":"Jokic P.","year":"2020","unstructured":"P. Jokic, S. Emery, and L. Benini. 2020. Improving memory utilization in convolutional neural network accelerators. 
IEEE Embedded Systems Letters, 1\u20131, 2020.","journal-title":"IEEE Embedded Systems Letters"},{"key":"e_1_3_1_143_2","article-title":"Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding","author":"Han S.","year":"2016","unstructured":"S. Han, H. Mao, and W. J. Dally. 2016. Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding. In Int. Conf. on Learning Representations, 2016.","journal-title":"Int. Conf. on Learning Representations"},{"key":"e_1_3_1_144_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2019.2897701"},{"key":"e_1_3_1_145_2","article-title":"Fast algorithms for convolutional neural networks","author":"Lavin A.","year":"2016","unstructured":"A. Lavin and S. Gray. 2016. Fast algorithms for convolutional neural networks. In CVPR, 2016.","journal-title":"CVPR"},{"key":"e_1_3_1_146_2","article-title":"MobileNets: Efficient convolutional neural networks for mobile vision applications","author":"Howard A. G.","year":"2017","unstructured":"A. G. Howard et al. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, vol. abs\/1704.04861, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_147_2","volume-title":"Int. Conference on Learning Representations, Banff, AB, Canada","author":"Mathieu M.","year":"2014","unstructured":"M. Mathieu, M. Henaff, and Y. LeCun. 2014. Fast training of convolutional networks through FFTs. In Int. Conference on Learning Representations, Banff, AB, Canada, 2014."},{"key":"e_1_3_1_148_2","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414642"},{"key":"e_1_3_1_149_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11179-7_36"},{"key":"e_1_3_1_150_2","article-title":"Optimal brain damage","author":"LeCun Y.","year":"1989","unstructured":"Y. LeCun, J. S. Denker, and S. A. Solla. 1989. Optimal brain damage. 
In NIPS, 1989.","journal-title":"NIPS"},{"key":"e_1_3_1_151_2","article-title":"Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks","author":"Hoefler T.","year":"2021","unstructured":"T. Hoefler, D. Alistarh, T. Ben-Nun, N. Dryden, and A. Peste. 2021. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. CoRR, vol. abs\/2102.00554, 2021.","journal-title":"CoRR"},{"key":"e_1_3_1_152_2","article-title":"Variational dropout sparsifies deep neural networks","author":"Molchanov D.","year":"2017","unstructured":"D. Molchanov, A. Ashukha, and D. Vetrov. 2017. Variational dropout sparsifies deep neural networks. In International Conference on Machine Learning, 2017.","journal-title":"International Conference on Machine Learning"},{"key":"e_1_3_1_153_2","doi-asserted-by":"publisher","DOI":"10.5555\/3322706.3361996"},{"key":"e_1_3_1_154_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6638949"},{"key":"e_1_3_1_155_2","article-title":"Shapeshifter networks: Decoupling layers from parameters for scalable and effective deep learning","author":"Plummer B. A.","year":"2020","unstructured":"B. A. Plummer, N. Dryden, J. Frost, T. Hoefler, and K. Saenko. 2020. Shapeshifter networks: Decoupling layers from parameters for scalable and effective deep learning. CoRR, vol. abs\/2006.10598, 2020.","journal-title":"CoRR"},{"key":"e_1_3_1_156_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2020.2976475"},{"key":"e_1_3_1_157_2","article-title":"Learning both weights and connections for efficient neural network","author":"Han S.","year":"2015","unstructured":"S. Han, J. Pool, J. Tran, and W. Dally. 2015. Learning both weights and connections for efficient neural network. 
In Advances in Neural Information Processing Systems, 2015.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_158_2","article-title":"Designing energy-efficient convolutional neural networks using energy-aware pruning","author":"Yang T.","year":"2017","unstructured":"T. Yang, Y. Chen, and V. Sze. 2017. Designing energy-efficient convolutional neural networks using energy-aware pruning. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.","journal-title":"IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"e_1_3_1_159_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2852335"},{"key":"e_1_3_1_160_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001138"},{"key":"e_1_3_1_161_2","article-title":"CBinfer: Change-based inference for convolutional neural networks on video data","author":"Cavigelli L.","year":"2017","unstructured":"L. Cavigelli, P. Degen, and L. Benini. 2017. CBinfer: Change-based inference for convolutional neural networks on video data. CoRR, vol. abs\/1704.04313, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_162_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2014.106"},{"key":"e_1_3_1_163_2","doi-asserted-by":"publisher","DOI":"10.1145\/2654822.2541967"},{"key":"e_1_3_1_164_2","article-title":"CORAL: Coarse-grained reconfigurable architecture for convolutional neural networks","author":"Yuan Z.","year":"2017","unstructured":"Z. Yuan, Y. Liu, J. Yue, J. Li, and H. Yang. 2017. CORAL: Coarse-grained reconfigurable architecture for convolutional neural networks. In IEEE ISLPED, 2017.","journal-title":"IEEE ISLPED"},{"key":"e_1_3_1_165_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2905654"},{"key":"e_1_3_1_166_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuron.2019.08.034"},{"key":"e_1_3_1_167_2","unstructured":"F. N. Iandola et al. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR vol. 
abs\/1602.07360 2016."},{"key":"e_1_3_1_168_2","article-title":"On-demand deep model compression for mobile devices: A usage-driven model selection framework","author":"Liu S.","year":"2018","unstructured":"S. Liu et al. 2018. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Int. Conf. on Mobile Systems, Applications, and Services, New York, NY, USA, 2018.","journal-title":"Int. Conf. on Mobile Systems, Applications, and Services"},{"key":"e_1_3_1_169_2","volume-title":"International Conference on Learning Representations","author":"Cai H.","year":"2020","unstructured":"H. Cai, C. Gan, T. Wang, Z. Zhang, and S. Han. 2020. Once for all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations, 2020."},{"key":"e_1_3_1_170_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2018.2881461"},{"key":"e_1_3_1_171_2","first-page":"1378","article-title":"Low-energy voice activity detection via energy-quality scaling from data conversion to machine learning","volume":"67","author":"Teo J. H.","year":"2020","unstructured":"J. H. Teo, S. Cheng, and M. Alioto. 2020. Low-energy voice activity detection via energy-quality scaling from data conversion to machine learning. IEEE Trans. on Circ. and Sys. 67 (2020), 1378\u20131388.","journal-title":"IEEE Trans. on Circ. and Sys"},{"key":"e_1_3_1_172_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2968576"},{"key":"e_1_3_1_173_2","article-title":"Understanding straight-through estimator in training activation quantized neural nets","author":"Yin P.","year":"2019","unstructured":"P. Yin et al. 2019. Understanding straight-through estimator in training activation quantized neural nets. CoRR, vol. abs\/1903.05662, 2019.","journal-title":"CoRR"},{"key":"e_1_3_1_174_2","article-title":"Resiliency of deep neural networks under quantization","author":"Sung W.","year":"2015","unstructured":"W. Sung, S. Shin, and K. 
Hwang. 2015. Resiliency of deep neural networks under quantization. CoRR, vol. abs\/1511.06488, 2015.","journal-title":"CoRR"},{"key":"e_1_3_1_175_2","article-title":"BinaryNet: Training deep neural networks with weights and activations constrained to +1 or \u22121","author":"Courbariaux M.","year":"2016","unstructured":"M. Courbariaux and Y. Bengio. 2016. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or \u22121. CoRR, vol. abs\/1602.02830, 2016.","journal-title":"CoRR"},{"key":"e_1_3_1_176_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2017.7858354"},{"key":"e_1_3_1_177_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2018.053631145"},{"key":"e_1_3_1_178_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2019.2950386"},{"key":"e_1_3_1_179_2","volume-title":"Int. Conf. on Machine Learning - Volume 48","author":"Lin D. D.","year":"2016","unstructured":"D. D. Lin, S. S. Talathi and V. S. Annapureddy. 2016. Fixed point quantization of deep convolutional networks. In Int. Conf. on Machine Learning - Volume 48, New York, NY, USA, 2016."},{"key":"e_1_3_1_180_2","article-title":"Apprentice: Using knowledge distillation techniques to improve low-precision network accuracy","author":"Mishra A. K.","year":"2017","unstructured":"A. K. Mishra and D. Marr. 2017. Apprentice: Using knowledge distillation techniques to improve low-precision network accuracy. CoRR, vol. 
abs\/1711.05852, 2017.","journal-title":"CoRR"},{"key":"e_1_3_1_181_2","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2751163"},{"key":"e_1_3_1_182_2","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2017.176"},{"key":"e_1_3_1_183_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2020.3006451"},{"key":"e_1_3_1_184_2","doi-asserted-by":"publisher","DOI":"10.1145\/3218603.3218638"},{"key":"e_1_3_1_185_2","article-title":"Bit error tolerance of a CIFAR-10 binarized convolutional neural network processor","author":"Yang L.","year":"2018","unstructured":"L. Yang, D. Bankman, B. Moons, M. Verhelst, and B. Murmann. 2018. Bit error tolerance of a CIFAR-10 binarized convolutional neural network processor. In IEEE Int. Symp. on Circuits and Systems, 2018.","journal-title":"IEEE Int. Symp. on Circuits and Systems"},{"key":"e_1_3_1_186_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCAS.2020.3027425"},{"key":"e_1_3_1_187_2","doi-asserted-by":"publisher","DOI":"10.3850\/9783981537079_0174"},{"key":"e_1_3_1_188_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-019-1677-2"},{"key":"e_1_3_1_189_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2015.2474396"},{"key":"e_1_3_1_190_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2018.112130359"},{"key":"e_1_3_1_191_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92910-9_10"},{"key":"e_1_3_1_192_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2019.2935234"},{"key":"e_1_3_1_193_2","doi-asserted-by":"crossref","unstructured":"Cognitive Computation 2009 1 6 Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors","DOI":"10.1007\/s12559-009-9009-8"},{"key":"e_1_3_1_194_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2020.3020286"},{"key":"e_1_3_1_195_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2018.2871057"},{"key":"e_1_3_1_196_2","volume-title":"IEEE International Solid - State Circuits 
Conference","author":"Bankman D.","year":"2018","unstructured":"D. Bankman, L. Yang, B. Moons, M. Verhelst, and B. Murmann. 2018. An always-on 3.8 uJ\/86% CIFAR-10 mixed-signal binary CNN processor with all memory on chip in 28nm CMOS. In IEEE International Solid - State Circuits Conference, 2018."},{"key":"e_1_3_1_197_2","article-title":"BinarEye: An always-on energy-accuracy-scalable binary CNN processor with all memory on chip in 28nm CMOS","author":"Moons B.","year":"2018","unstructured":"B. Moons, D. Bankman, L. Yang, B. Murmann, and M. Verhelst. 2018. BinarEye: An always-on energy-accuracy-scalable binary CNN processor with all memory on chip in 28nm CMOS. CoRR, vol. abs\/1804.05554, 2018.","journal-title":"CoRR"},{"key":"e_1_3_1_198_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASSCC.2016.7844125"},{"key":"e_1_3_1_199_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2017.7870354"},{"key":"e_1_3_1_200_2","volume-title":"IEEE International Solid-State Circuits Conference","author":"Oh J.","year":"2011","unstructured":"J. Oh, J. Park, G. Kim, S. Lee, and H. Yoo. 2011. A 57mW embedded mixed-mode neuro-fuzzy accelerator for intelligent multi-core processor. 
In IEEE International Solid-State Circuits Conference, 2011."},{"key":"e_1_3_1_201_2","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2018.2796379"},{"key":"e_1_3_1_202_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178146"},{"key":"e_1_3_1_203_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2016.2597140"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3520127","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3520127","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:10:32Z","timestamp":1750183832000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3520127"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,30]]},"references-count":202,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9,30]]}},"alternative-id":["10.1145\/3520127"],"URL":"https:\/\/doi.org\/10.1145\/3520127","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2022,9,30]]},"assertion":[{"value":"2021-07-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-02-19","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-10-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}