{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:27:14Z","timestamp":1750220834823,"version":"3.41.0"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,8,31]],"date-time":"2018-08-31T00:00:00Z","timestamp":1535673600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2017YFA0700900, 2017YFA0700902, 2017YFA0700901 and 2017YFB1003101"],"award-info":[{"award-number":["2017YFA0700900, 2017YFA0700902, 2017YFA0700901 and 2017YFB1003101"]}]},{"name":"973 Program of China","award":["2015CB358800"],"award-info":[{"award-number":["2015CB358800"]}]},{"name":"Strategic Priority Research Program of Chinese Academy of Science","award":["XDB32050200, XDC01020000"],"award-info":[{"award-number":["XDB32050200, XDC01020000"]}]},{"name":"Key Research Projects in Frontier Science of Chinese Academy of Sciences","award":["QYZDB-SSWJSC001"],"award-info":[{"award-number":["QYZDB-SSWJSC001"]}]},{"name":"Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences","award":["KFJ-HGZX-013"],"award-info":[{"award-number":["KFJ-HGZX-013"]}]},{"name":"Beijing Natural Science Foundation","award":["JQ18013"],"award-info":[{"award-number":["JQ18013"]}]},{"name":"CAS Center for Excellence in Brain Science and Intelligence Technology"},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["61432016, 61532016, 61672491, 61602441, 61602446, 61732002, 61702478, 61732007, and 61732020"],"award-info":[{"award-number":["61432016, 61532016, 61672491, 61602441, 61602446, 61732002, 61702478, 61732007, and 61732020"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Science and Technology","award":["2018ZX01031102"],"award-info":[{"award-number":["2018ZX01031102"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Syst."],"published-print":{"date-parts":[[2018,8,31]]},"abstract":"<jats:p>Machine Learning (ML) are a family of models for learning from the data to improve performance on a certain task. ML techniques, especially recent renewed neural networks (deep neural networks), have proven to be efficient for a broad range of applications. ML techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which usually are not energy efficient, since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of ML techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an ML technique (such as layers in neural networks) or even an ML as a whole. Although straightforward and easy to implement for a limited set of similar ML techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different ML techniques with sufficient flexibility and efficiency.<\/jats:p><jats:p>In this article, we first propose a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. We then extend the application scope of Cambricon from NN to ML techniques. We also propose an assembly language, an assembler, and runtime to support programming with Cambricon, especially targeting large-scale ML problems. Our evaluation over a total of 16 representative yet distinct ML techniques have demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of ML techniques and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao\u00a0[7] (which can only accommodate three types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65nm technology incurs only negligible latency\/power\/area overheads, with a versatile coverage of 10 different NN benchmarks and 7 other ML benchmarks. Compared to the recent prevalent ML accelerator PuDianNao, our Cambricon-based accelerator is able to support all the ML techniques as well as the 10 NNs but with only approximate 5.1% performance loss.<\/jats:p>","DOI":"10.1145\/3331469","type":"journal-article","created":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T14:41:50Z","timestamp":1565707310000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["An Instruction Set Architecture for Machine Learning"],"prefix":"10.1145","volume":"36","author":[{"given":"Yunji","family":"Chen","sequence":"first","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of BrainIntelligence Technology, Zhangjiang Laboratory (BIT, ZJLab); Shanghai Research Center for Brain Science and Brain-Inspired Intelligence (Shanghai Brain\/AI)"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3120-5773","authenticated-orcid":false,"given":"Huiying","family":"Lan","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zidong","family":"Du","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shaoli","family":"Liu","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinhua","family":"Tao","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dong","family":"Han","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Luo","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qi","family":"Guo","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ling","family":"Li","sequence":"additional","affiliation":[{"name":"Institute of Software, Chinese Academy of Sciences, CAS; University of Chinese Academy of Sciences"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuan","family":"Xie","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, UCSB, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tianshi","family":"Chen","sequence":"additional","affiliation":[{"name":"SKL of Computer Architecture, Institute of Computing Technology, CAS"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,8,13]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1080\/00031305.1992.10475879","article-title":"An introduction to kernel and nearest-neighbor nonparametric regression","volume":"46","author":"Altman N. S.","year":"1992","unstructured":"N. S. Altman . 1992 . An introduction to kernel and nearest-neighbor nonparametric regression . Am. Stat. 46 , 3 (1992), 175 -- 185 . N. S. Altman. 1992. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 3 (1992), 175--185.","journal-title":"Am. Stat."},{"key":"e_1_2_1_2_1","unstructured":"L. Breiman J. H. Friedman R. A. Olshcn and C. J. Stone. 1984. Classification and Regression Trees. Wadsworth International Group Belmont CA. L. Breiman J. H. Friedman R. A. Olshcn and C. J. Stone. 1984. Classification and Regression Trees. Wadsworth International Group Belmont CA."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815993"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/APSIPA.2014.7041717"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2015.41"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.58"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the IEEE Conference on Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD\u201916)","author":"Chi Ping","year":"2016","unstructured":"Ping Chi , Wang-Chien Lee , and Yuan Xie . 2016 . Adapting B-plus tree for emerging nov-volatile memory based main memory . In Proceedings of the IEEE Conference on Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD\u201916) . Ping Chi, Wang-Chien Lee, and Yuan Xie. 2016. Adapting B-plus tree for emerging nov-volatile memory based main memory. In Proceedings of the IEEE Conference on Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD\u201916)."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.13"},{"volume-title":"Proceedings of the 30th International Conference on Machine Learning.","author":"Coates A.","key":"e_1_2_1_10_1","unstructured":"A. Coates , B. Huval , T. Wang , D. J. Wu , and A. Y. Ng . 2013. Deep learning with cots hpc systems . In Proceedings of the 30th International Conference on Machine Learning. A. Coates, B. Huval, T. Wang, D. J. Wu, and A. Y. Ng. 2013. Deep learning with cots hpc systems. In Proceedings of the 30th International Conference on Machine Learning."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022627411411"},{"volume-title":"Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.","author":"Dahl G. E.","key":"e_1_2_1_12_1","unstructured":"G. E. Dahl , T. N. Sainath , and G. E. Hinton . 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout . In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. G. E. Dahl, T. N. Sainath, and G. E. Hinton. 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing."},{"key":"e_1_2_1_13_1","first-page":"1","article-title":"An introduction to linear regression and correlation","volume":"69","author":"Edwards A. L.","year":"1984","unstructured":"A. L. Edwards . 1984 . An introduction to linear regression and correlation . Math. Gaz. 69 , 2 (1984), 1 -- 17 . A. L. Edwards. 1984. An introduction to linear regression and correlation. Math. Gaz. 69, 2 (1984), 1--17.","journal-title":"Math. Gaz."},{"key":"e_1_2_1_14_1","unstructured":"V. Eijkhout. 2011. Introduction to High Performance Scientific computing. Retrieved from www.lulu.com. V. Eijkhout. 2011. Introduction to High Performance Scientific computing. Retrieved from www.lulu.com."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2006.1693199"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.48"},{"volume-title":"Proceedings of the 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.","author":"Farabet C.","key":"e_1_2_1_17_1","unstructured":"C. Farabet , B. Martini , B. Corda , P. Akselrod , E. Culurciello , and Y. LeCun . 2011. NeuFlow: A runtime reconfigurable dataflow processor for vision . In Proceedings of the 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello, and Y. LeCun. 2011. NeuFlow: A runtime reconfigurable dataflow processor for vision. In Proceedings of the 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops."},{"volume-title":"Proceedings of the 2009 International Conference on Field Programmable Logic and Applications.","author":"Farabet C.","key":"e_1_2_1_18_1","unstructured":"C. Farabet , C. Poulet , J.Y. Han , and Y. LeCun . 2009. CNP: An FPGA-based processor for convolutional networks . In Proceedings of the 2009 International Conference on Field Programmable Logic and Applications. C. Farabet, C. Poulet, J.Y. Han, and Y. LeCun. 2009. CNP: An FPGA-based processor for convolutional networks. In Proceedings of the 2009 International Conference on Field Programmable Logic and Applications."},{"key":"e_1_2_1_19_1","first-page":"41","article-title":"Cluster analysis of multivariate data : Efficiency versus interpretability of classifications","volume":"21","author":"Forgy E. W.","year":"1965","unstructured":"E. W. Forgy . 1965 . Cluster analysis of multivariate data : Efficiency versus interpretability of classifications . Biometrics 21 , 3 (1965), 41 -- 52 . E. W. Forgy. 1965. Cluster analysis of multivariate data : Efficiency versus interpretability of classifications. Biometrics 21, 3 (1965), 41--52.","journal-title":"Biometrics"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2014.106"},{"volume-title":"Proceedings of the 2005 IEEE International Joint Conference on Neural Networks.","author":"Graves A.","key":"e_1_2_1_21_1","unstructured":"A. Graves and J. Schmidhuber . 2005. Framewise phoneme classification with bidirectional LSTM networks . In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks. A. Graves and J. Schmidhuber. 2005. Framewise phoneme classification with bidirectional LSTM networks. In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1950365.1950385"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.123"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_2_1_27_1","unstructured":"INTEL. {n.d.}. AVX-512. Retrieved from https:\/\/software.intel.com\/en-us\/blogs\/2013\/avx-512-instructions. INTEL. {n.d.}. AVX-512. Retrieved from https:\/\/software.intel.com\/en-us\/blogs\/2013\/avx-512-instructions."},{"key":"e_1_2_1_28_1","unstructured":"INTEL. {n.d.}. MKL. Retrieved from https:\/\/software.intel.com\/en-us\/intel-mkl. INTEL. {n.d.}. MKL. Retrieved from https:\/\/software.intel.com\/en-us\/intel-mkl."},{"key":"e_1_2_1_29_1","unstructured":"Pineda Fernando J. 1987. Generalization of back-propagation to recurrent neural networks. Phys. Rev. Lett. (1987) 602--611. Pineda Fernando J. 1987. Generalization of back-propagation to recurrent neural networks. Phys. Rev. Lett. (1987) 602--611."},{"volume-title":"Proceedings of the 12th IEEE International Conference on Computer Vision.","author":"Jarrett K.","key":"e_1_2_1_30_1","unstructured":"K. Jarrett , K. Kavukcuoglu , M. Ranzato , and Y. LeCun . 2009. What is the best multi-stage architecture for object recognition? In Proceedings of the 12th IEEE International Conference on Computer Vision. K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. 2009. What is the best multi-stage architecture for object recognition? In Proceedings of the 12th IEEE International Conference on Computer Vision."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.485571"},{"key":"e_1_2_1_33_1","unstructured":"A. Krizhevsky. {n.d.}. cuda-convnet: High-performance c++\/cuda implemen- tation of convolutional neural networks. A. Krizhevsky. {n.d.}. cuda-convnet: High-performance c++\/cuda implemen- tation of convolutional neural networks."},{"key":"e_1_2_1_34_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Sutskever Ilya , and Geoffrey E . Hinton . 2012 . ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25. Alex Krizhevsky, Sutskever Ilya, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 10th National Conference on Artificial Intelligence. 223--228","author":"Langley Pat","year":"1992","unstructured":"Pat Langley , Wayne Iba , and Kevin Thompson . 1992 . An analysis of bayesian classifiers . In Proceedings of the 10th National Conference on Artificial Intelligence. 223--228 . Pat Langley, Wayne Iba, and Kevin Thompson. 1992. An analysis of bayesian classifiers. In Proceedings of the 10th National Conference on Artificial Intelligence. 223--228."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273556"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639343"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694358"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.42"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228465"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446050"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v005.i08"},{"key":"e_1_2_1_44_1","volume-title":"Modha","author":"Merolla Paul A","year":"2014","unstructured":"Paul A Merolla , John V. Arthur , Rodrigo Alvarez-icaza, Andrew S. Cassidy , Jun Sawada , Filipp Akopyan , Bryan L. Jackson , Nabil Imam , Chen Guo , Yutaka Nakamura , Bernard Brezzo , Ivan Vo , Steven K. Esser , Rathinakumar Appuswamy , Brian Taba , Arnon Amir , Myron D. Flickner , William P. Risk , Rajit Manohar , and Dharmendra S . Modha . 2014 . A million spiling-neuron interated circuit with a scalable communication network and interface. Science 345, 6197 (2014), 668--673. Paul A Merolla, John V. Arthur, Rodrigo Alvarez-icaza, Andrew S. Cassidy, Jun Sawada, Filipp Akopyan, Bryan L. Jackson, Nabil Imam, Chen Guo, Yutaka Nakamura, Bernard Brezzo, Ivan Vo, Steven K. Esser, Rathinakumar Appuswamy, Brian Taba, Arnon Amir, Myron D. Flickner, William P. Risk, Rajit Manohar, and Dharmendra S. Modha. 2014. A million spiling-neuron interated circuit with a scalable communication network and interface. Science 345, 6197 (2014), 668--673."},{"key":"e_1_2_1_45_1","volume-title":"Human-level control through deep reinforcement learning. Nature 518, 7540","author":"Mnih Volodymyr","year":"2015","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Andrei A. Rusu , Joel Veness , Marc G. Bellemare , Alex Graves , Martin Riedmiller , Andreas K. Fidjeland , Georg Ostrovski , Stig Petersen , Charles Beattie , Amir Sadik , Ioannis Antonoglou , Helen King , Dharshan Kumaran , Daan Wierstra , Shane Legg , and Demis Hassabis . 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 ( 2015 ), 529--533. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529--533."},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the 1999 American Control Conference.","author":"Motter M. A.","year":"1999","unstructured":"M. A. Motter . 1999 . Control of the NASA langley 16-foot transonic tunnel with the self-organizing map . In Proceedings of the 1999 American Control Conference. M. A. Motter. 1999. Control of the NASA langley 16-foot transonic tunnel with the self-organizing map. In Proceedings of the 1999 American Control Conference."},{"key":"e_1_2_1_47_1","unstructured":"NVIDIA. {n.d.}. CUBLAS. Retrieved from https:\/\/developer.nvidia.com\/cublas. NVIDIA. {n.d.}. CUBLAS. Retrieved from https:\/\/developer.nvidia.com\/cublas."},{"volume-title":"Proceedings of the 2004 IEEE International Joint Conference on Neural Networks.","author":"Oliveira C. S.","key":"e_1_2_1_48_1","unstructured":"C. S. Oliveira and E. Del Hernandez . 2004. Forms of adapting patterns to Hopfield neural networks with larger number of nodes and higher storage capacity . In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks. C. S. Oliveira and E. Del Hernandez. 2004. Forms of adapting patterns to Hopfield neural networks with larger number of nodes and higher storage capacity. In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3123979"},{"volume-title":"Proceedings of the 8th Annual Symposium on Computer Architecture.","author":"David","key":"e_1_2_1_50_1","unstructured":"David A. Patterson and Carlo H. Sequin. 1981. RISC I: A reduced instruction set VLSI computer . In Proceedings of the 8th Annual Symposium on Computer Architecture. David A. Patterson and Carlo H. Sequin. 1981. RISC I: A reduced instruction set VLSI computer. In Proceedings of the 8th Annual Symposium on Computer Architecture."},{"volume-title":"Proceedings of the 31st IEEE International Conference on Computer Design.","author":"Peemen M.","key":"e_1_2_1_51_1","unstructured":"M. Peemen , A. A. A. Setio , B. Mesman , and H. Corp oraal . 2013. Memory-centric accelerator design for convolutional neural networks . In Proceedings of the 31st IEEE International Conference on Computer Design. M. Peemen, A. A. A. Setio, B. Mesman, and H. Corporaal. 2013. Memory-centric accelerator design for convolutional neural networks. In Proceedings of the 31st IEEE International Conference on Computer Design."},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 12 (NIPS\u201999)","author":"Platt John C.","year":"1999","unstructured":"John C. Platt , Nello Cristianini , and John Shawe-Taylor . 1999 . Large margin DAGs for multiclass classification . In Proceedings of the Advances in Neural Information Processing Systems 12 (NIPS\u201999) . 547--553. http:\/\/papers.nips.cc\/paper\/1773-large-margin-dags-for-multiclass-classification John C. Platt, Nello Cristianini, and John Shawe-Taylor. 1999. Large margin DAGs for multiclass classification. In Proceedings of the Advances in Neural Information Processing Systems 12 (NIPS\u201999). 547--553. http:\/\/papers.nips.cc\/paper\/1773-large-margin-dags-for-multiclass-classification"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2898024"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022643204877"},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the 13th National Conference on Artificial Intelligence and 8th Innovative Applications of Artificial Intelligence Conference, (AAAI \u201996 and IAAI\u201996)","author":"Quinlan J. Ross","year":"1996","unstructured":"J. Ross Quinlan . 1996 . Bagging, boosting, and C4.5 . In Proceedings of the 13th National Conference on Artificial Intelligence and 8th Innovative Applications of Artificial Intelligence Conference, (AAAI \u201996 and IAAI\u201996) . 725--730. J. Ross Quinlan. 1996. Bagging, boosting, and C4.5. In Proceedings of the 13th National Conference on Artificial Intelligence and 8th Innovative Applications of Artificial Intelligence Conference, (AAAI \u201996 and IAAI\u201996). 725--730."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1162\/NECO_a_00311"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2009.25"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2303296"},{"volume-title":"Proceedings of the 2011 International Joint Conference on Neural Networks.","author":"Sermanet P.","key":"e_1_2_1_59_1","unstructured":"P. Sermanet and Y. LeCun . 2011. Traffic sign recognition with multi-scale convolutional networks . In Proceedings of the 2011 International Joint Conference on Neural Networks. P. Sermanet and Y. LeCun. 2011. Traffic sign recognition with multi-scale convolutional networks. In Proceedings of the 2011 International Joint Conference on Neural Networks."},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the International Conference on Learning Representations. http:\/\/arxiv.org\/abs\/1409","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman . 2015 . Very deep convolutional networks for large-scale image recognition . In Proceedings of the International Conference on Learning Representations. http:\/\/arxiv.org\/abs\/1409 .1556 Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations. http:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.5555\/2337159.2337200"},{"volume-title":"Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop (NIPS\u201911)","author":"Vanhoucke V.","key":"e_1_2_1_63_1","unstructured":"V. Vanhoucke , A. Senior , and M. Z. Mao . 2011. Improving the speed of neural networks on CPUs . In Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop (NIPS\u201911) . V. Vanhoucke, A. Senior, and M. Z. Mao. 2011. Improving the speed of neural networks on CPUs. In Proceedings of the Deep Learning and Unsupervised Feature Learning Workshop (NIPS\u201911)."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/2742060.2743756"},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the 21st International Symposium on High Performance Computer Architecture.","author":"Xu Cong","year":"2015","unstructured":"Cong Xu , Dimin Niu , Naveen Muralimanohar , Rajeev Balasubramonian , Tao Zhang , Shimeng Yu , and Yuan Xie . 2015 . Overcoming the challenges of cross-point resistive memory architectures . In Proceedings of the 21st International Symposium on High Performance Computer Architecture. Cong Xu, Dimin Niu, Naveen Muralimanohar, Rajeev Balasubramonian, Tao Zhang, Shimeng Yu, and Yuan Xie. 2015. Overcoming the challenges of cross-point resistive memory architectures. In Proceedings of the 21st International Symposium on High Performance Computer Architecture."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICNC.2012.6234629"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICMLC.2007.4370640"},{"volume-title":"Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition.","author":"Zhang Zhengyou","key":"e_1_2_1_68_1","unstructured":"Zhengyou Zhang , M. Lyons , M. Schuster , and S. Akamatsu . 1998. Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron . In Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition. Zhengyou Zhang, M. Lyons, M. Schuster, and S. Akamatsu. 1998. Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron. In Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition."},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541228.2541231"}],"container-title":["ACM Transactions on Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3331469","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3331469","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3331469","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:13:38Z","timestamp":1750202018000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3331469"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,8,31]]},"references-count":69,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,8,31]]}},"alternative-id":["10.1145\/3331469"],"URL":"https:\/\/doi.org\/10.1145\/3331469","relation":{},"ISSN":["0734-2071","1557-7333"],"issn-type":[{"type":"print","value":"0734-2071"},{"type":"electronic","value":"1557-7333"}],"subject":[],"published":{"date-parts":[[2018,8,31]]},"assertion":[{"value":"2018-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-08-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}