{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,6,18]],"date-time":"2022-06-18T04:18:04Z","timestamp":1655525884571},"reference-count":39,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Fundamentals"],"published-print":{"date-parts":[[2022,3,1]]},"DOI":"10.1587\/transfun.2021vlp0012","type":"journal-article","created":{"date-parts":[[2021,9,20]],"date-time":"2021-09-20T22:09:10Z","timestamp":1632175750000},"page":"448-458","source":"Crossref","is-referenced-by-count":1,"title":["Reconfigurable Neural Network Accelerator and Simulator for Model Implementation"],"prefix":"10.1587","volume":"E105.A","author":[{"given":"Yasuhiro","family":"NAKAHARA","sequence":"first","affiliation":[{"name":"Kumamoto University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masato","family":"KIYAMA","sequence":"additional","affiliation":[{"name":"Kumamoto University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Motoki","family":"AMAGASAKI","sequence":"additional","affiliation":[{"name":"Kumamoto University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qian","family":"ZHAO","sequence":"additional","affiliation":[{"name":"Kyushu Institute of Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masahiro","family":"IIDA","sequence":"additional","affiliation":[{"name":"Kumamoto University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"532","reference":[{"key":"1","unstructured":"[1] S. Ren, K. He, R. Girshick, and J. Sun, \u201cFaster R-CNN: Towards real-time object detection with region proposal networks,\u201d arXiv:1506.01497, 2015."},{"key":"2","doi-asserted-by":"crossref","unstructured":"[2] J. Redmon, S. Divvala, R. Girshick, and A. 
Farhadi, \u201cYou only look once: Unified, real-time object detection,\u201d arXiv:1506.02640, 2015.","DOI":"10.1109\/CVPR.2016.91"},{"key":"3","unstructured":"[3] V. Badrinarayanan, A. Kendall, and R. Cipolla, \u201cSegNet: A deep convolutional encoder-decoder architecture for image segmentation,\u201d arXiv:1511.00561, 2015."},{"key":"4","doi-asserted-by":"crossref","unstructured":"[4] A. Graves, S. Fern\u00e1ndez, F.J. Gomez, and J.A. Schmidhuber, \u201cConnectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks,\u201d Proc. 23rd International Conference on Machine Learning, pp.369-376, 2006. 10.1145\/1143844.1143891","DOI":"10.1145\/1143844.1143891"},{"key":"5","unstructured":"[5] D. Yu, L. Deng, and G.E. Dahl, \u201cRoles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition,\u201d Proc. Neural Information Processing Systems, 2010."},{"key":"6","doi-asserted-by":"crossref","unstructured":"[6] A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, \u201cSurvey and benchmarking of machine learning accelerators,\u201d 2019 IEEE High Performance Extreme Computing Conference, 2019. 10.1109\/hpec.2019.8916327","DOI":"10.1109\/HPEC.2019.8916327"},{"key":"7","doi-asserted-by":"publisher","unstructured":"[7] X. Wang, Y. Han, V.C.M. Leung, D. Niyato, X. Yan, and X. Chen, \u201cConvergence of edge computing and deep learning: A comprehensive survey,\u201d IEEE Commun. Surveys Tuts., vol.2, no.2, pp.869-904, 2020. 10.1109\/COMST.2020.2970550","DOI":"10.1109\/COMST.2020.2970550"},{"key":"8","unstructured":"[8] A. Krizhevsky, I. Sutskever, and G.E. Hinton, \u201cImageNet classification with deep convolutional neural networks,\u201d Proc. Neural Information Processing Systems, pp.1097-1105, 2012."},{"key":"9","unstructured":"[9] K. Simonyan and A. 
Zisserman, \u201cVery deep convolutional networks for large-scale image recognition,\u201d arXiv:1409.1556, 2014."},{"key":"10","doi-asserted-by":"crossref","unstructured":"[10] M. Kiyama, Y. Nakahara, M. Amagasaki, and M. Iida, \u201cA quantized neural network library for proper implementation of hardware emulation,\u201d Proc. 4th International Workshop on GPU Computing and AI, pp.136-140, 2019.","DOI":"10.1109\/CANDARW.2019.00032"},{"key":"11","doi-asserted-by":"crossref","unstructured":"[11] M. Kiyama, Y. Nakahara, M. Amagasaki, and M. Iida, \u201cDeep learning framework with arbitrary numerical precision,\u201d Proc. IEEE International Symposium on Embedded Multicore\/Many-core Systems-on-Chip (MCSoC), pp.81-86, 2019. 10.1109\/MCSoC.2019.00019","DOI":"10.1109\/MCSoC.2019.00019"},{"key":"12","unstructured":"[12] S. Han, H. Mao, and W.J. Dally, \u201cDeep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding,\u201d arXiv:1510.00149, 2015."},{"key":"13","unstructured":"[13] S. Han, J. Pool, J. Tran, and W.J. Dally, \u201cLearning both weights and connections for efficient neural networks,\u201d arXiv:1506.02626, 2015."},{"key":"14","doi-asserted-by":"crossref","unstructured":"[14] S. Sharify, A.D. Lascorz, K. Siu, P. Judd, and A. Moshovos, \u201cLoom: Exploiting weight and activation precisions to accelerate convolutional neural networks,\u201d Proc. 55th ACM\/ESDA\/IEEE Design Automation Conference, pp.1-6, 2018. 10.1109\/dac.2018.8465915","DOI":"10.1145\/3195970.3196072"},{"key":"15","doi-asserted-by":"crossref","unstructured":"[15] H. Sharma, J. Park, N. Suda, L. Lai, B. Chau, V. Chandra, and H. Esmaeilzadeh, \u201cBit fusion: Bit-level dynamically composable architecture for accelerating deep neural network,\u201d Proc. ACM\/IEEE 45th Annual International Symposium on Computer Architecture, pp.764-775, 2018. 
10.1109\/isca.2018.00069","DOI":"10.1109\/ISCA.2018.00069"},{"key":"16","doi-asserted-by":"crossref","unstructured":"[16] S. Ryu, H. Kim, W. Yi, and J.-J. Kim, \u201cBitBlade: Area and energy-efficient precision-scalable neural network accelerator with bitwise summation,\u201d Proc. 56th Annual Design Automation Conference, no.84, pp.1-6, 2019. 10.1145\/3316781.3317784","DOI":"10.1145\/3316781.3317784"},{"key":"17","doi-asserted-by":"publisher","unstructured":"[17] J. Lee, C. Kim, S. Kang, D. Shin, S. Kim, and H. Yoo, \u201cUNPU: An energy-efficient deep neural network accelerator with fully variable weight bit precision,\u201d IEEE J. Solid-State Circuits, vol.54, no.1, pp.173-185, 2019. 10.1109\/jssc.2018.2865489","DOI":"10.1109\/JSSC.2018.2865489"},{"key":"18","doi-asserted-by":"crossref","unstructured":"[18] R. Andri, L. Cavigelli, D. Rossi, and L. Benini, \u201cYodaNN: An ultra-low power convolutional neural network accelerator based on binary weights,\u201d Proc. IEEE Computer Society Annual Symposium on VLSI, pp.236-241, 2016. 10.1109\/isvlsi.2016.111","DOI":"10.1109\/ISVLSI.2016.111"},{"key":"19","doi-asserted-by":"crossref","unstructured":"[19] K. Ando, K. Ueyoshi, K. Orimo, H. Yonekawa, S. Sato, H. Nakahara, M. Ikebe, T. Asai, S. Takamaeda-Yamazaki, T. Kuroda, and M. Motomura, \u201cBRein memory: A 13-layer 4.2K neuron\/0.8M synapse binary\/ternary reconfigurable in-memory deep neural network accelerator in 65nm CMOS,\u201d Symposium on VLSI Circuits, pp.C24-C25, 2017. 10.23919\/vlsic.2017.8008533","DOI":"10.23919\/VLSIC.2017.8008533"},{"key":"20","doi-asserted-by":"crossref","unstructured":"[20] H. Valavi, P.J. Ramadge, E. Nestler, and N. Verma, \u201cA mixed-signal binarized convolutional-neural-network accelerator integrating dense weight storage and multiplication for reduced data movement,\u201d IEEE Symposium on VLSI Circuits, pp.141-142, 2018. 
10.1109\/vlsic.2018.8502421","DOI":"10.1109\/VLSIC.2018.8502421"},{"key":"21","doi-asserted-by":"publisher","unstructured":"[21] D. Bankman, L. Yang, B. Moons, M. Verhelst, and B. Murmann, \u201cAn always-on 3.8\u00b5J\/86% CIFAR-10 mixed-signal binary CNN processor with all memory on chip in 28-nm CMOS,\u201d IEEE J. Solid-State Circuits, vol.54, no.1, pp.158-172, 2019. 10.1109\/JSSC.2018.2869150","DOI":"10.1109\/JSSC.2018.2869150"},{"key":"22","doi-asserted-by":"publisher","unstructured":"[22] S. Yin, Z. Jiang, J. Seo, and M. Seok, \u201cXNOR-SRAM: In-memory computing SRAM macro for binary\/ternary deep neural networks,\u201d IEEE J. Solid-State Circuits, vol.55, no.6, pp.1733-1743, 2020. 10.1109\/JSSC.2019.2963616","DOI":"10.1109\/JSSC.2019.2963616"},{"key":"23","doi-asserted-by":"publisher","unstructured":"[23] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M.A. Horowitz, and W.J. Dally, \u201cEIE: Efficient inference engine on compressed deep neural network,\u201d ACM SIGARCH Comput. Archit. News, vol.44, no.3, pp.243-254, 2016. 10.1145\/3007787.3001163","DOI":"10.1145\/3007787.3001163"},{"key":"24","doi-asserted-by":"publisher","unstructured":"[24] J. Albericio, P. Judd, T. Hetherington, T. Aamodt, N.E. Jerger, and A. Moshovos, \u201cCnvlutin: Ineffectual-neuron-free deep neural network computing,\u201d ACM SIGARCH Comput. Archit. News, vol.44, no.3, pp.1-13, 2016. 10.1145\/3007787.3001138","DOI":"10.1145\/3007787.3001138"},{"key":"25","doi-asserted-by":"crossref","unstructured":"[25] S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu, L. Li, Q. Guo, T. Chen, and Y. Chen, \u201cCambricon-X: An accelerator for sparse neural networks,\u201d Proc. 49th Annual IEEE\/ACM International Symposium on Microarchitecture, no.20, pp.1-12, 2016. 10.1109\/micro.2016.7783723","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"26","doi-asserted-by":"publisher","unstructured":"[26] D. Kim, J. Ahn, and S. Yoo, \u201cZeNA: Zero-aware neural network accelerator,\u201d IEEE Des. 
Test, vol.35, no.1, pp.39-46, 2018. 10.1109\/mdat.2017.2741463","DOI":"10.1109\/MDAT.2017.2741463"},{"key":"27","doi-asserted-by":"publisher","unstructured":"[27] J. Li, S. Jiang, S. Gong, J. Wu, J. Yan, G. Yan, and X. Li, \u201cSqueezeflow: A sparse cnn accelerator exploiting concise convolution rules,\u201d IEEE Trans. Comput., vol.68, no.11, pp.1663-1677, 2019. 10.1109\/tc.2019.2924215","DOI":"10.1109\/TC.2019.2924215"},{"key":"28","doi-asserted-by":"publisher","unstructured":"[28] Y.-H. Chen, T.-J. Yang, J. Emer, and V. Sze, \u201cEyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices,\u201d IEEE J. Emerg. Sel. Topics Circuits Syst., vol.9, no.2, pp.292-308, 2019. 10.1109\/jetcas.2019.2910232","DOI":"10.1109\/JETCAS.2019.2910232"},{"key":"29","doi-asserted-by":"crossref","unstructured":"[29] H. Kwon, A. Samajdar, and T. Krishna, \u201cMAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects,\u201d Proc. ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp.461-475, 2018. 10.1145\/3173162.3173176","DOI":"10.1145\/3296957.3173176"},{"key":"30","doi-asserted-by":"crossref","unstructured":"[30] K. Hegde, J. Yu, R. Agrawal, M. Yan, M. Pellauer, and C.W. Fletcher, \u201cUCNN: Exploiting computational reuse in deep neural networks via weight repetition,\u201d Proc. ACM\/IEEE 45th Annual International Symposium on Computer Architecture, 2018. 10.1109\/isca.2018.00062","DOI":"10.1109\/ISCA.2018.00062"},{"key":"31","doi-asserted-by":"crossref","unstructured":"[31] B. Moons, R. Uytterhoeven, W. Dehaene, and M. Verhelst, \u201c14.5 envision: A 0.26-to-10TOPS\/W subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28nm FDSOI,\u201d Proc. IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2017. 
10.1109\/isscc.2017.7870353","DOI":"10.1109\/ISSCC.2017.7870353"},{"key":"32","doi-asserted-by":"crossref","unstructured":"[32] S. Yin, P. Ouyang, S. Tang, F. Tu, X. Li, L. Liu, and S. Wei, \u201cA 1.06-to-5.09TOPS\/W reconfigurable hybrid-neural-network processor for deep learning applications,\u201d VLSI Circuits, 2017 Symposium, pp.C26-C27, 2017. 10.23919\/vlsic.2017.8008534","DOI":"10.23919\/VLSIC.2017.8008534"},{"key":"33","unstructured":"[33] Open Neural Network Exchange, https:\/\/github.com\/onnx\/onnx, accessed March 11, 2021."},{"key":"34","unstructured":"[34] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. K\u00f6pf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, \u201cPyTorch: An imperative style, high-performance deep learning library,\u201d Proc. 32nd International Conference on Neural Information Processing Systems, 2019."},{"key":"35","unstructured":"[35] Pytorch-Alexnet-Cifar100, https:\/\/github.com\/Lornatang\/pytorch-alexnet-cifar100, accessed March 11, 2021."},{"key":"36","unstructured":"[36] A. Krizhevsky, \u201cLearning multiple layers of features from tiny images,\u201d Technical Report, 2009."},{"key":"37","doi-asserted-by":"crossref","unstructured":"[37] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, \u201cImagenet: A large-scale hierarchical image database,\u201d Proc. Computer Vision and Pattern Recognition, pp.248-255, 2009. 10.1109\/cvpr.2009.5206848","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"38","doi-asserted-by":"crossref","unstructured":"[38] K. He, X. Zhang, S. Ren, and J. Sun, \u201cDeep residual learning for image recognition,\u201d arXiv:1512.03385, 2015.","DOI":"10.1109\/CVPR.2016.90"},{"key":"39","unstructured":"[39] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. 
Adam, \u201cMobileNets: Efficient convolutional neural networks for mobile vision applications,\u201d arXiv:1704.04861, 2017."}],"container-title":["IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transfun\/E105.A\/3\/E105.A_2021VLP0012\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,5]],"date-time":"2022-03-05T03:23:50Z","timestamp":1646450630000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transfun\/E105.A\/3\/E105.A_2021VLP0012\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,1]]},"references-count":39,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022]]}},"URL":"https:\/\/doi.org\/10.1587\/transfun.2021vlp0012","relation":{},"ISSN":["0916-8508","1745-1337"],"issn-type":[{"value":"0916-8508","type":"print"},{"value":"1745-1337","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,1]]},"article-number":"2021VLP0012"}}