{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:56Z","timestamp":1750220636670,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":68,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,9,28]],"date-time":"2020-09-28T00:00:00Z","timestamp":1601251200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,9,28]]},"DOI":"10.1145\/3422575.3422790","type":"proceedings-article","created":{"date-parts":[[2021,3,22]],"date-time":"2021-03-22T01:43:40Z","timestamp":1616377420000},"page":"158-168","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["A Low Power In-DRAM Architecture for Quantized CNNs using Fast Winograd Convolutions"],"prefix":"10.1145","author":[{"given":"Muhammad Mohsin","family":"Ghaffar","sequence":"first","affiliation":[{"name":"TU Kaiserslautern, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chirag","family":"Sudarshan","sequence":"additional","affiliation":[{"name":"TU Kaiserslautern, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christian","family":"Weis","sequence":"additional","affiliation":[{"name":"TU Kaiserslautern, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Matthias","family":"Jung","sequence":"additional","affiliation":[{"name":"Fraunhofer IESE, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Norbert","family":"Wehn","sequence":"additional","affiliation":[{"name":"TU Kaiserslautern, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,3,21]]},"reference":[{"key":"#cr-split#-e_1_3_2_1_1_1.1","doi-asserted-by":"crossref","unstructured":"A. Agrawal A. Jaiswal D. Roy B. Han G. Srinivasan A. Ankit and K. Roy. 2019. Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays. IEEE Transactions on Circuits and Systems I: Regular Papers PP (04 2019) 1-13. https:\/\/doi.org\/10.1109\/TCSI.2019.2907488 10.1109\/TCSI.2019.2907488","DOI":"10.1109\/TCSI.2019.2907488"},{"key":"#cr-split#-e_1_3_2_1_1_1.2","doi-asserted-by":"crossref","unstructured":"A. Agrawal A. Jaiswal D. Roy B. Han G. Srinivasan A. Ankit and K. Roy. 2019. Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays. IEEE Transactions on Circuits and Systems I: Regular Papers PP (04 2019) 1-13. https:\/\/doi.org\/10.1109\/TCSI.2019.2907488","DOI":"10.1109\/TCSI.2019.2907488"},{"key":"e_1_3_2_1_2_1","unstructured":"S. Angizi and D. Fan. 2019. Accelerating Bulk Bit-Wise X(N)OR Operation in Processing-in-DRAM Platform. CoRR abs\/1904.05782(2019). arxiv:1904.05782http:\/\/arxiv.org\/abs\/1904.05782  S. Angizi and D. Fan. 2019. Accelerating Bulk Bit-Wise X(N)OR Operation in Processing-in-DRAM Platform. CoRR abs\/1904.05782(2019). arxiv:1904.05782http:\/\/arxiv.org\/abs\/1904.05782"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2018.8310397"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3242897"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488762"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IRPS.2011.5784590"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2017.2749425"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001140"},{"key":"e_1_3_2_1_9_1","unstructured":"J. Choe. 2017. Samsung 18 nm DRAM cell integration: QPT and higher uniformed capacitor high-k dielectrics. https:\/\/www.techinsights.com\/blog\/samsung-18-nm-dram-cell-integration-qpt-and-higher-uniformed-capacitor-high-k-dielectrics  J. Choe. 2017. Samsung 18 nm DRAM cell integration: QPT and higher uniformed capacitor high-k dielectrics. https:\/\/www.techinsights.com\/blog\/samsung-18-nm-dram-cell-integration-qpt-and-higher-uniformed-capacitor-high-k-dielectrics"},{"volume-title":"SK hynix","author":"Choe J.","key":"e_1_3_2_1_10_1","unstructured":"J. Choe . 2017. SK hynix \u2019 21 nm DRAM Cell Technology : Comparison of 1st and 2nd generation. https:\/\/www.techinsights.com\/blog\/sk-hynix-21-nm-dram-cell-technology-comparison-1st-and-2nd-generation J. Choe. 2017. SK hynix\u2019 21 nm DRAM Cell Technology: Comparison of 1st and 2nd generation. https:\/\/www.techinsights.com\/blog\/sk-hynix-21-nm-dram-cell-technology-comparison-1st-and-2nd-generation"},{"key":"e_1_3_2_1_11_1","unstructured":"J. Choe. 2018. Micron\u2019s 1x DRAMs Examined. https:\/\/www.eetimes.com\/author.asp?section_id=36&doc_id=1333289  J. Choe. 2018. Micron\u2019s 1x DRAMs Examined. https:\/\/www.eetimes.com\/author.asp?section_id=36&doc_id=1333289"},{"key":"e_1_3_2_1_12_1","unstructured":"A. Das. 2012. Hynix DRAM layout process integration adapt to change. https:\/\/www.eetimes.com\/hynix-dram-layout-process-integration-adapt-to-change\/#  A. Das. 2012. Hynix DRAM layout process integration adapt to change. https:\/\/www.eetimes.com\/hynix-dram-layout-process-integration-adapt-to-change\/#"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3195970.3196029"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3316781.3317845"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/514191.514197"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/54.748803"},{"key":"e_1_3_2_1_17_1","volume-title":"2013 5th IEEE International Memory Workshop. 30\u201333","author":"Fantini A.","year":"2013","unstructured":"A. Fantini , L. Goux , R. Degraeve , D.\u00a0 J. Wouters , N. Raghavan , G. Kar , A. Belmonte , Y.\u00a0. Chen , B. Govoreanu , and M. Jurczak . 2013. Intrinsic switching variability in HfO2RRAM . In 2013 5th IEEE International Memory Workshop. 30\u201333 . https:\/\/doi.org\/10.1109\/IMW. 2013 .6582090 10.1109\/IMW.2013.6582090 A. Fantini, L. Goux, R. Degraeve, D.\u00a0J. Wouters, N. Raghavan, G. Kar, A. Belmonte, Y.\u00a0. Chen, B. Govoreanu, and M. Jurczak. 2013. Intrinsic switching variability in HfO2RRAM. In 2013 5th IEEE International Memory Workshop. 30\u201333. https:\/\/doi.org\/10.1109\/IMW.2013.6582090"},{"key":"e_1_3_2_1_18_1","unstructured":"J. Fernandez-Marques P.\u00a0N. Whatmough A. Mundy and M. Mattina. 2020. Searching for Winograd-aware Quantized Networks. arxiv:2002.10711\u00a0[cs.LG]  J. Fernandez-Marques P.\u00a0N. Whatmough A. Mundy and M. Mattina. 2020. Searching for Winograd-aware Quantized Networks. arxiv:2002.10711\u00a0[cs.LG]"},{"key":"e_1_3_2_1_19_1","unstructured":"Z. Guz. 2014. Real-Time Analytics as the Killer Application for Processing-In-Memory.  Z. Guz. 2014. Real-Time Analytics as the Killer Application for Processing-In-Memory."},{"key":"e_1_3_2_1_20_1","unstructured":"K. He X. Zhang S. Ren and J. Sun. [n.d.]. Deep Residual Learning for Image Recognition. CVPR \u201915 ([n.\u00a0d.]).  K. He X. Zhang S. Ren and J. Sun. [n.d.]. Deep Residual Learning for Image Recognition. CVPR \u201915 ([n.\u00a0d.])."},{"key":"e_1_3_2_1_21_1","volume-title":"Journal of Physics: Conference Series 1026 (may 2018","author":"Huang Y.","year":"2019","unstructured":"Y. Huang , J. Shen , Z. Wang , M. Wen , and C. Zhang . 2018. A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm . Journal of Physics: Conference Series 1026 (may 2018 ), 01 2019 . https:\/\/doi.org\/10.1088\/1742-6596\/1026\/1\/012019 10.1088\/1742-6596 Y. Huang, J. Shen, Z. Wang, M. Wen, and C. Zhang. 2018. A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm. Journal of Physics: Conference Series 1026 (may 2018), 012019. https:\/\/doi.org\/10.1088\/1742-6596\/1026\/1\/012019"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISLPED.2017.8009163"},{"volume-title":"XNOR-SRAM: In-Memory Computing SRAM Macro for Binary\/Ternary Deep Neural Networks. In 2018 IEEE Symposium on VLSI Technology. 173\u2013174","author":"Jiang Z.","key":"e_1_3_2_1_23_1","unstructured":"Z. Jiang , S. Yin , M. Seok , and J. Seo . 2018 . XNOR-SRAM: In-Memory Computing SRAM Macro for Binary\/Ternary Deep Neural Networks. In 2018 IEEE Symposium on VLSI Technology. 173\u2013174 . Z. Jiang, S. Yin, M. Seok, and J. Seo. 2018. XNOR-SRAM: In-Memory Computing SRAM Macro for Binary\/Ternary Deep Neural Networks. In 2018 IEEE Symposium on VLSI Technology. 173\u2013174."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080246"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1186\/s11671-018-2619-x"},{"key":"e_1_3_2_1_26_1","unstructured":"S. Ko and S. Yu. 2020. SMART Paths for Latency Reduction in ReRAM Processing-In-Memory Architecture for CNN Inference. arxiv:2004.04865\u00a0[cs.AR]  S. Ko and S. Yu. 2020. SMART Paths for Latency Reduction in ReRAM Processing-In-Memory Architecture for CNN Inference. arxiv:2004.04865\u00a0[cs.AR]"},{"key":"e_1_3_2_1_27_1","unstructured":"A. Lavin. 2015. Fast Algorithms for Convolutional Neural Networks. CoRR abs\/1509.09308(2015). arxiv:1509.09308http:\/\/arxiv.org\/abs\/1509.09308  A. Lavin. 2015. Fast Algorithms for Convolutional Neural Networks. CoRR abs\/1509.09308(2015). arxiv:1509.09308http:\/\/arxiv.org\/abs\/1509.09308"},{"volume-title":"2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). 432\u2013433","author":"Lee U.","key":"e_1_3_2_1_28_1","unstructured":"D.\u00a0 U. Lee , K.\u00a0 W. Kim , K.\u00a0 W. Kim , H. Kim , J.\u00a0 Y. Kim , Y.\u00a0 J. Park , J.\u00a0 H. Kim , D.\u00a0 S. Kim , H.\u00a0 B. Park , J.\u00a0 W. Shin , J.\u00a0 H. Cho , K.\u00a0 H. Kwon , M.\u00a0 J. Kim , J. Lee , K.\u00a0 W. Park , B. Chung , and S. Hong . 2014. 25.2 A 1.2V 8Gb 8-channel 128GB\/s high-bandwidth memory (HBM) stacked DRAM with effective microbump I\/O test methods using 29nm process and TSV . In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). 432\u2013433 . D.\u00a0U. Lee, K.\u00a0W. Kim, K.\u00a0W. Kim, H. Kim, J.\u00a0Y. Kim, Y.\u00a0J. Park, J.\u00a0H. Kim, D.\u00a0S. Kim, H.\u00a0B. Park, J.\u00a0W. Shin, J.\u00a0H. Cho, K.\u00a0H. Kwon, M.\u00a0J. Kim, J. Lee, K.\u00a0W. Park, B. Chung, and S. Hong. 2014. 25.2 A 1.2V 8Gb 8-channel 128GB\/s high-bandwidth memory (HBM) stacked DRAM with effective microbump I\/O test methods using 29nm process and TSV. In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). 432\u2013433."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2016.7418035"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2924240"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2018.00062"},{"volume-title":"DRISA: A DRAM-based Reconfigurable In-Situ Accelerator. In 2017 50th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). 288\u2013301","author":"Li S.","key":"e_1_3_2_1_32_1","unstructured":"S. Li , D. Niu , K.\u00a0 T. Malladi , H. Zheng , B. Brennan , and Y. Xie . 2017 . DRISA: A DRAM-based Reconfigurable In-Situ Accelerator. In 2017 50th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). 288\u2013301 . S. Li, D. Niu, K.\u00a0T. Malladi, H. Zheng, B. Brennan, and Y. Xie. 2017. DRISA: A DRAM-based Reconfigurable In-Situ Accelerator. In 2017 50th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). 288\u2013301."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2898064"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2019.2908374"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2938234"},{"volume-title":"Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. In 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). 101\u2013108","author":"Lu L.","key":"e_1_3_2_1_36_1","unstructured":"L. Lu , Y. Liang , Q. Xiao , and S. Yan . 2017 . Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. In 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). 101\u2013108 . L. Lu, Y. Liang, Q. Xiao, and S. Yan. 2017. Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. In 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). 101\u2013108."},{"key":"e_1_3_2_1_37_1","unstructured":"L. Meng and J. Brothers. 2019. Efficient Winograd Convolution via Integer Arithmetic. CoRR abs\/1901.01965(2019). arxiv:1901.01965http:\/\/arxiv.org\/abs\/1901.01965  L. Meng and J. Brothers. 2019. Efficient Winograd Convolution via Integer Arithmetic. CoRR abs\/1901.01965(2019). arxiv:1901.01965http:\/\/arxiv.org\/abs\/1901.01965"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1147\/JRD.2015.2409732"},{"volume-title":"2019 Symposium on VLSI Circuits. C248\u2013C249","author":"Okumura S.","key":"e_1_3_2_1_39_1","unstructured":"S. Okumura , M. Yabuuchi , K. Hijioka , and K. Nose . 2019. A Ternary Based Bit Scalable, 8.80 TOPS\/W CNN accelerator with Many-core Processing-in-memory Architecture with 896K synapses\/mm2 .. In 2019 Symposium on VLSI Circuits. C248\u2013C249 . S. Okumura, M. Yabuuchi, K. Hijioka, and K. Nose. 2019. A Ternary Based Bit Scalable, 8.80 TOPS\/W CNN accelerator with Many-core Processing-in-memory Architecture with 896K synapses\/mm2.. In 2019 Symposium on VLSI Circuits. C248\u2013C249."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2015.2434872"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3124544"},{"key":"e_1_3_2_1_43_1","unstructured":"V. Seshadri D. Lee T. Mullins H. Hassan A. Boroumand J. Kim M.\u00a0A. Kozuch O. Mutlu P.\u00a0B. Gibbons and T.\u00a0C. Mowry. 2016. Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM. CoRR abs\/1611.09988(2016). arxiv:1611.09988http:\/\/arxiv.org\/abs\/1611.09988  V. Seshadri D. Lee T. Mullins H. Hassan A. Boroumand J. Kim M.\u00a0A. Kozuch O. Mutlu P.\u00a0B. Gibbons and T.\u00a0C. Mowry. 2016. Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM. CoRR abs\/1611.09988(2016). arxiv:1611.09988http:\/\/arxiv.org\/abs\/1611.09988"},{"key":"e_1_3_2_1_44_1","unstructured":"V. Seshadri and O. Mutlu. 2019. In-DRAM Bulk Bitwise Execution Engine. CoRR abs\/1905.09822(2019). arxiv:1905.09822http:\/\/arxiv.org\/abs\/1905.09822  V. Seshadri and O. Mutlu. 2019. In-DRAM Bulk Bitwise Execution Engine. CoRR abs\/1905.09822(2019). arxiv:1905.09822http:\/\/arxiv.org\/abs\/1905.09822"},{"volume-title":"2019 IEEE International Solid- State Circuits Conference - (ISSCC). 396\u2013398","author":"Si X.","key":"e_1_3_2_1_45_1","unstructured":"X. Si , J. Chen , Y. Tu , W. Huang , J. Wang , Y. Chiu , W. Wei , S. Wu , X. Sun , R. Liu , S. Yu , R. Liu , C. Hsieh , K. Tang , Q. Li , and M. Chang . 2019. 24.5 A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning . In 2019 IEEE International Solid- State Circuits Conference - (ISSCC). 396\u2013398 . X. Si, J. Chen, Y. Tu, W. Huang, J. Wang, Y. Chiu, W. Wei, S. Wu, X. Sun, R. Liu, S. Yu, R. Liu, C. Hsieh, K. Tang, Q. Li, and M. Chang. 2019. 24.5 A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning. In 2019 IEEE International Solid- State Circuits Conference - (ISSCC). 396\u2013398."},{"volume-title":"2020 IEEE International Solid- State Circuits Conference - (ISSCC). 246\u2013248","author":"Si X.","key":"e_1_3_2_1_46_1","unstructured":"X. Si , Y. Tu , W. Huanq , J. Su , P. Lu , J. Wang , T. Liu , S. Wu , R. Liu , Y. Chou , Z. Zhang , S. Sie , W. Wei , Y. Lo , T. Wen , T. Hsu , Y. Chen , W. Shih , C. Lo , R. Liu , C. Hsieh , K. Tang , N. Lien , W. Shih , Y. He , Q. Li , and M. Chang . 2020. 15.5 A 28nm 64Kb 6T SRAM Computing-in-Memory Macro with 8b MAC Operation for AI Edge Chips . In 2020 IEEE International Solid- State Circuits Conference - (ISSCC). 246\u2013248 . X. Si, Y. Tu, W. Huanq, J. Su, P. Lu, J. Wang, T. Liu, S. Wu, R. Liu, Y. Chou, Z. Zhang, S. Sie, W. Wei, Y. Lo, T. Wen, T. Hsu, Y. Chen, W. Shih, C. Lo, R. Liu, C. Hsieh, K. Tang, N. Lien, W. Shih, Y. He, Q. Li, and M. Chang. 2020. 15.5 A 28nm 64Kb 6T SRAM Computing-in-Memory Macro with 8b MAC Operation for AI Edge Chips. In 2020 IEEE International Solid- State Circuits Conference - (ISSCC). 246\u2013248."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240765.3240831"},{"volume-title":"Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations.","author":"Simonyan K.","key":"e_1_3_2_1_48_1","unstructured":"K. Simonyan and A. Zisserman . 2015 . Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations. K. Simonyan and A. Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSE.2007.44"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1970.5008902"},{"volume-title":"An In-DRAM Neural Network Processing Engine. In 2019 IEEE International Symposium on Circuits and Systems (ISCAS). 1\u20135.","author":"Sudarshan C.","key":"e_1_3_2_1_51_1","unstructured":"C. Sudarshan , J. Lappas , M.\u00a0 M. Ghaffar , V. Rybalkin , C. Weis , M. Jung , and N. Wehn . 2019 . An In-DRAM Neural Network Processing Engine. In 2019 IEEE International Symposium on Circuits and Systems (ISCAS). 1\u20135. C. Sudarshan, J. Lappas, M.\u00a0M. Ghaffar, V. Rybalkin, C. Weis, M. Jung, and N. Wehn. 2019. An In-DRAM Neural Network Processing Engine. In 2019 IEEE International Symposium on Circuits and Systems (ISCAS). 1\u20135."},{"volume-title":"2015 IEEE International Electron Devices Meeting (IEDM). 26","author":"Sung M.","key":"e_1_3_2_1_52_1","unstructured":"M. Sung , S. Jang , H. Lee , Y. Ji , J. Kang , T. Jung , T. Ahn , Y. Son , H. Kim , S. Lee , S. Lee , J. Lee , S. Baek , E. Doh , H. Cho , T. Jang , I. Jang , J. Han , K. Ko , Y. Lee , S. Shin , J. Yu , S. Cho , J. Han , D. Kang , J. Kim , J. Lee , K. Ban , S. Yeom , H. Nam , D. Lee , M. Jeong , B. Kwak , J. Park , K. Choi , S. Park , N. Kwak , and S. Hong . 2015. Gate-first high-k\/metal gate DRAM technology for low power and high performance products . In 2015 IEEE International Electron Devices Meeting (IEDM). 26 .6.1\u201326.6.4. M. Sung, S. Jang, H. Lee, Y. Ji, J. Kang, T. Jung, T. Ahn, Y. Son, H. Kim, S. Lee, S. Lee, J. Lee, S. Baek, E. Doh, H. Cho, T. Jang, I. Jang, J. Han, K. Ko, Y. Lee, S. Shin, J. Yu, S. Cho, J. Han, D. Kang, J. Kim, J. Lee, K. Ban, S. Yeom, H. Nam, D. Lee, M. Jeong, B. Kwak, J. Park, K. Choi, S. Park, N. Kwak, and S. Hong. 2015. Gate-first high-k\/metal gate DRAM technology for low power and high performance products. In 2015 IEEE International Electron Devices Meeting (IEDM). 26.6.1\u201326.6.4."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"crossref","unstructured":"C. Szegedy W. Liu Y. Jia P. Sermanet S. Reed D. Anguelov D. Erhan V. Vanhoucke and A. Rabinovich. 2015. Going Deeper with Convolutions. In Computer Vision and Pattern Recognition (CVPR). http:\/\/arxiv.org\/abs\/1409.4842  C. Szegedy W. Liu Y. Jia P. Sermanet S. Reed D. Anguelov D. Erhan V. Vanhoucke and A. Rabinovich. 2015. Going Deeper with Convolutions. In Computer Vision and Pattern Recognition (CVPR). http:\/\/arxiv.org\/abs\/1409.4842","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_2_1_54_1","unstructured":"TechInsights. 2014. TECHNOLOGY ROADMAP of DRAM for Three Major manufacturers: Samsung SK-Hynix and Micron. https:\/\/vdocuments.site\/technology-roadmap-of-dram-for-three-major-manufacturers-samsung-sk-hynix.html  TechInsights. 2014. TECHNOLOGY ROADMAP of DRAM for Three Major manufacturers: Samsung SK-Hynix and Micron. https:\/\/vdocuments.site\/technology-roadmap-of-dram-for-three-major-manufacturers-samsung-sk-hynix.html"},{"key":"e_1_3_2_1_55_1","unstructured":"TechInsights. 2017. Samsung 18 nm DRAMAnalysis. https:\/\/www.techinsights.com\/blog\/samsung-18-nm-dram-analysis  TechInsights. 2017. Samsung 18 nm DRAMAnalysis. https:\/\/www.techinsights.com\/blog\/samsung-18-nm-dram-analysis"},{"key":"e_1_3_2_1_56_1","unstructured":"TechInsights. 2018. Micron Technology MT43A4G40200NFA-S15 ES:A HMC Gen2 - Memory Functional Analysis. https:\/\/w2.techinsights.com\/l\/4202\/2019-08-28\/2hbr19\/4202\/248106\/Sample_Report_MFR_1810_802_Memory_Floorplan_Analysis.pdf.  TechInsights. 2018. Micron Technology MT43A4G40200NFA-S15 ES:A HMC Gen2 - Memory Functional Analysis. https:\/\/w2.techinsights.com\/l\/4202\/2019-08-28\/2hbr19\/4202\/248106\/Sample_Report_MFR_1810_802_Memory_Floorplan_Analysis.pdf."},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2235125"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10766-016-0473-y"},{"key":"e_1_3_2_1_59_1","volume-title":"Arithmetic Complexity of Computations","author":"Winograd S.","year":"1970","unstructured":"S. Winograd . 1980. Arithmetic Complexity of Computations . Society for Industrial and Applied Mathematics . https:\/\/doi.org\/10.1137\/1.978161 1970 364 arXiv:https:\/\/epubs.siam.org\/doi\/pdf\/10.1137\/1.9781611970364 10.1137\/1.9781611970364 S. Winograd. 1980. Arithmetic Complexity of Computations. Society for Industrial and Applied Mathematics. https:\/\/doi.org\/10.1137\/1.9781611970364 arXiv:https:\/\/epubs.siam.org\/doi\/pdf\/10.1137\/1.9781611970364"},{"volume-title":"2019 IEEE International Solid- State Circuits Conference - (ISSCC). 388\u2013390","author":"Xue C.","key":"e_1_3_2_1_60_1","unstructured":"C. Xue , W. Chen , J. Liu , J. Li , W. Lin , W. Lin , J. Wang , W. Wei , T. Chang , T. Chang , T. Huang , H. Kao , S. Wei , Y. Chiu , C. Lee , C. Lo , Y. King , C. Lin , R. Liu , C. Hsieh , K. Tang , and M. Chang . 2019. 24.1 A 1Mb Multibit ReRAM Computing-In-Memory Macro with 14.6ns Parallel MAC Computing Time for CNN Based AI Edge Processors . In 2019 IEEE International Solid- State Circuits Conference - (ISSCC). 388\u2013390 . C. Xue, W. Chen, J. Liu, J. Li, W. Lin, W. Lin, J. Wang, W. Wei, T. Chang, T. Chang, T. Huang, H. Kao, S. Wei, Y. Chiu, C. Lee, C. Lo, Y. King, C. Lin, R. Liu, C. Hsieh, K. Tang, and M. Chang. 2019. 24.1 A 1Mb Multibit ReRAM Computing-In-Memory Macro with 14.6ns Parallel MAC Computing Time for CNN Based AI Edge Processors. In 2019 IEEE International Solid- State Circuits Conference - (ISSCC). 388\u2013390."},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2019.2951363"},{"volume-title":"Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040)","author":"Kang Y.","key":"e_1_3_2_1_62_1","unstructured":"Y. Kang , W. Huang , S.-M. Yoo , D. Keen , Z. Ge , V. Lam , P. Pattnaik , and J. Torrellas . 1999. FlexRAM: toward an advanced intelligent memory system . In Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040) . 192\u2013201. Y. Kang, W. Huang, S.-M. Yoo, D. Keen, Z. Ge, V. Lam, P. Pattnaik, and J. Torrellas. 1999. FlexRAM: toward an advanced intelligent memory system. In Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040). 192\u2013201."},{"key":"e_1_3_2_1_63_1","volume-title":"WRA: A 2.2-to-6.3 TOPS Highly Unified Dynamically Reconfigurable Accelerator Using a Novel Winograd Decomposition Algorithm for Convolutional Neural Networks","author":"Yang C.","year":"2019","unstructured":"C. Yang , Y. Wang , X. Wang , and L. Geng . 2019 . WRA: A 2.2-to-6.3 TOPS Highly Unified Dynamically Reconfigurable Accelerator Using a Novel Winograd Decomposition Algorithm for Convolutional Neural Networks . IEEE Transactions on Circuits and Systems I: Regular Papers 66, 9(2019), 3480\u20133493. C. Yang, Y. Wang, X. Wang, and L. Geng. 2019. WRA: A 2.2-to-6.3 TOPS Highly Unified Dynamically Reconfigurable Accelerator Using a Novel Winograd Decomposition Algorithm for Convolutional Neural Networks. IEEE Transactions on Circuits and Systems I: Regular Papers 66, 9(2019), 3480\u20133493."},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600212.2600213"},{"volume-title":"2019 IEEE Asian Solid-State Circuits Conference (A-SSCC). 217\u2013218","author":"Zhang Z.","key":"e_1_3_2_1_65_1","unstructured":"Z. Zhang , J. Chen , X. Si , Y. Tu , J. Su , W. Huang , J. Wang , W. Wei , Y. Chiu , J. Hong , S. Sheu , S. Li , R. Liu , C. Hsieh , K. Tang , and M. Chang . 2019. A 55nm 1-to-8 bit Configurable 6T SRAM based Computing-in-Memory Unit-Macro for CNN-based AI Edge Processors . In 2019 IEEE Asian Solid-State Circuits Conference (A-SSCC). 217\u2013218 . Z. Zhang, J. Chen, X. Si, Y. Tu, J. Su, W. Huang, J. Wang, W. Wei, Y. Chiu, J. Hong, S. Sheu, S. Li, R. Liu, C. Hsieh, K. Tang, and M. Chang. 2019. A 55nm 1-to-8 bit Configurable 6T SRAM based Computing-in-Memory Unit-Macro for CNN-based AI Edge Processors. In 2019 IEEE Asian Solid-State Circuits Conference (A-SSCC). 217\u2013218."},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/1229175.1229176"},{"key":"e_1_3_2_1_67_1","unstructured":"S. Zhou Y. Wu Z. Ni X. Zhou H. Wen and Y. Zou. 2016. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. arxiv:1606.06160\u00a0[cs.NE]  S. Zhou Y. Wu Z. Ni X. Zhou H. Wen and Y. Zou. 2016. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. arxiv:1606.06160\u00a0[cs.NE]"}],"event":{"name":"MEMSYS 2020: The International Symposium on Memory Systems","acronym":"MEMSYS 2020","location":"Washington DC USA"},"container-title":["The International Symposium on Memory Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3422575.3422790","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3422575.3422790","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:55Z","timestamp":1750197715000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3422575.3422790"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,28]]},"references-count":68,"alternative-id":["10.1145\/3422575.3422790","10.1145\/3422575"],"URL":"https:\/\/doi.org\/10.1145\/3422575.3422790","relation":{},"subject":[],"published":{"date-parts":[[2020,9,28]]},"assertion":[{"value":"2021-03-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}