{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T01:00:19Z","timestamp":1770339619315,"version":"3.49.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","license":[{"start":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T00:00:00Z","timestamp":1694217600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"RIE2020 Industry Alignment Fund \u2013 Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner, HP Inc.","award":["I1801E0028"],"award-info":[{"award-number":["I1801E0028"]}]},{"name":"Nanyang Technological University, Singapore, under its NAP","award":["M4082282\/04INS000515C130"],"award-info":[{"award-number":["M4082282\/04INS000515C130"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,10,31]]},"abstract":"<jats:p>Crossbar-based In-Memory Processing (IMP) accelerators have been widely adopted to achieve high-speed and low-power computing, especially for deep neural network (DNN) models with numerous weights and high computational complexity. However, the floating-point (FP) arithmetic is not compatible with crossbar architectures. Also, redundant weights of current DNN models occupy too many crossbars, limiting the efficiency of crossbar accelerators. Meanwhile, due to the inherent non-ideal behavior of crossbar devices, like write variations, pre-trained DNN models suffer from accuracy degradation when it is deployed on a crossbar-based IMP accelerator for inference. 
Although some approaches have been proposed to address these issues, they often fail to consider the interaction among these issues, and introduce significant hardware overhead for solving each issue. To deploy complex models on IMP accelerators, we should compact the model and mitigate the influence of device non-ideal behaviors without introducing significant overhead from each technique.<\/jats:p>\n          <jats:p>In this paper, we first propose to reuse bit-shift units in crossbars for approximately multiplying scaling factors in our quantization scheme to avoid using FP processors. Second, we propose to apply kernel-group pruning and crossbar pruning to eliminate the hardware units for data alignment. We also design a zerorize-recover training process for our pruning method to achieve higher accuracy. Third, we adopt the runtime-aware non-ideality adaptation with a self-compensation scheme to relieve the impact of non-ideality by exploiting the features of crossbars. Finally, we integrate these three optimization procedures into one training process to form a comprehensive learning framework for co-optimization, which can achieve higher accuracy. The experimental results indicate that our comprehensive learning framework can obtain significant improvements over the original model when inferring on the crossbar-based IMP accelerator, with an average reduction of computing power and computing area by 100.02\u00d7 and 17.37\u00d7, respectively. 
Furthermore, we can obtain totally integer-only, pruned, and reliable VGG-16 and ResNet-56 models for the Cifar-10 dataset on IMP accelerators, with accuracy drops of only 2.19% and 1.26%, respectively, without any hardware overhead.<\/jats:p>","DOI":"10.1145\/3609115","type":"journal-article","created":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T13:33:18Z","timestamp":1694266398000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["CRIMP:\n            <u>C<\/u>\n            ompact &amp;\n            <u>R<\/u>\n            eliable DNN Inference on\n            <u>I<\/u>\n            n-\n            <u>M<\/u>\n            emory\n            <u>P<\/u>\n            rocessing via Crossbar-Aligned Compression and Non-ideality Adaptation"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4744-304X","authenticated-orcid":false,"given":"Shuo","family":"Huai","sequence":"first","affiliation":[{"name":"Nanyang Technological University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1378-0056","authenticated-orcid":false,"given":"Hao","family":"Kong","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0758-2248","authenticated-orcid":false,"given":"Xiangzhong","family":"Luo","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6875-5691","authenticated-orcid":false,"given":"Shiqing","family":"Li","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7118-0796","authenticated-orcid":false,"given":"Ravi","family":"Subramaniam","sequence":"additional","affiliation":[{"name":"HP Inc., United 
States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7304-419X","authenticated-orcid":false,"given":"Christian","family":"Makaya","sequence":"additional","affiliation":[{"name":"HP Inc., United States"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-1014-3285","authenticated-orcid":false,"given":"Qian","family":"Lin","sequence":"additional","affiliation":[{"name":"HP Inc., United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9348-4662","authenticated-orcid":false,"given":"Weichen","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2023,9,9]]},"reference":[{"issue":"1","key":"e_1_3_2_2_2","doi-asserted-by":"crossref","first-page":"89","DOI":"10.25130\/tjps.v28i1.1270","article-title":"Performance refinement of convolutional neural network architectures for solving big data problems","volume":"28","author":"Aljaloud Saud","year":"2023","unstructured":"Saud Aljaloud. 2023. Performance refinement of convolutional neural network architectures for solving big data problems. Tikrit Journal of Pure Science 28, 1 (2023), 89\u201395.","journal-title":"Tikrit Journal of Pure Science"},{"key":"e_1_3_2_3_2","first-page":"3","volume-title":"Esann","author":"Anguita Davide","year":"2013","unstructured":"Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, Jorge Luis Reyes-Ortiz, et\u00a0al. 2013. A public domain dataset for human activity recognition using smartphones.. In Esann, Vol. 3. 
3."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/JXCDC.2020.2987605"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2014.12"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2789723"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001140"},{"key":"e_1_3_2_8_2","first-page":"348","article-title":"Accurate and efficient 2-bit quantized neural networks","volume":"1","author":"Choi Jungwook","year":"2019","unstructured":"Jungwook Choi, Swagath Venkataramani, Vijayalakshmi Viji Srinivasan, Kailash Gopalakrishnan, Zhuo Wang, and Pierce Chuang. 2019. Accurate and efficient 2-bit quantized neural networks. Proceedings of Machine Learning and Systems 1 (2019), 348\u2013359.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_9_2","first-page":"1","volume-title":"2020 57th ACM\/IEEE Design Automation Conference (DAC)","author":"Chu Chaoqun","year":"2020","unstructured":"Chaoqun Chu, Yanzhi Wang, Yilong Zhao, Xiaolong Ma, Shaokai Ye, Yunyan Hong, Xiaoyao Liang, Yinhe Han, and Li Jiang. 2020. PIM-prune: Fine-grain DCNN pruning for crossbar-based process-in-memory architecture. In 2020 57th ACM\/IEEE Design Automation Conference (DAC). IEEE, 1\u20136."},{"key":"e_1_3_2_10_2","unstructured":"PyTorch Examples Contributors. 2022. PyTorch Examples. https:\/\/github.com\/pytorch\/examples\/tree\/main\/imagenet[Online; accessed 28-May-2023]."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2211477"},{"key":"e_1_3_2_12_2","first-page":"arXiv\u20132211","article-title":"CorrectNet: Robustness enhancement of analog in-memory computing for neural networks by error suppression and compensation","author":"Eldebiky Amro","year":"2022","unstructured":"Amro Eldebiky, Grace Li Zhang, Georg Boecherer, Bing Li, and Ulf Schlichtmann. 2022. 
CorrectNet: Robustness enhancement of analog in-memory computing for neural networks by error suppression and compensation. arXiv e-prints (2022), arXiv\u20132211.","journal-title":"arXiv e-prints"},{"key":"e_1_3_2_13_2","first-page":"1","volume-title":"2019 International Joint Conference on Neural Networks (IJCNN)","author":"Ensan Sina Sayyah","year":"2019","unstructured":"Sina Sayyah Ensan and Swaroop Ghosh. 2019. FPCAS: In-memory floating point computations for autonomous systems. In 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 1\u20138."},{"key":"e_1_3_2_14_2","first-page":"246","volume-title":"Proceedings of the 28th Asia and South Pacific Design Automation Conference","author":"He Jingyu","year":"2023","unstructured":"Jingyu He, Yucong Huang, Miguel Lastras, Terry Tao Ye, Chi-Ying Tsui, and Kwang-Ting Cheng. 2023. RVComp: Analog variation compensation for RRAM-based in-memory computing. In Proceedings of the 28th Asia and South Pacific Design Automation Conference. 246\u2013251."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_16_2","volume-title":"SSDM","author":"Hsu K. C.","year":"2015","unstructured":"K. C. Hsu, Feng-Min Lee, Y. Y. Lin, E. K. Lai, J. Y. Wu, D. Y. Lee, Min-Hee Lee, H. L. Lung, K. Y. Hsieh, and C. Y. Lu. 2015. A study of array resistance distribution and a novel operation algorithm for WOx ReRAM memory. In SSDM."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2022.12.021"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3566097.3567856"},{"issue":"1","key":"e_1_3_2_19_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3533706","article-title":"Rescuing ReRAM-based neural computing systems from device variation","volume":"28","author":"Huang Chenglong","year":"2022","unstructured":"Chenglong Huang, Nuo Xu, Junwei Zeng, Wenqing Wang, Yihong Hu, Liang Fang, Desheng Ma, and Yanting Chen. 2022. 
Rescuing ReRAM-based neural computing systems from device variation. ACM Transactions on Design Automation of Electronic Systems 28, 1 (2022), 1\u201317.","journal-title":"ACM Transactions on Design Automation of Electronic Systems"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00286"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"1733","DOI":"10.23919\/DATE51398.2021.9474226","volume-title":"2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)","author":"Jung Giju","year":"2021","unstructured":"Giju Jung, Mohammed Fouda, Sugil Lee, Jongeun Lee, Ahmed Eltawil, and Fadi Kurdahi. 2021. Cost-and dataset-free stuck-at fault mitigation for ReRAM-based deep learning accelerators. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1733\u20131738."},{"issue":"94720","key":"e_1_3_2_22_2","first-page":"11","article-title":"IEEE standard 754 for binary floating-point arithmetic","author":"Kahan William","year":"1996","unstructured":"William Kahan. 1996. IEEE standard 754 for binary floating-point arithmetic. Lecture Notes on the Status of IEEE94720-1776 (1996), 11.","journal-title":"Lecture Notes on the Status of IEEE"},{"key":"e_1_3_2_23_2","first-page":"1","volume-title":"2019 International Joint Conference on Neural Networks (IJCNN)","author":"Klachko Michael","year":"2019","unstructured":"Michael Klachko, Mohammad Reza Mahmoodi, and Dmitri Strukov. 2019. Improving noise tolerance of mixed-signal neural networks. In 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 1\u20138."},{"key":"e_1_3_2_24_2","first-page":"1","volume-title":"Proceedings of the 41st IEEE\/ACM International Conference on Computer-Aided Design","author":"Kong Hao","year":"2022","unstructured":"Hao Kong, Di Liu, Shuo Huai, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam, Christian Makaya, and Qian Lin. 2022. 
Smart scissor: Coupling spatial redundancy reduction and CNN compression for embedded hardware. In Proceedings of the 41st IEEE\/ACM International Conference on Computer-Aided Design. 1\u20139."},{"key":"e_1_3_2_25_2","unstructured":"Alex Krizhevsky, Geoffrey Hinton, et\u00a0al. 2009. Learning multiple layers of features from tiny images. (2009)."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_27_2","first-page":"91","volume-title":"2019 IEEE 37th International Conference on Computer Design (ICCD)","author":"Li Wen","year":"2019","unstructured":"Wen Li, Ying Wang, Huawei Li, and Xiaowei Li. 2019. RRAMedy: Protecting ReRAM-based neural network from permanent and soft faults during its lifetime. In 2019 IEEE 37th International Conference on Computer Design (ICCD). IEEE, 91\u201399."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2874823"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3287624.3287715"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062310"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3370748.3406581"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.298"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.298"},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","first-page":"1769","DOI":"10.23919\/DATE.2019.8715178","volume-title":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","author":"Long Yun","year":"2019","unstructured":"Yun Long, Xueyuan She, and Saibal Mukhopadhyay. 2019. Design of reliable DNN accelerator with un-reliable ReRAM. In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE). 
IEEE, 1769\u20131774."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2017.241"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSII.2021.3069011"},{"key":"e_1_3_2_37_2","doi-asserted-by":"crossref","first-page":"1078","DOI":"10.23919\/DATE51398.2021.9474179","volume-title":"2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)","author":"Meng Ziqi","year":"2021","unstructured":"Ziqi Meng, Weikang Qian, Yilong Zhao, Yanan Sun, Rui Yang, and Li Jiang. 2021. Digital offset for RRAM-based neuromorphic computing: A novel solution to conquer cycle-to-cycle variation. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1078\u20131083."},{"key":"e_1_3_2_38_2","first-page":"26","volume-title":"Informatics","author":"Rueda Fernando Moya","year":"2018","unstructured":"Fernando Moya Rueda, Ren\u00e9 Grzeszick, Gernot A. Fink, Sascha Feldhorst, and Michael Ten Hompel. 2018. Convolutional neural networks for human activity recognition using body-worn sensors. In Informatics, Vol. 5. MDPI, 26."},{"key":"e_1_3_2_39_2","first-page":"28","article-title":"CACTI 6.0: A tool to model large caches","volume":"27","author":"Muralimanohar Naveen","year":"2009","unstructured":"Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P. Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP Laboratories 27 (2009), 28.","journal-title":"HP Laboratories"},{"key":"e_1_3_2_40_2","volume-title":"Neural Networks and Deep Learning","author":"Nielsen Michael A.","year":"2015","unstructured":"Michael A. Nielsen. 2015. Neural Networks and Deep Learning. Vol. 25. 
Determination Press, San Francisco, CA, USA."},{"key":"e_1_3_2_41_2","first-page":"8024","volume-title":"NIPS","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In NIPS. 8024\u20138035."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001139"},{"key":"e_1_3_2_44_2","first-page":"18335","article-title":"Fast certified robust training with short warmup","volume":"34","author":"Shi Zhouxing","year":"2021","unstructured":"Zhouxing Shi, Yihan Wang, Huan Zhang, Jinfeng Yi, and Cho-Jui Hsieh. 2021. Fast certified robust training with short warmup. Advances in Neural Information Processing Systems 34 (2021), 18335\u201318349.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_45_2","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556 (2014).","journal-title":"arXiv preprint arXiv:1409.1556"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3061639.3062248"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSSC.2016.2546199"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609115","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3609115","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:58Z","timestamp":1750182538000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609115"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,9]]},"references-count":46,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2023,10,31]]}},"alternative-id":["10.1145\/3609115"],"URL":"https:\/\/doi.org\/10.1145\/3609115","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,9]]},"assertion":[{"value":"2023-03-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-13","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-09-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}