{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T11:31:49Z","timestamp":1763724709864,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,4,27]],"date-time":"2024-04-27T00:00:00Z","timestamp":1714176000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["NSFC.62222411"],"award-info":[{"award-number":["NSFC.62222411"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,4,27]]},"DOI":"10.1145\/3620665.3640359","type":"proceedings-article","created":{"date-parts":[[2024,4,22]],"date-time":"2024-04-22T14:18:06Z","timestamp":1713795486000},"page":"185-200","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9636-6792","authenticated-orcid":false,"given":"Songyun","family":"Qu","sequence":"first","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5175-7025","authenticated-orcid":false,"given":"Shixin","family":"Zhao","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0732-2267","authenticated-orcid":false,"given":"Bing","family":"Li","sequence":"additional","affiliation":[{"name":"Capital Normal University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3054-0617","authenticated-orcid":false,"given":"Yintao","family":"He","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9157-113X","authenticated-orcid":false,"given":"Xuyi","family":"Cai","sequence":"additional","affiliation":[{"name":"Institute of Computing Technology, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9711-8758","authenticated-orcid":false,"given":"Lei","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5172-4736","authenticated-orcid":false,"given":"Ying","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,4,27]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375","author":"Agarap Abien Fred","year":"2018","unstructured":"Abien Fred Agarap. Deep learning using rectified linear units (relu). 
arXiv preprint arXiv:1803.08375, 2018."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRC.2018.8638612"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISVLSI.2019.00044"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304049"},{"key":"e_1_3_2_1_5_1","volume-title":"et al. Onnx: Open neural network exchange. https:\/\/github.com\/onnx\/onnx","author":"Bai Junjie","year":"2019","unstructured":"Junjie Bai, Fang Lu, Ke Zhang, et al. Onnx: Open neural network exchange. https:\/\/github.com\/onnx\/onnx, 2019."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2018.2880918"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3520142"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2789723"},{"key":"e_1_3_2_1_9_1","volume-title":"Tvm: An automated end-to-end optimizing compiler for deep learning. arXiv preprint arXiv:1802.04799","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, and Luis Ceze. Tvm: An automated end-to-end optimizing compiler for deep learning. arXiv preprint arXiv:1802.04799, 2018."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3533251"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eng.2020.01.007"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.13"},{"key":"e_1_3_2_1_14_1","unstructured":"Scott Cyphers Arjun K. Bansal Anahita Bhiwandiwalla Jayaram Bobba Matthew Brookhart Avijit Chakraborty William Constable Christian Convey Leona Cook Omar Kanawi Robert Kimball Jason Knight Nikolay Korovaiko Varun Kumar Vijay Yixing Lao Christopher R. 
Lishka Jaikrishnan Menon Jennifer Myers Sandeep Aswath Narayana Adam Procter and Tristan J. Webb. Intel ngraph: An intermediate representation compiler and executor for deep learning. 2018."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2185930"},{"key":"e_1_3_2_1_16_1","volume-title":"An image is worth 16\u00d716 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, and Sylvain Gelly. An image is worth 16\u00d716 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020."},{"key":"e_1_3_2_1_17_1","volume-title":"IMPACT 2020-10th International Workshop on Polyhedral Compilation Techniques","author":"Drebes Andi","year":"2020","unstructured":"Andi Drebes, Lorenzo Chelini, Oleksandr Zinenko, Albert Cohen, Henk Corporaal, Tobias Grosser, Kanishkan Vadivel, and Nicolas Vasilache. Tc-cim: Empowering tensor comprehensions for computing-in-memory. In IMPACT 2020-10th International Workshop on Polyhedral Compilation Techniques, 2020."},{"key":"e_1_3_2_1_18_1","first-page":"383","volume-title":"Neural cache: Bit-serial in-cache acceleration of deep neural networks. In 2018 ACM\/IEEE 45th annual international symposium on computer architecture (ISCA)","author":"Eckert Charles","year":"2018","unstructured":"Charles Eckert, Xiaowei Wang, Jingcheng Wang, Arun Subramaniyan, Ravi Iyer, Dennis Sylvester, David Blaauw, and Reetuparna Das. Neural cache: Bit-serial in-cache acceleration of deep neural networks. In 2018 ACM\/IEEE 45th annual international symposium on computer architecture (ISCA), pages 383--396. 
IEEE, 2018."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3296957.3173171"},{"key":"e_1_3_2_1_20_1","first-page":"1","volume-title":"Automation & Test in Europe Conference & Exhibition (DATE)","author":"Gao Chengsi","year":"2023","unstructured":"Chengsi Gao, Ying Wang, Cheng Liu, Mengdi Wang, Weiwei Chen, Yinhe Han, and Lei Zhang. Layer-puzzle: Allocating and scheduling multi-task on multi-core npus by using layer heterogeneity. In 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 1--6. IEEE, 2023."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/IEDM.2017.8268341"},{"key":"e_1_3_2_1_22_1","volume-title":"Polyhedral-based compilation framework for in-memory neural network accelerators. ACM Journal on Emerging Technologies in Computing Systems (JETC), 18(1):1--23","author":"Han Jianhui","year":"2021","unstructured":"Jianhui Han, Xiang Fei, Zhaolin Li, and Youhui Zhang. Polyhedral-based compilation framework for in-memory neural network accelerators. 
ACM Journal on Emerging Technologies in Computing Systems (JETC), 18(1):1--23, 2021."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2018.2885574"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/DAC18074.2021.9586193"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3489517.3530446"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2021.3092759"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304048"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC42613.2021.9365788"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299874.3319452"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2020.3030548"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2016.12.038"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2018.2819190"},{"key":"e_1_3_2_1_34_1","volume-title":"Max 2: An reram-based neural network accelerator that maximizes data reuse and area utilization","author":"Mao Manqing","year":"2019","unstructured":"Manqing Mao, Xiaochen Peng, Rui Liu, Jingtao Li, Shimeng Yu, and Chaitali Chakrabarti. Max 2: An reram-based neural network accelerator that maximizes data reuse and area utilization. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 9(2):398--410, 2019."},{"key":"e_1_3_2_1_35_1","volume-title":"Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, and Luca Antiga. Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems, 32, 2019."},{"key":"e_1_3_2_1_36_1","article-title":"A coordinated model pruning and mapping framework for rram-based dnn accelerators","author":"Qu Songyun","year":"2022","unstructured":"Songyun Qu, Bing Li, Shixin Zhao, Lei Zhang, and Ying Wang. A coordinated model pruning and mapping framework for rram-based dnn accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022.","journal-title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems"},{"key":"e_1_3_2_1_37_1","volume-title":"Glow: graph lowering compiler techniques for neural networks. corr abs\/1805.00907","author":"Rotem Nadav","year":"2018","unstructured":"Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Summer Deng, Roman Dzhabarov, James Hegeman, Roman Levenstein, Bert Maher, Nadathur Satish, Jakob Olesen, et al. Glow: graph lowering compiler techniques for neural networks. corr abs\/1805.00907 (2018). arXiv preprint arXiv:1805.00907, 2018."},{"key":"e_1_3_2_1_38_1","volume-title":"Xla: Compiling machine learning for peak performance","author":"Sabne Amit","year":"2020","unstructured":"Amit Sabne. Xla: Compiling machine learning for peak performance. 2020."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.12"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2021.3101464"},{"key":"e_1_3_2_1_41_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556, 2014."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2017.55"},{"key":"e_1_3_2_1_43_1","first-page":"1423","volume-title":"Automation & Test in Europe Conference & Exhibition (DATE)","author":"Sun Xiaoyu","year":"2018","unstructured":"Xiaoyu Sun, Shihui Yin, Xiaochen Peng, Rui Liu, Jae-sun Seo, and Shimeng Yu. Xnor-rram: A scalable and parallel resistive synaptic architecture for binary neural networks. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 1423--1428. IEEE, 2018."},{"key":"e_1_3_2_1_44_1","volume-title":"Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions. arXiv preprint arXiv:1802.04730","author":"Vasilache Nicolas","year":"2018","unstructured":"Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S Moses, Sven Verdoolaege, Andrew Adams, and Albert Cohen. Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions. arXiv preprint arXiv:1802.04730, 2018."},{"key":"e_1_3_2_1_45_1","volume-title":"Hitting the memory wall: Implications of the obvious. ACM SIGARCH computer architecture news, 23(1):20--24","author":"Wulf Wm A","year":"1995","unstructured":"Wm A Wulf and Sally A McKee. Hitting the memory wall: Implications of the obvious. ACM SIGARCH computer architecture news, 23(1):20--24, 1995."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC42613.2021.9365769"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322271"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2019.2940649"},{"issue":"6","key":"e_1_3_2_1_49_1","first-page":"1733","article-title":"In-memory computing sram macro for binary\/ternary deep neural networks","volume":"55","author":"Yin Shihui","year":"2020","unstructured":"Shihui Yin, Zhewei Jiang, Jae-Sun Seo, and Mingoo Seok. 
Xnor-sram: In-memory computing sram macro for binary\/ternary deep neural networks. IEEE Journal of Solid-State Circuits, 55(6):1733--1743, 2020.","journal-title":"IEEE Journal of Solid-State Circuits"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA52012.2021.00029"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3316781.3317739"}],"event":{"name":"ASPLOS '24: 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","SIGOPS ACM Special Interest Group on Operating Systems","SIGPLAN ACM Special Interest Group on Programming Languages","SIGBED ACM Special Interest Group on Embedded Systems"],"location":"La Jolla CA USA","acronym":"ASPLOS '24"},"container-title":["Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3620665.3640359","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3620665.3640359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:29:27Z","timestamp":1750285767000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3620665.3640359"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,27]]},"references-count":51,"alternative-id":["10.1145\/3620665.3640359","10.1145\/3620665"],"URL":"https:\/\/doi.org\/10.1145\/3620665.3640359","relation":{},"subject":[],"published":{"date-parts":[[2024,4,27]]},"assertion":[{"value":"2024-04-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}