{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:27:35Z","timestamp":1750220855404,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":45,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,9,30]],"date-time":"2019-09-30T00:00:00Z","timestamp":1569801600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,9,30]]},"DOI":"10.1145\/3357526.3357536","type":"proceedings-article","created":{"date-parts":[[2019,11,6]],"date-time":"2019-11-06T14:25:56Z","timestamp":1573050356000},"page":"396-407","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["CASH"],"prefix":"10.1145","author":[{"given":"Anup","family":"Sarma","sequence":"first","affiliation":[{"name":"The Pennsylvania State University"}]},{"given":"Huaipan","family":"Jiang","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University"}]},{"given":"Ashutosh","family":"Pattnaik","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University"}]},{"given":"Jagadish","family":"Kotra","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University"}]},{"given":"Mahmut Taylan","family":"Kandemir","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University"}]},{"given":"Chita R.","family":"Das","sequence":"additional","affiliation":[{"name":"The Pennsylvania State University"}]}],"member":"320","published-online":{"date-parts":[[2019,9,30]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195664"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/774789.774805"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2644865.2541967"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2996864"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.58"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2016.2616357"},{"key":"e_1_3_2_1_8_1","volume-title":"Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory. In ACM SIGARCH Computer Architecture News","author":"Chi Ping","year":"2016","unstructured":"Ping Chi , Shuangchen Li , Cong Xu , Tao Zhang , Jishen Zhao , Yongpan Liu , Yu Wang , and Yuan Xie . 2016 . Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory. In ACM SIGARCH Computer Architecture News , Vol. 44 . IEEE Press , 27--39. Ping Chi, Shuangchen Li, Cong Xu, Tao Zhang, Jishen Zhao, Yongpan Liu, Yu Wang, and Yuan Xie. 2016. Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory. In ACM SIGARCH Computer Architecture News, Vol. 44. IEEE Press, 27--39."},{"key":"e_1_3_2_1_9_1","unstructured":"Inc. CISCO. [n.d.]. Cisco's Global Cloud Index. https:\/\/www.cisco.com\/c\/en\/us\/solutions\/collateral\/service-provider\/global-cloud-index-gci\/white-paper-c11-738085.html.  Inc. CISCO. [n.d.]. Cisco's Global Cloud Index. https:\/\/www.cisco.com\/c\/en\/us\/solutions\/collateral\/service-provider\/global-cloud-index-gci\/white-paper-c11-738085.html."},{"key":"e_1_3_2_1_10_1","volume-title":"Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709","author":"Das Dipankar","year":"2016","unstructured":"Dipankar Das , Sasikanth Avancha , Dheevatsa Mudigere , Karthikeyan Vaidynathan , Srinivas Sridharan , Dhiraj Kalamkar , Bharat Kaul , and Pradeep Dubey . 2016. Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709 ( 2016 ). Dipankar Das, Sasikanth Avancha, Dheevatsa Mudigere, Karthikeyan Vaidynathan, Srinivas Sridharan, Dhiraj Kalamkar, Bharat Kaul, and Pradeep Dubey. 2016. Distributed deep learning using synchronous stochastic gradient descent. arXiv preprint arXiv:1602.06709 (2016)."},{"key":"e_1_3_2_1_11_1","unstructured":"Jeffrey Dean Greg Corrado Rajat Monga Kai Chen Matthieu Devin Mark Mao Andrew Senior Paul Tucker Ke Yang Quoc V Le etal 2012. Large scale distributed deep networks. In Advances in neural information processing systems. 1223--1231.  Jeffrey Dean Greg Corrado Rajat Monga Kai Chen Matthieu Devin Mark Mao Andrew Senior Paul Tucker Ke Yang Quoc V Le et al. 2012. Large scale distributed deep networks. In Advances in neural information processing systems. 1223--1231."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750389"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2011.5981829"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2010.5456923"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3135974.3135993"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1133981.1134024"},{"key":"e_1_3_2_1_17_1","volume-title":"Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149","author":"Han Song","year":"2015","unstructured":"Song Han , Huizi Mao , and William J Dally . 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 ( 2015 ). Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015)."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173162.3173194"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2749472"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00059"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_38"},{"volume-title":"Computer architecture: a quantitative approach","author":"Hennessy John L","key":"e_1_3_2_1_22_1","unstructured":"John L Hennessy and David A Patterson . 2011. Computer architecture: a quantitative approach . Elsevier . John L Hennessy and David A Patterson. 2011. Computer architecture: a quantitative approach. Elsevier."},{"key":"e_1_3_2_1_23_1","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","volume":"1","author":"Huang Gao","unstructured":"Gao Huang , Zhuang Liu , Kilian Q Weinberger , and Laurens van der Maaten. 2017. Densely connected convolutional networks . In Proceedings of the IEEE conference on computer vision and pattern recognition , Vol. 1 . 3. Gao Huang, Zhuang Liu, Kilian Q Weinberger, and Laurens van der Maaten. 2017. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, Vol. 1. 3."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.284"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/SOCC.2018.8618537"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.41"},{"key":"e_1_3_2_1_28_1","volume-title":"One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997","author":"Krizhevsky Alex","year":"2014","unstructured":"Alex Krizhevsky . 2014. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 ( 2014 ). Alex Krizhevsky. 2014. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 (2014)."},{"key":"e_1_3_2_1_29_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105.  Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105."},{"key":"e_1_3_2_1_30_1","volume-title":"Phase-change technology and the future of main memory","author":"Lee Benjamin C","year":"2010","unstructured":"Benjamin C Lee , Ping Zhou , Jun Yang , Youtao Zhang , Bo Zhao , Engin Ipek , Onur Mutlu , and Doug Burger . 2010. Phase-change technology and the future of main memory . IEEE micro 30, 1 ( 2010 ), 143--143. Benjamin C Lee, Ping Zhou, Jun Yang, Youtao Zhang, Bo Zhao, Engin Ipek, Onur Mutlu, and Doug Burger. 2010. Phase-change technology and the future of main memory. IEEE micro 30, 1 (2010), 143--143."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694358"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-96983-1_19"},{"key":"e_1_3_2_1_33_1","volume-title":"WRPN: wide reduced-precision networks. arXiv preprint arXiv:1709.01134","author":"Mishra Asit","year":"2017","unstructured":"Asit Mishra , Eriko Nurvitadhi , Jeffrey J Cook , and Debbie Marr . 2017. WRPN: wide reduced-precision networks. arXiv preprint arXiv:1709.01134 ( 2017 ). Asit Mishra, Eriko Nurvitadhi, Jeffrey J Cook, and Debbie Marr. 2017. WRPN: wide reduced-precision networks. arXiv preprint arXiv:1709.01134 (2017)."},{"volume-title":"Advanced Compiler Design and Implementation","author":"Muchnick Steven","key":"e_1_3_2_1_34_1","unstructured":"Steven Muchnick . 1997. Advanced Compiler Design and Implementation . Morgan Kaufmann Publishers . Steven Muchnick. 1997. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers."},{"key":"e_1_3_2_1_35_1","unstructured":"Jongsoo Park Maxim Naumov Protonu Basu Summer Deng Aravind Kalaiah Daya Khudia James Law Parth Malani Andrey Malevich Satish Nadathur etal 2018. Deep Learning Inference in Facebook Data Centers: Characterization Performance Optimizations and Hardware Implications. arXiv preprint arXiv:1811.09886 (2018).  Jongsoo Park Maxim Naumov Protonu Basu Summer Deng Aravind Kalaiah Daya Khudia James Law Parth Malani Andrey Malevich Satish Nadathur et al. 2018. Deep Learning Inference in Facebook Data Centers: Characterization Performance Optimizations and Hardware Implications. arXiv preprint arXiv:1811.09886 (2018)."},{"key":"e_1_3_2_1_36_1","volume-title":"Memory-efficient implementation of densenets. arXiv preprint arXiv:1707.06990","author":"Pleiss Geoff","year":"2017","unstructured":"Geoff Pleiss , Danlu Chen , Gao Huang , Tongcheng Li , Laurens van der Maaten , and Kilian Q Weinberger . 2017. Memory-efficient implementation of densenets. arXiv preprint arXiv:1707.06990 ( 2017 ). Geoff Pleiss, Danlu Chen, Gao Huang, Tongcheng Li, Laurens van der Maaten, and Kilian Q Weinberger. 2017. Memory-efficient implementation of densenets. arXiv preprint arXiv:1707.06990 (2017)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/3195638.3195660"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ReTIS.2011.6146863"},{"key":"e_1_3_2_1_39_1","volume-title":"Cooperative Cache Scrubbing. ACM International Conference on Parallel Architecture and Compiler Techniques (PACT).","author":"Sartor Jennifer B.","year":"2014","unstructured":"Jennifer B. Sartor , Wim Heirman , Stephen M. Blackburn , Lieven Eeckhout , and Kathryn S McKinley . 2014 . Cooperative Cache Scrubbing. ACM International Conference on Parallel Architecture and Compiler Techniques (PACT). Jennifer B. Sartor, Wim Heirman, Stephen M. Blackburn, Lieven Eeckhout, and Kathryn S McKinley. 2014. Cooperative Cache Scrubbing. ACM International Conference on Parallel Architecture and Compiler Techniques (PACT)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001139"},{"key":"e_1_3_2_1_41_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 ( 2014 ). Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Christian Szegedy Wei Liu Yangqing Jia Pierre Sermanet Scott Reed Dragomir Anguelov Dumitru Erhan Vincent Vanhoucke Andrew Rabinovich etal 2015. Going deeper with convolutions. CVPR.  Christian Szegedy Wei Liu Yangqing Jia Pierre Sermanet Scott Reed Dragomir Anguelov Dumitru Erhan Vincent Vanhoucke Andrew Rabinovich et al. 2015. Going deeper with convolutions. CVPR.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080215"},{"key":"e_1_3_2_1_44_1","volume-title":"Deepcpu: Serving rnn-based deep learning models 10x faster. In 2018 {USENIX} Annual Technical Conference ({USENIX}{ATC} 18). 951--965.","author":"Zhang Minjia","year":"2018","unstructured":"Minjia Zhang , Samyam Rajbhandari , Wenhan Wang , and Yuxiong He . 2018 . Deepcpu: Serving rnn-based deep learning models 10x faster. In 2018 {USENIX} Annual Technical Conference ({USENIX}{ATC} 18). 951--965. Minjia Zhang, Samyam Rajbhandari, Wenhan Wang, and Yuxiong He. 2018. Deepcpu: Serving rnn-based deep learning models 10x faster. In 2018 {USENIX} Annual Technical Conference ({USENIX}{ATC} 18). 951--965."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2016.119"}],"event":{"name":"MEMSYS '19: The International Symposium on Memory Systems","acronym":"MEMSYS '19","location":"Washington District of Columbia USA"},"container-title":["Proceedings of the International Symposium on Memory Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3357526.3357536","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3357526.3357536","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:23:22Z","timestamp":1750202602000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3357526.3357536"}},"subtitle":["compiler assisted hardware design for improving DRAM energy efficiency in CNN inference"],"short-title":[],"issued":{"date-parts":[[2019,9,30]]},"references-count":45,"alternative-id":["10.1145\/3357526.3357536","10.1145\/3357526"],"URL":"https:\/\/doi.org\/10.1145\/3357526.3357536","relation":{},"subject":[],"published":{"date-parts":[[2019,9,30]]},"assertion":[{"value":"2019-09-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}