{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,12]],"date-time":"2025-07-12T01:17:41Z","timestamp":1752283061338,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2015,8,3]],"date-time":"2015-08-03T00:00:00Z","timestamp":1438560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100002418","name":"Intel Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002418","id-type":"DOI","asserted-by":"publisher"}]},{"name":"STARnet"},{"name":"GRC"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Emerg. Technol. Comput. Syst."],"published-print":{"date-parts":[[2015,8,3]]},"abstract":"<jats:p>Spintronic memories are considered to be promising candidates for future on-chip memories due to their high density, nonvolatility, and near-zero leakage. However, they also face challenges such as high write energy and latency and limited read speed due to single-ended sensing. Further, the conflicting requirements of read and write operations lead to stringent design constraints that severely compromise their benefits.<\/jats:p>\n          <jats:p>Recently, domain wall memory was proposed as a spintronic memory that has the potential for very high density by storing multiple bits in the domains of a ferromagnetic nanowire. While reliable operation of DWM memory with multiple domains faces many challenges, single-bit cells that utilize domain wall motion for writes have been experimentally demonstrated [Fukami et al. 2009]. This bit-cell, which we refer to as Domain Wall Memory with Shift-based Write (DWM-SW), achieves improved write efficiency and features decoupled read-write paths, enabling independent optimizations of read and write operations. 
However, these benefits are achieved at the cost of sacrificing the original goal of improved density. In this work, we explore multilevel storage as a new direction to enhance the density benefits of DWM-SW. At the device level, we propose a new device--multilevel DWM with shift-based write (ML-DWM-SW)--that is capable of storing 2 bits in a single device. At the circuit level, we propose an ML-DWM-SW-based bit-cell design and layout. The ML-DWM-SW bit-cell incurs no additional area overhead compared to the DWM-SW bit-cell despite storing an additional bit, thereby achieving roughly twice the density. However, it requires a two-step write operation and has data-dependent read and write energies, which pose unique challenges. To address these issues, we propose suitable architectural optimizations: (i) intra-word interleaving and (ii) bit encoding. We design \u201call-spin\u201d cache architectures using the proposed ML-DWM-SW bit-cell for both general-purpose processors and general-purpose graphics processing units (GPGPUs). We perform an iso-capacity replacement of SRAM with spintronic memories and study the energy and area benefits at iso-performance conditions. For general-purpose processors, the ML-DWM-SW cache achieves 10X reduction in energy and 4.4X reduction in cache area compared to an SRAM cache and 2X and 1.7X reduction in energy and area, respectively, compared to an STT-MRAM cache. 
For GPGPUs, the ML-DWM-SW cache achieves 5.3X reduction in energy and 3.6X area reduction compared to SRAM and 3.5X energy reduction and 1.9X area reduction compared to STT-MRAM.<\/jats:p>","DOI":"10.1145\/2723165","type":"journal-article","created":{"date-parts":[[2015,8,4]],"date-time":"2015-08-04T13:57:39Z","timestamp":1438696659000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Energy-Efficient All-Spin Cache Hierarchy Using Shift-Based Writes and Multilevel Storage"],"prefix":"10.1145","volume":"12","author":[{"given":"Rangharajan","family":"Venkatesan","sequence":"first","affiliation":[{"name":"Purdue University"}]},{"given":"Mrigank","family":"Sharad","sequence":"additional","affiliation":[{"name":"Purdue University"}]},{"given":"Kaushik","family":"Roy","sequence":"additional","affiliation":[{"name":"Purdue University"}]},{"given":"Anand","family":"Raghunathan","sequence":"additional","affiliation":[{"name":"Purdue University"}]}],"member":"320","published-online":{"date-parts":[[2015,8,3]]},"reference":[{"volume-title":"Proceedings of the International Electron Devices Meeting. 17","author":"Augustine C.","key":"e_1_2_1_1_1","unstructured":"C. Augustine , A. Raychowdhury , B. Behin-Aein , S. Srinivasan , J. Tschanz , V. K. De , and K. Roy . 2011. Numerical analysis of domain wall propagation for dense memory arrays . In Proceedings of the International Electron Devices Meeting. 17 .6.1--17.6.4. C. Augustine, A. Raychowdhury, B. Behin-Aein, S. Srinivasan, J. Tschanz, V. K. De, and K. Roy. 2011. Numerical analysis of domain wall propagation for dense memory arrays. In Proceedings of the International Electron Devices Meeting. 17.6.1--17.6.4."},{"volume-title":"Proceedings of the International Electron Devices Meeting. 22","author":"Augustine C.","key":"e_1_2_1_2_1","unstructured":"C. Augustine , A. Raychowdhury , D. Somasekhar , J. Tschanz , K. Roy , and V. K. De . 2010. 
Numerical analysis of typical STT-MTJ stacks for 1T-1R memory arrays . In Proceedings of the International Electron Devices Meeting. 22 .7.1--22.7.4. C. Augustine, A. Raychowdhury, D. Somasekhar, J. Tschanz, K. Roy, and V. K. De. 2010. Numerical analysis of typical STT-MTJ stacks for 1T-1R memory arrays. In Proceedings of the International Electron Devices Meeting. 22.7.1--22.7.4."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982917"},{"volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 163--174","author":"Bakhoda A.","key":"e_1_2_1_4_1","unstructured":"A. Bakhoda , G. L. Yuan , W. W. L. Fung , H. Wong , and T. M. Aamodt . 2009. Analyzing CUDA workloads using a detailed GPU simulator . In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 163--174 . A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt. 2009. Analyzing CUDA workloads using a detailed GPU simulator. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 163--174."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2010.2066530"},{"volume-title":"Proceedings of the IEEE International Conference on Circuits and Systems. 1--4.","author":"Bhumireddy V. R.","key":"e_1_2_1_6_1","unstructured":"V. R. Bhumireddy , K. A. Shaik , A. Amara , S. Sen , C. D. Parikh , D. Nagchoudhuri , and A. Ioinovici . 2013. Design of low power and high speed comparator with sub-32-nm Double Gate-MOSFET . In Proceedings of the IEEE International Conference on Circuits and Systems. 1--4. V. R. Bhumireddy, K. A. Shaik, A. Amara, S. Sen, C. D. Parikh, D. Nagchoudhuri, and A. Ioinovici. 2013. Design of low power and high speed comparator with sub-32-nm Double Gate-MOSFET. In Proceedings of the IEEE International Conference on Circuits and Systems. 1--4."},{"key":"e_1_2_1_7_1","unstructured":"CACTI. 
http:\/\/www.hpl.hp.com\/research\/cacti\/.  CACTI. http:\/\/www.hpl.hp.com\/research\/cacti\/."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MWSCAS.2010.5548848"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/2016802.2016826"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"D. Chiba G. Yamada T. Koyama K. Ueda H. Tanigawa S. Fukami T. Suzuki N. Ohshima N. Ishiwata Y. Nakatani and T. Ono. 2010. Control of multiple magnetic domain walls by current in a Co\/Ni nano-wire. Appl. Phys. Exp. 3 073004 1--3.  D. Chiba G. Yamada T. Koyama K. Ueda H. Tanigawa S. Fukami T. Suzuki N. Ohshima N. Ishiwata Y. Nakatani and T. Ono. 2010. Control of multiple magnetic domain walls by current in a Co\/Ni nano-wire. Appl. Phys. Exp. 3 073004 1--3.","DOI":"10.1143\/APEX.3.073004"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Asia and South Pacific Design Automation Conference. 684--691","author":"Fukami S.","year":"2014","unstructured":"S. Fukami , H. Sato , M. Yamanouchi , S. Ikeda , F. Matsukura , and H. Ohno . 2014. Advances in spintronics devices for microelectronics: From spin-transfer torque to spin-orbit torque . In Proceedings of the Asia and South Pacific Design Automation Conference. 684--691 . DOI:http:\/\/dx.doi.org\/10.1109\/ASPIEEE. 2014 .6742970 10.1109\/ASPIEEE S. Fukami, H. Sato, M. Yamanouchi, S. Ikeda, F. Matsukura, and H. Ohno. 2014. Advances in spintronics devices for microelectronics: From spin-transfer torque to spin-orbit torque. In Proceedings of the Asia and South Pacific Design Automation Conference. 684--691. DOI:http:\/\/dx.doi.org\/10.1109\/ASPIEEE. 2014.6742970"},{"volume-title":"Proceedings of the IEEE Symposium on VLSI Technology. 230--231","author":"Fukami S.","key":"e_1_2_1_13_1","unstructured":"S. Fukami , T. Suzuki , K. Nagahara , N. Ohshima , Y. Ozaki , S. Saito , R. Nebashi , N. 
Sakimura , H. Honjo , K. Mori , C. Igarashi , S. Miura , N. Ishiwata , and T. Sugibayashi . 2009. Low-current perpendicular domain wall motion cell for scalable high-speed MRAM . In Proceedings of the IEEE Symposium on VLSI Technology. 230--231 . S. Fukami, T. Suzuki, K. Nagahara, N. Ohshima, Y. Ozaki, S. Saito, R. Nebashi, N. Sakimura, H. Honjo, K. Mori, C. Igarashi, S. Miura, N. Ishiwata, and T. Sugibayashi. 2009. Low-current perpendicular domain wall motion cell for scalable high-speed MRAM. In Proceedings of the IEEE Symposium on VLSI Technology. 230--231."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1021\/cr900056b"},{"volume-title":"Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1455--1458","author":"Gupta S. K.","key":"e_1_2_1_15_1","unstructured":"S. K. Gupta , S. P. Park , N. N. Mojumder , and K. Roy . 2012. Layout-aware optimization of STT MRAMs . In Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1455--1458 . S. K. Gupta, S. P. Park, N. N. Mojumder, and K. Roy. 2012. Layout-aware optimization of STT MRAMs. In Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1455--1458."},{"volume-title":"Proceedings of the IEEE Symposium on VLSI Technology. 47--48","author":"Ishigaki T.","key":"e_1_2_1_16_1","unstructured":"T. Ishigaki , T. Kawahara , R. Takemura , K. Ono , K. Ito , H. Matsuoka , and H. Ohno . 2010. A multi-level cell spin-transfer torque memory with series-stacked magnetotunnel junctions . In Proceedings of the IEEE Symposium on VLSI Technology. 47--48 . T. Ishigaki, T. Kawahara, R. Takemura, K. Ono, K. Ito, H. Matsuoka, and H. Ohno. 2010. A multi-level cell spin-transfer torque memory with series-stacked magnetotunnel junctions. In Proceedings of the IEEE Symposium on VLSI Technology. 47--48."},{"volume-title":"Proceedings of the International Symposium on Low-Power Electronics and Design. 
79--84","author":"Jadidi A.","key":"e_1_2_1_17_1","unstructured":"A. Jadidi , M. Arjomand , and H. S. Azad . 2011. High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement . In Proceedings of the International Symposium on Low-Power Electronics and Design. 79--84 . A. Jadidi, M. Arjomand, and H. S. Azad. 2011. High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement. In Proceedings of the International Symposium on Low-Power Electronics and Design. 79--84."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228521"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228406"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333660.2333664"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMAG.2010.2075920"},{"volume-title":"Proceedings of the Asia and South Pacific Design Automation Conference. 841--846","author":"Li J.","key":"e_1_2_1_22_1","unstructured":"J. Li , P. Ndai , A. Goel , H. Liu , and K. Roy . 2009. An alternate design paradigm for robust Spin-Torque Transfer Magnetic RAM (STT MRAM) from circuit\/architecture perspective . In Proceedings of the Asia and South Pacific Design Automation Conference. 841--846 . J. Li, P. Ndai, A. Goel, H. Liu, and K. Roy. 2009. An alternate design paradigm for robust Spin-Torque Transfer Magnetic RAM (STT MRAM) from circuit\/architecture perspective. In Proceedings of the Asia and South Pacific Design Automation Conference. 
841--846."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1063\/1.1711168"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1063\/1.3049617"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMAG.2010.2043069"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TED.2011.2116024"},{"key":"e_1_2_1_27_1","article-title":"Switching current reduction and thermally induced delay spread compression in tilted magnetic anisotropy spin-transfer torque (STT) MRAM","author":"Mojumder N. N.","year":"2011","unstructured":"N. N. Mojumder and K. Roy . 2011 . Switching current reduction and thermally induced delay spread compression in tilted magnetic anisotropy spin-transfer torque (STT) MRAM . IEEE Trans. Magnetics. N. N. Mojumder and K. Roy. 2011. Switching current reduction and thermally induced delay spread compression in tilted magnetic anisotropy spin-transfer torque (STT) MRAM. IEEE Trans. Magnetics.","journal-title":"IEEE Trans. Magnetics."},{"key":"e_1_2_1_28_1","first-page":"9","article-title":"Direct Observation of Domain Wall Motion Induced by Low-Current Density in TbFeCo","volume":"4","author":"Ngo Duc-The","year":"2011","unstructured":"Duc-The Ngo , Kotato Ikeda , and Hiroyuki Awano . 2011 . Direct Observation of Domain Wall Motion Induced by Low-Current Density in TbFeCo Wires. Appl. Phys. Express 4 , 9 , 093002. Duc-The Ngo, Kotato Ikeda, and Hiroyuki Awano. 2011. Direct Observation of Domain Wall Motion Induced by Low-Current Density in TbFeCo Wires. Appl. Phys. Express 4, 9, 093002.","journal-title":"Wires. Appl. Phys. Express"},{"volume-title":"Proceedings of the International Symposium on Low-Power Electronics and Design. 121--126","author":"Nigam A.","key":"e_1_2_1_29_1","unstructured":"A. Nigam , C. W. Smullen , IV, V. Mohan , E. Chen , S. Gurumurthi , and M. R. Stan . 2011. Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM) . 
In Proceedings of the International Symposium on Low-Power Electronics and Design. 121--126 . A. Nigam, C. W. Smullen, IV, V. Mohan, E. Chen, S. Gurumurthi, and M. R. Stan. 2011. Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM). In Proceedings of the International Symposium on Low-Power Electronics and Design. 121--126."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.799861"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1840845.1840931"},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture. 50--61","author":"Smullen C. W.","key":"e_1_2_1_32_1","unstructured":"C. W. Smullen , V. Mohan , A. Nigam , S. Gurumurthi , and M. R. Stan . 2011. Relaxing non-volatility for fast and energy-efficient STT-RAM caches . In Proceedings of the International Symposium on High-Performance Computer Architecture. 50--61 . C. W. Smullen, V. Mohan, A. Nigam, S. Gurumurthi, and M. R. Stan. 2011. Relaxing non-volatility for fast and energy-efficient STT-RAM caches. In Proceedings of the International Symposium on High-Performance Computer Architecture. 50--61."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.384164"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMAG.2011.2159106"},{"key":"e_1_2_1_35_1","volume-title":"Nasser Anssari, Geng Daniel Liu, and Wen mei W. Hwu.","author":"Stratton John A.","year":"2012","unstructured":"John A. Stratton , Christopher Rodrigues , I- Jui Sung , Nady Obeid , vLi Wen Chang , Nasser Anssari, Geng Daniel Liu, and Wen mei W. Hwu. 2012 . Parboil : A revised benchmark suite for scientific and commercial throughput computing. Tech. Rep., IMPACT. John A. Stratton, Christopher Rodrigues, I-Jui Sung, Nady Obeid, vLi Wen Chang, Nasser Anssari, Geng Daniel Liu, and Wen mei W. Hwu. 2012. Parboil: A revised benchmark suite for scientific and commercial throughput computing. Tech. 
Rep., IMPACT."},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture. 239--249","author":"Sun G.","key":"e_1_2_1_36_1","unstructured":"G. Sun , X. Dong , Y. Xie , J. Li , and Y. Chen . 2009. A novel architecture of the 3D stacked MRAM L2 cache for CMPs . In Proceedings of the International Symposium on High-Performance Computer Architecture. 239--249 . G. Sun, X. Dong, Y. Xie, J. Li, and Y. Chen. 2009. A novel architecture of the 3D stacked MRAM L2 cache for CMPs. In Proceedings of the International Symposium on High-Performance Computer Architecture. 239--249."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155659"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488799"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2006.1696083"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2012.2220507"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevB.48.7099"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333660.2333707"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1825--1830","author":"Venkatesan Rangharajan","year":"2013","unstructured":"Rangharajan Venkatesan , Mrigank Sharad , Kaushik Roy , and Anand Raghunathan . 2013 . DWMTAPESTRI: An energy efficient all-spin cache using domain wall shift based writes . In Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 1825--1830 . Rangharajan Venkatesan, Mrigank Sharad, Kaushik Roy, and Anand Raghunathan. 2013. DWMTAPESTRI: An energy efficient all-spin cache using domain wall shift based writes. In Proceedings of the Conference and Exhibition on Design, Automation and Test in Europe. 
1825--1830."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555761"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2007.377981"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1063\/1.4716460"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSICT.2012.6466687"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1687399.1687448"}],"container-title":["ACM Journal on Emerging Technologies in Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2723165","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2723165","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:03:57Z","timestamp":1750273437000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2723165"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,8,3]]},"references-count":48,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2015,8,3]]}},"alternative-id":["10.1145\/2723165"],"URL":"https:\/\/doi.org\/10.1145\/2723165","relation":{},"ISSN":["1550-4832","1550-4840"],"issn-type":[{"type":"print","value":"1550-4832"},{"type":"electronic","value":"1550-4840"}],"subject":[],"published":{"date-parts":[[2015,8,3]]},"assertion":[{"value":"2013-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-08-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}