{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T22:12:12Z","timestamp":1766268732392,"version":"3.41.0"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T00:00:00Z","timestamp":1725408000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2024,9,30]]},"abstract":"<jats:p>\n            Non-volatile memories (NVMs), with their high storage density and ultra-low leakage power, offer promising potential for redesigning the memory hierarchy in next-generation Multi-Processor Systems-on-Chip (MPSoCs). However, the adoption of NVMs in cache designs introduces challenges such as NVM write overheads and limited NVM endurance. The shared NVM cache in an MPSoC experiences\n            <jats:italic>requests<\/jats:italic>\n            from different processor cores and\n            <jats:italic>responses<\/jats:italic>\n            from the off-chip memory when the requested data is not present in the cache. Besides, upon evictions of dirty data from higher-level caches, the shared NVM cache experiences another source of write operations, known as\n            <jats:italic>writebacks<\/jats:italic>\n            . These sources of write operations\u2014writebacks and responses\u2014further exacerbate the contention for the shared bandwidth of the NVM cache and create significant performance bottlenecks. Uncontrolled write operations can also affect the endurance of the NVM cache, posing a threat to cache lifetime and system reliability. 
Existing strategies often address either performance or cache endurance individually, leaving a gap for a holistic solution. This study introduces the Performance Optimization and Endurance Management (POEM) methodology, a novel approach that aggressively bypasses cache writebacks and responses to alleviate the NVM cache contention. Contrary to the existing bypass policies that do not pay adequate attention to the shared NVM cache contention and focus too much on cache data reuse, POEM\u2019s aggressive bypass significantly improves the overall system performance, even at the expense of data reuse. POEM also employs effective wear leveling to enhance the NVM cache endurance by careful redistribution of write operations across different cache lines. Across diverse workloads, POEM yields an average speedup of 34% over a na\u00efve baseline and 28.8% over a state-of-the-art NVM cache bypass technique while enhancing the cache endurance by 15% over the baseline. POEM also explores diverse design choices by exploiting a key policy parameter that assigns varying priorities to the two system-level objectives.\n          <\/jats:p>","DOI":"10.1145\/3653452","type":"journal-article","created":{"date-parts":[[2024,3,27]],"date-time":"2024-03-27T11:57:53Z","timestamp":1711540673000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["POEM: Performance Optimization and Endurance Management for Non-volatile Caches"],"prefix":"10.1145","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4885-5055","authenticated-orcid":false,"given":"Aritra","family":"Bagchi","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7479-681X","authenticated-orcid":false,"family":"Dharamjeet","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, 
India"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-9735-8303","authenticated-orcid":false,"given":"Ohm","family":"Rishabh","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1417-3570","authenticated-orcid":false,"given":"Manan","family":"Suri","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2508-7531","authenticated-orcid":false,"given":"Preeti Ranjan","family":"Panda","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}]}],"member":"320","published-online":{"date-parts":[[2024,9,4]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"2013. NVSim - A performance energy and area estimation tool for non-volatile memory (NVM). Retrieved from https:\/\/github.com\/SEAL-UCSB\/NVSim"},{"key":"e_1_3_1_3_2","unstructured":"2017. CACTI: an enhanced cache access and cycle time model. Retrieved from https:\/\/github.com\/HewlettPackard\/cacti"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAC56929.2023.10247878"},{"key":"e_1_3_1_5_2","first-page":"1","volume-title":"IFIP\/IEEE International Conference on Very Large Scale Integration (VLSI-SoC\u201916)","author":"Agarwal Sukarn","year":"2016","unstructured":"Sukarn Agarwal and Hemangee K. Kapoor. 2016. Restricting writes for energy-efficient hybrid cache in multi-core architectures. In IFIP\/IEEE International Conference on Very Large Scale Integration (VLSI-SoC\u201916). IEEE, 1\u20136."},{"key":"e_1_3_1_6_2","first-page":"1","volume-title":"IFIP\/IEEE International Conference on Very Large Scale Integration (VLSI-SoC\u201917)","author":"Agarwal Sukarn","year":"2017","unstructured":"Sukarn Agarwal and Hemangee K. Kapoor. 2017. Targeting inter set write variation to improve the lifetime of non-volatile cache using fellow sets. 
In IFIP\/IEEE International Conference on Very Large Scale Integration (VLSI-SoC\u201917). IEEE, 1\u20136."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2019.2892424"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3411368"},{"key":"e_1_3_1_9_2","first-page":"223","volume-title":"International Symposium on Low Power Electronics and Design (ISLPED\u201913)","author":"Ahn Junwhan","year":"2013","unstructured":"Junwhan Ahn, Sungjoo Yoo, and Kiyoung Choi. 2013. Write intensity prediction for energy-efficient non-volatile caches. In International Symposium on Low Power Electronics and Design (ISLPED\u201913). IEEE, 223\u2013228."},{"key":"e_1_3_1_10_2","first-page":"25","volume-title":"IEEE 20th International Symposium on High Performance Computer Architecture (HPCA\u201914)","author":"Ahn Junwhan","year":"2014","unstructured":"Junwhan Ahn, Sungjoo Yoo, and Kiyoung Choi. 2014. DASCA: Dead write prediction assisted STT-RAM cache architecture. In IEEE 20th International Symposium on High Performance Computer Architecture (HPCA\u201914). IEEE, 25\u201336."},{"key":"e_1_3_1_11_2","first-page":"1187","volume-title":"Design, Automation & Test in Europe Conference & Exhibition (DATE\u201918)","author":"Azad Zahra","year":"2018","unstructured":"Zahra Azad, Hamed Farbeh, and Amir Mahdi Hosseini Monazzah. 2018. ORIENT: Organized interleaved ECCs for new STT-MRAM caches. In Design, Automation & Test in Europe Conference & Exhibition (DATE\u201918). IEEE, 1187\u20131190."},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","unstructured":"Aritra Bagchi, Dinesh Joshi, and Preeti Ranjan Panda. 2024. COBRRA: Contention-aware cache Bypass with Request-Response Arbitration. 
ACM Transactions on Embedded Computing Systems 23, 1 (2024).","DOI":"10.1145\/3632748"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISQED51717.2021.9424250"},{"key":"e_1_3_1_14_2","first-page":"469","volume-title":"22nd International Symposium on Quality Electronic Design (ISQED\u201921)","author":"Baranwal Mayank","year":"2021","unstructured":"Mayank Baranwal, Udbhav Chugh, Shivang Dalal, Sukarn Agarwal, and Hemangee K. Kapoor. 2021. DAMUS: Dynamic allocation based on write frequency in multi-retention STT-RAM based last level caches. In 22nd International Symposium on Quality Electronic Design (ISQED\u201921). IEEE, 469\u2013475."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_3_1_16_2","first-page":"1","volume-title":"Rapido\u201918 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools","author":"Bouziane Rabab","year":"2018","unstructured":"Rabab Bouziane, Erven Rohou, and Abdoulaye Gamati\u00e9. 2018. Compile-time silent-store elimination for energy efficiency: An analytic evaluation for non-volatile cache memory. In Rapido\u201918 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools. 1\u20138."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001148"},{"key":"e_1_3_1_18_2","first-page":"4","volume-title":"IEEE International Workshop on Memory Technology, Design and Testing","author":"Chuang Ching-Te","year":"2007","unstructured":"Ching-Te Chuang, Saibal Mukhopadhyay, Jae-Joon Kim, Keunwoo Kim, and Rahul Rao. 2007. High-performance SRAM in nanoscale CMOS: Design challenges and techniques. In IEEE International Workshop on Memory Technology, Design and Testing. 
IEEE, 4\u201312."},{"key":"e_1_3_1_19_2","first-page":"39","volume-title":"IEEE\/ACM International Symposium on Nanoscale Architectures (NANOARCH\u201917)","author":"Coi Odilia","year":"2017","unstructured":"Odilia Coi, Guillaume Patrigeon, Sophiane Senni, Lionel Torres, and Pascal Benoit. 2017. A novel SRAM\u2014STT-MRAM hybrid cache implementation improving cache performance. In IEEE\/ACM International Symposium on Nanoscale Architectures (NANOARCH\u201917). IEEE, 39\u201344."},{"key":"e_1_3_1_20_2","volume-title":"Intel Atom Processor P5362","author":"Corporation Intel","year":"2021","unstructured":"Intel Corporation. 2021. Intel Atom Processor P5362. Retrieved from https:\/\/www.intel.in\/content\/www\/in\/en\/products\/sku\/134793\/intel-atom-processor-p5362-27m-cache-2-2ghz\/specifications.html"},{"key":"e_1_3_1_21_2","volume-title":"Intel\u00ae Xeon\u00ae Scalable Processors","author":"Corporation Intel","year":"2023","unstructured":"Intel Corporation. 2023. Intel\u00ae Xeon\u00ae Scalable Processors. Retrieved from https:\/\/www.intel.com\/content\/www\/us\/en\/products\/details\/embedded-processors\/xeon\/4thgen.html"},{"key":"e_1_3_1_22_2","first-page":"480","volume-title":"IEEE International Solid-State Circuits Conference-(ISSCC\u201918)","author":"Dong Qing","year":"2018","unstructured":"Qing Dong, Zhehong Wang, Jongyup Lim, Yiqun Zhang, Yi-Chun Shih, Yu-Der Chih, Jonathan Chang, David Blaauw, and Dennis Sylvester. 2018. A 1Mb 28nm STT-MRAM with 2.8 ns read access time at 1.2 V VDD using single-cap offset-cancelled sense amplifier and in-situ self-write-termination. In IEEE International Solid-State Circuits Conference-(ISSCC\u201918). 
IEEE, 480\u2013482."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2185930"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2016.2557326"},{"key":"e_1_3_1_25_2","first-page":"222","volume-title":"Design, Automation & Test in Europe Conference & Exhibition (DATE\u201919)","author":"Hosseini Fateme S.","year":"2019","unstructured":"Fateme S. Hosseini and Chengmo Yang. 2019. Compiler-directed and architecture-independent mitigation of read disturbance errors in STT-RAM. In Design, Automation & Test in Europe Conference & Exhibition (DATE\u201919). IEEE, 222\u2013227."},{"key":"e_1_3_1_26_2","first-page":"982","volume-title":"IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","author":"Hu Jingtong","year":"2012","unstructured":"Jingtong Hu, Qingfeng Zhuge, Chun Jason Xue, Wei-Che Tseng, and H. M. Edwin. 2012. Optimizing data allocation and memory configuration for non-volatile memory based hybrid SPM on embedded CMPs. In IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. IEEE, 982\u2013989."},{"issue":"6","key":"e_1_3_1_27_2","first-page":"33","article-title":"Spin-transfer torque MRAM (STT-MRAM): Challenges and prospects","volume":"18","author":"Huai Yiming","year":"2008","unstructured":"Yiming Huai. 2008. Spin-transfer torque MRAM (STT-MRAM): Challenges and prospects. 
AAPPS Bulletin 18, 6 (2008), 33\u201340.","journal-title":"AAPPS Bulletin"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2018.2796067"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.5555\/2016802.2016827"},{"issue":"3","key":"e_1_3_1_30_2","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1109\/TVLSI.2015.2420954","article-title":"Sequoia: A high-endurance NVM-based cache architecture","volume":"24","author":"Jokar Mohammad Reza","year":"2015","unstructured":"Mohammad Reza Jokar, Mohammad Arjomand, and Hamid Sarbazi-Azad. 2015. Sequoia: A high-endurance NVM-based cache architecture. IEEE Trans. Very Large Scale Integ. Syst. 24, 3 (2015), 954\u2013967.","journal-title":"IEEE Trans. Very Large Scale Integ. Syst."},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1109\/ISQED.2013.6523613","volume-title":"International Symposium on Quality Electronic Design (ISQED\u201913)","author":"Jung Jinwook","year":"2013","unstructured":"Jinwook Jung, Yohei Nakata, Masahiko Yoshimoto, and Hiroshi Kawaguchi. 2013. Energy-efficient spin-transfer torque RAM cache exploiting additional all-zero-data flags. In International Symposium on Quality Electronic Design (ISQED\u201913). IEEE, 216\u2013222."},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1109\/MICRO.2010.24","volume-title":"43rd Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Khan Samira Manabi","year":"2010","unstructured":"Samira Manabi Khan, Yingying Tian, and Daniel A. Jimenez. 2010. Sampling dead block prediction for last-level caches. In 43rd Annual IEEE\/ACM International Symposium on Microarchitecture. 
IEEE, 175\u2013186."},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2813668"},{"key":"e_1_3_1_34_2","first-page":"52","volume-title":"Symposium on VLSI Technology-Digest of Technical Papers","year":"2011","unstructured":"Young Bae Kim, Seung Ryul Lee, Dongsoo Lee, Chang Bum Lee, Man Chang, Ji Hyun Hur, Myoung Jae Lee, Gyeong Su Park, Chang Jung Kim, U-In Chung, In-Kyeong Yoo, and Kinam Kim. 2011. Bi-layered RRAM with unlimited endurance and extremely uniform switching. In Symposium on VLSI Technology-Digest of Technical Papers. IEEE, 52\u201353."},{"key":"e_1_3_1_35_2","first-page":"1","volume-title":"36th ACM International Conference on Supercomputing","author":"Kokolis Apostolos","year":"2022","unstructured":"Apostolos Kokolis, Namrata Mantri, Shrikanth Ganapathy, Josep Torrellas, and John Kalamatianos. 2022. Cloak: Tolerating non-volatile cache read latency. In 36th ACM International Conference on Supercomputing. 1\u201313."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00035"},{"key":"e_1_3_1_37_2","first-page":"461","volume-title":"Design, Automation & Test in Europe Conference & Exhibition (DATE\u201918)","author":"Kuan Kyle","year":"2018","unstructured":"Kyle Kuan and Tosiron Adegbija. 2018. LARS: Logically adaptable retention time STT-RAM cache for embedded systems. In Design, Automation & Test in Europe Conference & Exhibition (DATE\u201918). IEEE, 461\u2013466."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2019.2918153"},{"key":"e_1_3_1_39_2","first-page":"1","volume-title":"IFIP\/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC\u201922)","author":"Kumar Yogesh","year":"2022","unstructured":"Yogesh Kumar, S. Sivakumar, and John Jose. 2022. ENDURA: Enhancing durability of multi level cell STT-RAM based non volatile memory last level caches. In IFIP\/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC\u201922). 
IEEE, 1\u20136."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370862"},{"issue":"8","key":"e_1_3_1_41_2","first-page":"2169","article-title":"Compiler-assisted refresh minimization for volatile STT-RAM cache","volume":"64","author":"Li Qingan","year":"2014","unstructured":"Qingan Li, Yanxiang He, Jianhua Li, Liang Shi, Yiran Chen, and Chun Jason Xue. 2014. Compiler-assisted refresh minimization for volatile STT-RAM cache. IEEE Trans. Comput. 64, 8 (2014), 2169\u20132181.","journal-title":"IEEE Trans. Comput."},{"issue":"8","key":"e_1_3_1_42_2","first-page":"1829","article-title":"Compiler-assisted STT-RAM-based hybrid cache for energy efficient embedded systems","volume":"22","author":"Li Qingan","year":"2013","unstructured":"Qingan Li, Jianhua Li, Liang Shi, Mengying Zhao, Chun Jason Xue, and Yanxiang He. 2013. Compiler-assisted STT-RAM-based hybrid cache for energy efficient embedded systems. IEEE Trans. Very Large Scale Integ. (VLSI) Syst. 22, 8 (2013), 1829\u20131840.","journal-title":"IEEE Trans. Very Large Scale Integ. (VLSI) Syst."},{"key":"e_1_3_1_43_2","volume-title":"IEEE International Symposium on Performance Analysis of Systems and Software","author":"Limaye Ankur","year":"2018","unstructured":"Ankur Limaye and Tosiron Adegbija. 2018. A workload characterization of the SPEC CPU2017 benchmark suite. In IEEE International Symposium on Performance Analysis of Systems and Software."},{"issue":"10","key":"e_1_3_1_44_2","first-page":"2149","article-title":"High-endurance hybrid cache design in CMP architecture with cache partitioning and access-aware policies","volume":"23","author":"Lin Chao","year":"2014","unstructured":"Chao Lin and Jeng-Nian Chiou. 2014. High-endurance hybrid cache design in CMP architecture with cache partitioning and access-aware policies. IEEE Trans. Very Large Scale Integ. (VLSI) Syst. 23, 10 (2014), 2149\u20132161.","journal-title":"IEEE Trans. Very Large Scale Integ. 
(VLSI) Syst."},{"issue":"2","key":"e_1_3_1_45_2","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1109\/TCAD.2022.3175242","article-title":"CAPMIG: Coherence-aware block placement and migration in multiretention STT-RAM caches","volume":"42","author":"Manohar Sheel Sindhu","year":"2022","unstructured":"Sheel Sindhu Manohar and Hemangee K. Kapoor. 2022. CAPMIG: Coherence-aware block placement and migration in multiretention STT-RAM caches. IEEE Trans. Comput.-aid. Des. Integ. Circ. Syst. 42, 2 (2022), 411\u2013422.","journal-title":"IEEE Trans. Comput.-aid. Des. Integ. Circ. Syst."},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3484493"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1186\/1556-276X-9-526"},{"key":"e_1_3_1_48_2","volume-title":"5th Annual Non-volatile Memories Workshop","author":"Mittal Sparsh","year":"2014","unstructured":"Sparsh Mittal and Jeffrey S. Vetter. 2014. Addressing inter-set write-variation for improving lifetime of non-volatile caches. In 5th Annual Non-volatile Memories Workshop."},{"key":"e_1_3_1_49_2","volume-title":"2nd Workshop on Interactions of NVM\/Flash with Operating Systems and Workloads (INFLOW\u201914)","author":"Mittal Sparsh","year":"2014","unstructured":"Sparsh Mittal and Jeffrey S. Vetter. 2014. EqualChance: Addressing intra-set write variation to increase lifetime of non-volatile caches. In 2nd Workshop on Interactions of NVM\/Flash with Operating Systems and Workloads (INFLOW\u201914)."},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2014.2324563"},{"key":"e_1_3_1_51_2","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/2591513.2591525","volume-title":"24th Edition of the Great Lakes Symposium on VLSI","author":"Mittal Sparsh","year":"2014","unstructured":"Sparsh Mittal, Jeffrey S. Vetter, and Dong Li. 2014. WriteSmoothing: Improving lifetime of non-volatile caches using intra-set wear-leveling. 
In 24th Edition of the Great Lakes Symposium on VLSI. 139\u2013144."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3608096"},{"key":"e_1_3_1_53_2","first-page":"28","article-title":"CACTI 6.0: A tool to model large caches","volume":"27","author":"Muralimanohar Naveen","year":"2009","unstructured":"Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P. Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP Lab. 27 (2009), 28.","journal-title":"HP Lab."},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228447"},{"key":"e_1_3_1_55_2","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-01735-3","volume-title":"Phase Change Memory: From Devices to Systems","author":"Qureshi Moinuddin Khalil Ahmed","year":"2012","unstructured":"Moinuddin Khalil Ahmed Qureshi, Sudhanva Gurumurthi, and Bipin Rajendran. 2012. Phase Change Memory: From Devices to Systems. Springer."},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/1840845.1840931"},{"volume-title":"IEEE International Electron Devices Meeting (IEDM\u201918)","year":"2018","key":"e_1_3_1_57_2","unstructured":"Sushil Sakhare, Manu Komalan, T. Huynh Bao, Siddharth Rao, Woojin Kim, Davide Crotti, Farrukh Yasin, Sebastien Couet, Johan Swerts, Shreya Kundu, Dmitry Yakimets, Rogier Baert, Hyungrock Oh, Alessio Spessot, Anda Mocuta, Gouri Sankar Kar, and Arnaud Furn\u00e9mont. 2018. Enablement of STT-MRAM as last level cache for the high performance computing domain at the 5nm node. In IEEE International Electron Devices Meeting (IEDM\u201918). 
IEEE, 18\u20133."},{"issue":"9","key":"e_1_3_1_58_2","doi-asserted-by":"crossref","first-page":"3618","DOI":"10.1109\/TED.2020.3012123","article-title":"\\(J_{SW}\\) of 5.5 MA\/cm\\(^2\\) and RA of 5.2-\\(\\Omega\\)\u00b7\\(\\mu\\)m\\(^2\\) STT-MRAM technology for LLC application","volume":"67","author":"Sakhare Sushil","year":"2020","unstructured":"Sushil Sakhare, Siddharth Rao, Manu Perumkunnil, Sebastien Couet, Davide Crotti, Simon Van Beek, Arnaud Furnemont, Francky Catthoor, and Gouri Sankar Kar. 2020. \\(J_{SW}\\) of 5.5 MA\/cm\\(^2\\) and RA of 5.2-\\(\\Omega\\)\u00b7\\(\\mu\\)m\\(^2\\) STT-MRAM technology for LLC application. IEEE Trans. Electron Dev. 67, 9 (2020), 3618\u20133625.","journal-title":"IEEE Trans. Electron Dev."},{"issue":"1","key":"e_1_3_1_59_2","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/TETC.2022.3163438","article-title":"WiSE: When learning assists resolving STT-MRAM efficiency challenges","volume":"11","author":"Salahvarzi Arash","year":"2022","unstructured":"Arash Salahvarzi, Mohsen Khosroanjam, Amir Mahdi Hosseini Monazzah, Hakem Beitollahi, Umit Y. Ogras, and Mahdi Fazeli. 2022. WiSE: When learning assists resolving STT-MRAM efficiency challenges. IEEE Trans. Emerg. Topics Comput. 11, 1 (2022), 43\u201355.","journal-title":"IEEE Trans. Emerg. Topics Comput."},{"key":"e_1_3_1_60_2","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1145\/3357526.3357538","volume-title":"International Symposium on Memory Systems","author":"Saraf Puneet","year":"2019","unstructured":"Puneet Saraf and Madhu Mutyam. 2019. Endurance enhancement of write-optimized STT-RAM caches. In International Symposium on Memory Systems. 101\u2013113."},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3321693"},{"key":"e_1_3_1_62_2","volume-title":"NXP Layerscape Processors","author":"Semiconductors NXP","year":"2017","unstructured":"NXP Semiconductors. 2017. NXP Layerscape Processors. 
Retrieved from https:\/\/www.nxp.com\/products\/processors-and-microcontrollers\/arm-processors\/layerscape-processors\/layerscape-lx2160a-lx2120a-lx2080a-processors:LX2160A"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3451995"},{"key":"e_1_3_1_64_2","first-page":"123","volume-title":"Great Lakes Symposium on VLSI","author":"Sivakumar S.","year":"2021","unstructured":"S. Sivakumar, T. M. Abdul Khader, and John Jose. 2021. Improving lifetime of non-volatile memory caches by logical partitioning. In Great Lakes Symposium on VLSI. 123\u2013128."},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3616871"},{"key":"e_1_3_1_66_2","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1007\/978-981-99-0055-8_10","volume-title":"Emerging Electronic Devices, Circuits and Systems: Select Proceedings of EEDCS Workshop Held in Conjunction with ISDCS 2022","author":"Sivakumar S.","year":"2023","unstructured":"S. Sivakumar, Mani Mannampalli, and John Jose. 2023. Enhancing lifetime of non-volatile memory caches by write-aware techniques. In Emerging Electronic Devices, Circuits and Systems: Select Proceedings of EEDCS Workshop Held in Conjunction with ISDCS 2022. Springer, 109\u2013123."},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1093\/nsr\/nwx082"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155659"},{"issue":"12","key":"e_1_3_1_69_2","first-page":"2198","article-title":"ROCKY: A robust hybrid on-chip memory kit for the processors with STT-MRAM cache technology","volume":"70","author":"Talebi Mahdi","year":"2020","unstructured":"Mahdi Talebi, Arash Salahvarzi, Amir Mahdi Hosseini Monazzah, Kevin Skadron, and Mahdi Fazeli. 2020. ROCKY: A robust hybrid on-chip memory kit for the processors with STT-MRAM cache technology. IEEE Trans. Comput. 70, 12 (2020), 2198\u20132210.","journal-title":"IEEE Trans. 
Comput."},{"key":"e_1_3_1_70_2","first-page":"1","volume-title":"5th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM\u201920)","year":"2020","unstructured":"Eugene Tam, Shenfei Jiang, Paul Duan, Shawn Meng, Yue Pang, Cayden Huang, Yi Han, Jacke Xie, Yuanjun Cui, Jinsong Yu, and Minggui Lu. 2020. Breaking the memory wall for AI chip with a new dimension. In 5th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM\u201920). IEEE, 1\u20137."},{"key":"e_1_3_1_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/3362100"},{"key":"e_1_3_1_72_2","first-page":"847","volume-title":"Design, Automation & Test in Europe Conference & Exhibition (DATE\u201913)","author":"Wang Jue","year":"2013","unstructured":"Jue Wang, Xiangyu Dong, and Yuan Xie. 2013. OAP: An obstruction-aware cache management policy for STT-RAM last-level caches. In Design, Automation & Test in Europe Conference & Exhibition (DATE\u201913). IEEE, 847\u2013852."},{"key":"e_1_3_1_73_2","first-page":"234","volume-title":"IEEE 19th International Symposium on High Performance Computer Architecture (HPCA\u201913)","author":"Wang Jue","year":"2013","unstructured":"Jue Wang, Xiangyu Dong, Yuan Xie, and Norman P. Jouppi. 2013. i\u00b2WAP: Improving non-volatile cache lifetime by reducing inter- and intra-set write variations. In IEEE 19th International Symposium on High Performance Computer Architecture (HPCA\u201913). IEEE, 234\u2013245."},{"issue":"1","key":"e_1_3_1_74_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2579671","article-title":"Endurance-aware cache line management for non-volatile caches","volume":"11","author":"Wang Jue","year":"2014","unstructured":"Jue Wang, Xiangyu Dong, Yuan Xie, and Norman P. Jouppi. 2014. Endurance-aware cache line management for non-volatile caches. ACM Trans. Archit. Code Optim. 
11, 1 (2014), 1\u201325.","journal-title":"ACM Trans. Archit. Code Optim."},{"key":"e_1_3_1_75_2","first-page":"13","volume-title":"IEEE 20th International Symposium on High Performance Computer Architecture (HPCA\u201914)","author":"Wang Zhe","year":"2014","unstructured":"Zhe Wang, Daniel A. Jim\u00e9nez, Cong Xu, Guangyu Sun, and Yuan Xie. 2014. Adaptive placement and migration policy for an STT-RAM-based hybrid cache. In IEEE 20th International Symposium on High Performance Computer Architecture (HPCA\u201914). IEEE, 13\u201324."},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1145\/216585.216588"},{"key":"e_1_3_1_77_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2017.2780522"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3531437.3539709"},{"key":"e_1_3_1_79_2","first-page":"345","volume-title":"International Symposium on Low Power Electronics and Design","author":"Zhang Chao","year":"2014","unstructured":"Chao Zhang, Guangyu Sun, Peng Li, Tao Wang, Dimin Niu, and Yiran Chen. 2014. SBAC: A statistics based cache bypassing method for asymmetric-access caches. In International Symposium on Low Power Electronics and Design. 345\u2013350."},{"key":"e_1_3_1_80_2","first-page":"264","volume-title":"International Conference on Computer-Aided Design","author":"Zhou Ping","year":"2009","unstructured":"Ping Zhou, Bo Zhao, Jun Yang, and Youtao Zhang. 2009. Energy reduction for STT-RAM using early write termination. In International Conference on Computer-Aided Design. 
264\u2013268."}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3653452","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3653452","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:44:26Z","timestamp":1750290266000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3653452"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,4]]},"references-count":79,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,9,30]]}},"alternative-id":["10.1145\/3653452"],"URL":"https:\/\/doi.org\/10.1145\/3653452","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"type":"print","value":"1084-4309"},{"type":"electronic","value":"1557-7309"}],"subject":[],"published":{"date-parts":[[2024,9,4]]},"assertion":[{"value":"2023-10-10","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-03-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}