{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T20:57:54Z","timestamp":1760043474754,"version":"3.41.0"},"reference-count":72,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,1,20]],"date-time":"2021-01-20T00:00:00Z","timestamp":1611100800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"ERC","award":["no. 741097"],"award-info":[{"award-number":["no. 741097"]}]},{"name":"FWO","award":["G.0434.16N, G.0144.17N"],"award-info":[{"award-number":["G.0434.16N, G.0144.17N"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2021,3,31]]},"abstract":"<jats:p>Emerging workloads in cloud and data center infrastructures demand high main memory bandwidth and capacity. Unfortunately, DRAM alone is unable to satisfy contemporary main memory demands. High-bandwidth memory (HBM) uses 3D die-stacking to deliver 4\u20138\u00d7 higher bandwidth. HBM has two drawbacks: (1) capacity is low, and (2) soft error rate is high. Hybrid memory combines DRAM and HBM to promise low fault rates, high bandwidth, and high capacity. Prior OS approaches manage HBM by mapping pages to HBM versus DRAM based on hotness (access frequency) and risk (susceptibility to soft errors). Unfortunately, these approaches operate at a coarse-grained page granularity, and frequent page migrations hurt performance.<\/jats:p>\n          <jats:p>This article proposes a new class of reliability-aware garbage collectors for hybrid HBM-DRAM systems that place hot and low-risk objects in HBM and the rest in DRAM. 
Our analysis of nine real-world Java workloads shows that: (1) newly allocated objects in the nursery are frequently written, making them both hot and low-risk, (2) a small fraction of the mature objects are hot and low-risk, and (3) allocation site is a good predictor for hotness and risk. We propose RiskRelief, a novel reliability-aware garbage collector that uses allocation site prediction to place hot and low-risk objects in HBM. Allocation sites are profiled offline and RiskRelief uses heuristics to classify allocation sites as DRAM and HBM. The proposed heuristics expose Pareto-optimal trade-offs between soft error rate (SER) and execution time. RiskRelief improves SER by 9\u00d7 compared to an HBM-Only system while at the same time improving performance by 29% compared to a DRAM-Only system. Compared to a state-of-the-art OS approach for reliability-aware data placement, RiskRelief eliminates all page migration overheads, which substantially improves performance while delivering similar SER. 
Reliability-aware garbage collection opens up a new opportunity to manage emerging HBM-DRAM memories at fine granularity while requiring no extra hardware support and leaving the programming model unchanged.<\/jats:p>","DOI":"10.1145\/3431803","type":"journal-article","created":{"date-parts":[[2021,1,20]],"date-time":"2021-01-20T17:26:38Z","timestamp":1611163598000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3712-8118","authenticated-orcid":false,"given":"Wenjie","family":"Liu","sequence":"first","affiliation":[{"name":"Ghent University, Zwijnaarde, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2388-0517","authenticated-orcid":false,"given":"Shoaib","family":"Akram","sequence":"additional","affiliation":[{"name":"Australian National University, Canberra, Australia"}]},{"given":"Jennifer B.","family":"Sartor","sequence":"additional","affiliation":[{"name":"Ghent University, Zwijnaarde, Belgium"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Zwijnaarde, Belgium"}]}],"member":"320","published-online":{"date-parts":[[2021,1,20]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2016.7482070"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3322205.3311080"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2019.00017"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192392"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.391.0211"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.442.0399"},{"key":"e_1_2_1_7_1","unstructured":"AMD. [n.d.]. High Bandwidth Memory. 
Retrieved from https:\/\/www.amd.com\/en\/technologies\/hbm.  AMD. [n.d.]. High Bandwidth Memory. Retrieved from https:\/\/www.amd.com\/en\/technologies\/hbm."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/66382.66387"},{"volume-title":"Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201917)","author":"Awad Amro","key":"e_1_2_1_9_1","unstructured":"Amro Awad , Arkaprava Basu , Sergey Blagodurov , Yan Solihin , and Gabriel H. Loh . 2017. Avoiding TLB shootdowns through self-invalidating TLB entries . In Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201917) . 273--287. Amro Awad, Arkaprava Basu, Sergey Blagodurov, Yan Solihin, and Gabriel H. Loh. 2017. Avoiding TLB shootdowns through self-invalidating TLB entries. In Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201917). 273--287."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.18"},{"volume-title":"Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS\u201904)","author":"Blackburn Stephen M.","key":"e_1_2_1_11_1","unstructured":"Stephen M. Blackburn , Perry Cheng , and Kathryn S . McKinley. 2004. Myths and realities: The performance impact of garbage collection . In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS\u201904) . 25--36. Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. 2004. Myths and realities: The performance impact of garbage collection. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS\u201904). 
25--36."},{"volume-title":"Proceedings of the International Conference on Software Engineering (ICSE\u201904)","author":"Blackburn Stephen M.","key":"e_1_2_1_12_1","unstructured":"Stephen M. Blackburn , Perry Cheng , and Kathryn S . McKinley. 2004. Oil and water? High performance garbage collection in Java with MMTk . In Proceedings of the International Conference on Software Engineering (ICSE\u201904) . 137--146. Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. 2004. Oil and water? High performance garbage collection in Java with MMTk. In Proceedings of the International Conference on Software Engineering (ICSE\u201904). 137--146."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1167473.1167488"},{"volume-title":"Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201908)","author":"Stephen","key":"e_1_2_1_14_1","unstructured":"Stephen M. Blackburn and Kathryn S. McKinley. 2008. Immix: A mark-region garbage collector with space efficiency, fast collection, and mutator performance . In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201908) . 22--32. Stephen M. Blackburn and Kathryn S. McKinley. 2008. Immix: A mark-region garbage collector with space efficiency, fast collection, and mutator performance. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201908). 22--32."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1378704.1378723"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629677"},{"volume-title":"Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914)","author":"Chou ChiaChen","key":"e_1_2_1_17_1","unstructured":"ChiaChen Chou , Aamer Jaleel , and Moinuddin K. Qureshi . 2014. CAMEO: A two-level memory organization with capacity of main memory and flexibility of hardware-managed cache . 
In Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914) . 1--12. ChiaChen Chou, Aamer Jaleel, and Moinuddin K. Qureshi. 2014. CAMEO: A two-level memory organization with capacity of main memory and flexibility of hardware-managed cache. In Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914). 1--12."},{"volume-title":"Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915)","author":"Chou ChiaChen","key":"e_1_2_1_18_1","unstructured":"ChiaChen Chou , Aamer Jaleel , and Moinuddin K. Qureshi . 2015. BEAR: Techniques for mitigating bandwidth bloat in gigascale DRAM caches . In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915) . 198--210. ChiaChen Chou, Aamer Jaleel, and Moinuddin K. Qureshi. 2015. BEAR: Techniques for mitigating bandwidth bloat in gigascale DRAM caches. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915). 198--210."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132402.3132404"},{"key":"e_1_2_1_20_1","unstructured":"NVIDIA Corp.2016. NVIDIA Pascal Architecture. Retrieved from https:\/\/www.nvidia.com\/en-us\/data-center\/pascal-gpu-architecture\/.  NVIDIA Corp.2016. NVIDIA Pascal Architecture. Retrieved from https:\/\/www.nvidia.com\/en-us\/data-center\/pascal-gpu-architecture\/."},{"key":"e_1_2_1_21_1","unstructured":"Timothy J. Dell. 1997. A white paper on the benefits of Chipkill-correct ECC for PC server main memory. IBM Microelectronics Division.  Timothy J. Dell. 1997. A white paper on the benefits of Chipkill-correct ECC for PC server main memory. 
IBM Microelectronics Division."},{"volume-title":"Proceedings of the ACM\/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC\u201910)","author":"Dong Xiangyu","key":"e_1_2_1_22_1","unstructured":"Xiangyu Dong , Yuan Xie , Naveen Muralimanohar , and Norman P. Jouppi . 2010. Simple but effective heterogeneous main memory with on-chip memory controller support . In Proceedings of the ACM\/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC\u201910) . 1--11. Xiangyu Dong, Yuan Xie, Naveen Muralimanohar, and Norman P. Jouppi. 2010. Simple but effective heterogeneous main memory with on-chip memory controller support. In Proceedings of the ACM\/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC\u201910). 1--11."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913)","author":"Bois Kristof Du","year":"2013","unstructured":"Kristof Du Bois , Jennifer B. Sartor , Stijn Eyerman , and Lieven Eeckhout . 2013 . Bottle graphs: Visualizing scalability bottlenecks in multi-threaded applications . In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913) . 355--372. Kristof Du Bois, Jennifer B. Sartor, Stijn Eyerman, and Lieven Eeckhout. 2013. Bottle graphs: Visualizing scalability bottlenecks in multi-threaded applications. In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913). 
355--372."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2010.5416643"},{"volume-title":"Proceedings of the International Conference on Virtual Execution Environments (VEE\u201909)","author":"Frampton Daniel","key":"e_1_2_1_25_1","unstructured":"Daniel Frampton , Stephen M. Blackburn , Perry Cheng , Robin J. Garner , David Grove , J. Eliot B. Moss , and Sergey I. Salishev . 2009. Demystifying magic: High-level low-level programming . In Proceedings of the International Conference on Virtual Execution Environments (VEE\u201909) . 81--90. Daniel Frampton, Stephen M. Blackburn, Perry Cheng, Robin J. Garner, David Grove, J. Eliot B. Moss, and Sergey I. Salishev. 2009. Demystifying magic: High-level low-level programming. In Proceedings of the International Conference on Virtual Execution Environments (VEE\u201909). 81--90."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491956.2462171"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00056"},{"volume-title":"Proceedings of the 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911)","author":"Gabriel","key":"e_1_2_1_28_1","unstructured":"Gabriel H. Loh and Mark D. Hill. 2011. Efficiently enabling conventional block sizes for very large die-stacked DRAM caches . In Proceedings of the 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911) . 454--564. Gabriel H. Loh and Mark D. Hill. 2011. Efficiently enabling conventional block sizes for very large die-stacked DRAM caches. In Proceedings of the 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911). 454--564."},{"volume-title":"Proceedings of the IBM CAS Workshop.","author":"Ha Jungwoo","key":"e_1_2_1_29_1","unstructured":"Jungwoo Ha , Magnus Gustafsson , Stephen M. Blackburn , and Kathryn S . McKinley. 2008. Microarchitectural characterization of production JVMs and Java workloads . 
In Proceedings of the IBM CAS Workshop. Jungwoo Ha, Magnus Gustafsson, Stephen M. Blackburn, and Kathryn S. McKinley. 2008. Microarchitectural characterization of production JVMs and Java workloads. In Proceedings of the IBM CAS Workshop."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.144.0395"},{"volume-title":"Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913)","author":"Huang Jipeng","key":"e_1_2_1_31_1","unstructured":"Jipeng Huang and Michael D. Bond . 2013. Efficient context sensitivity for dynamic analyses via calling context uptrees and customized memory management . In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913) . 53--72. Jipeng Huang and Michael D. Bond. 2013. Efficient context sensitivity for dynamic analyses via calling context uptrees and customized memory management. In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201913). 53--72."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201904)","author":"Huang Xianglong","year":"2004","unstructured":"Xianglong Huang , Stephen M. Blackburn , Kathryn S. McKinley , J. Eliot B. Moss , Zhenlin Wang , and Perry Cheng . 2004 . The garbage collection advantage: Improving mutator locality . In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201904) . 69--80. Xianglong Huang, Stephen M. Blackburn, Kathryn S. McKinley, J. Eliot B. Moss, Zhenlin Wang, and Perry Cheng. 2004. The garbage collection advantage: Improving mutator locality. 
In Proceedings of the ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA\u201904). 69--80."},{"key":"e_1_2_1_33_1","unstructured":"ITRS. 2005. International Technology Roadmap for Semiconductors: Assembly and Packaging. https:\/\/www.semiconductors.org\/resources\/2005-international-technology-roadmap-for-semiconductors-itrs\/.  ITRS. 2005. International Technology Roadmap for Semiconductors: Assembly and Packaging. https:\/\/www.semiconductors.org\/resources\/2005-international-technology-roadmap-for-semiconductors-itrs\/."},{"volume-title":"Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914)","author":"Nair Prashant J.","key":"e_1_2_1_34_1","unstructured":"Prashant J. Nair , David A. Roberts , and Moinuddin K. Qureshi . 2014. Citadel: Efficiently protecting stacked memory from large granularity failures . In Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914) . 51--62. Prashant J. Nair, David A. Roberts, and Moinuddin K. Qureshi. 2014. Citadel: Efficiently protecting stacked memory from large granularity failures. In Proceedings of the 47th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201914). 51--62."},{"key":"e_1_2_1_35_1","unstructured":"JEDEC. [n.d.]. High Bandwidth Memory. Retrieved from https:\/\/www.jedec.org\/standards-documents\/docs\/jesd235a.  JEDEC. [n.d.]. High Bandwidth Memory. 
Retrieved from https:\/\/www.jedec.org\/standards-documents\/docs\/jesd235a."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TEST.2014.7035318"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.51"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485957"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2010.5416642"},{"key":"e_1_2_1_40_1","volume-title":"Garbage Collection: Algorithms for Automatic Dynamic Memory Management","author":"Jones Richard","year":"1996","unstructured":"Richard Jones and Rafael Lins . 1996 . Garbage Collection: Algorithms for Automatic Dynamic Memory Management . John Wiley & Sons. Richard Jones and Rafael Lins. 1996. Garbage Collection: Algorithms for Automatic Dynamic Memory Management. John Wiley & Sons."},{"volume-title":"Proceedings of the 45th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201912)","author":"Moinuddin","key":"e_1_2_1_41_1","unstructured":"Moinuddin K. Qureshi and Gabe H. Loh. 2012. Fundamental latency trade-off in architecting DRAM caches: Outperforming impractical SRAM-tags with a simple and practical design . In Proceedings of the 45th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201912) . 235--246. Moinuddin K. Qureshi and Gabe H. Loh. 2012. Fundamental latency trade-off in architecting DRAM caches: Outperforming impractical SRAM-tags with a simple and practical design. In Proceedings of the 45th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201912). 235--246."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2678373.2665726"},{"volume-title":"Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915)","author":"Lee Yongjun","key":"e_1_2_1_43_1","unstructured":"Yongjun Lee , Jongwon Kim , Hakbeom Jang , Hyunggyun Yang , Jangwoo Kim , Jinkyu Jeong , and Jae W. Leet . 2015. 
A fully associative, tagless DRAM cache . In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915) . 211--222. Yongjun Lee, Jongwon Kim, Hakbeom Jang, Hyunggyun Yang, Jangwoo Kim, Jinkyu Jeong, and Jae W. Leet. 2015. A fully associative, tagless DRAM cache. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA\u201915). 211--222."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358262"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1133956.1133959"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2015.57"},{"key":"e_1_2_1_48_1","unstructured":"Micron. 2007. TN-41-01: Calculating memory system power for DDR3. https:\/\/www.micron.com\/-\/media\/client\/global\/documents\/products\/technical-note\/dram\/tn41_01ddr3_power.pdf.  Micron. 2007. TN-41-01: Calculating memory system power for DDR3. https:\/\/www.micron.com\/-\/media\/client\/global\/documents\/products\/technical-note\/dram\/tn41_01ddr3_power.pdf."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2831234"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)","author":"Nguyen Khanh","year":"2016","unstructured":"Khanh Nguyen , Lu Fang , Guoqing Xu , Brian Demsky , Shan Lu , Sanazsadat Alamian , and Onur Mutlu . 2016 . Yak: A high-performance big-data-friendly garbage collector . In Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916) . 349--365. Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Shan Lu, Sanazsadat Alamian, and Onur Mutlu. 2016. Yak: A high-performance big-data-friendly garbage collector. In Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916). 
349--365."},{"volume-title":"Proceedings of the International Conference on Parallel Architecture and Compilation (PACT\u201915)","author":"Oskin Mark","key":"e_1_2_1_51_1","unstructured":"Mark Oskin and Gabriel H. Loh . 2015. A software-managed approach to die-stacked DRAM . In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT\u201915) . 188--200. Mark Oskin and Gabriel H. Loh. 2015. A software-managed approach to die-stacked DRAM. In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT\u201915). 188--200."},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the Memory Forum Workshop.","author":"O\u2019Connor Mike","year":"2014","unstructured":"Mike O\u2019Connor . 2014 . Highlights of the high-bandwidth memory (HBM) standard . In Proceedings of the Memory Forum Workshop. Mike O\u2019Connor. 2014. Highlights of the high-bandwidth memory (HBM) standard. In Proceedings of the Memory Forum Workshop."},{"volume-title":"Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW\u201917)","author":"Peng I. B.","key":"e_1_2_1_53_1","unstructured":"I. B. Peng , R. Gioiosa , G. Kestor , P. Cicotti , E. Laure , and S. Markidis . 2017. Exploring the performance benefit of hybrid memory system on HPC environments . In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW\u201917) . 683--692. I. B. Peng, R. Gioiosa, G. Kestor, P. Cicotti, E. Laure, and S. Markidis. 2017. Exploring the performance benefit of hybrid memory system on HPC environments. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW\u201917). 
683--692."},{"volume-title":"Proceedings of the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA\u201917)","author":"Prodromou Andreas","key":"e_1_2_1_54_1","unstructured":"Andreas Prodromou , Mitesh Meswani , Nuwan Jayasena , Gabriel Loh , and Dean M. Tullsen . 2017. MemPod: A clustered architecture for efficient and scalable migration in flat address space multi-level memories . In Proceedings of the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA\u201917) . 433--444. Andreas Prodromou, Mitesh Meswani, Nuwan Jayasena, Gabriel Loh, and Dean M. Tullsen. 2017. MemPod: A clustered architecture for efficient and scalable migration in flat address space multi-level memories. In Proceedings of the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA\u201917). 433--444."},{"volume-title":"Proceedings of the 21st International Symposium on High Performance Computer Architecture (HPCA\u201915)","author":"Meswani Mitesh R.","key":"e_1_2_1_55_1","unstructured":"Mitesh R. Meswani , Sergey Blagodurov , David Roberts , John Slice , Mike Ignatowski , and Gabriel H. Loh . 2015. Heterogeneous memory architectures: A HW\/SW approach for mixing die-stacked and off-package memories . In Proceedings of the 21st International Symposium on High Performance Computer Architecture (HPCA\u201915) . 126--136. Mitesh R. Meswani, Sergey Blagodurov, David Roberts, John Slice, Mike Ignatowski, and Gabriel H. Loh. 2015. Heterogeneous memory architectures: A HW\/SW approach for mixing die-stacked and off-package memories. In Proceedings of the 21st International Symposium on High Performance Computer Architecture (HPCA\u201915). 
126--136."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555801"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2003.1253181"},{"volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation (PACT\u201914)","author":"Sartor Jennifer B.","key":"e_1_2_1_58_1","unstructured":"Jennifer B. Sartor , Wim Heirman , Stephen M. Blackburn , Lieven Eeckhout , and Kathryn S . McKinley. 2014. Cooperative cache scrubbing . In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT\u201914) . 15--26. Jennifer B. Sartor, Wim Heirman, Stephen M. Blackburn, Lieven Eeckhout, and Kathryn S. McKinley. 2014. Cooperative cache scrubbing. In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT\u201914). 15--26."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2492101.1555372"},{"volume-title":"Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA\u201913)","author":"Shahriyar Rifat","key":"e_1_2_1_60_1","unstructured":"Rifat Shahriyar , Stephen M. Blackburn , Xi Yang , and Kathryn S . McKinley. 2013. Taking off the gloves with reference counting Immix . In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA\u201913) . 93--110. Rifat Shahriyar, Stephen M. Blackburn, Xi Yang, and Kathryn S. McKinley. 2013. Taking off the gloves with reference counting Immix. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA\u201913). 
93--110."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.31"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.56"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694348"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389100"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/800020.808261"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/111186.116734"},{"volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911)","author":"Villavieja Carlos","key":"e_1_2_1_67_1","unstructured":"Carlos Villavieja , Vasileios Karakostas , Lluis Vilanova , Yoav Etsion , Alex Ramirez , Avi Mendelson , Nacho Navarro , Adrian Cristal , and Osman S. Unsal . 2011. DiDi: Mitigating the performance impact of TLB shootdowns using a shared TLB directory . In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911) . 340--349. Carlos Villavieja, Vasileios Karakostas, Lluis Vilanova, Yoav Etsion, Alex Ramirez, Avi Mendelson, Nacho Navarro, Adrian Cristal, and Osman S. Unsal. 2011. DiDi: Mitigating the performance impact of TLB shootdowns using a shared TLB directory. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT\u201911). 340--349."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314650"},{"key":"e_1_2_1_69_1","volume-title":"Hosking","author":"Yang Xi","year":"2012","unstructured":"Xi Yang , Stephen M. Blackburn , Daniel Frampton , and Antony L . Hosking . 2012 . Barriers reconsidered, friendlier still! In Proceedings of the ACM SIGPLAN International Symposium on Memory Management (ISMM\u2019 12). 37--48. Xi Yang, Stephen M. Blackburn, Daniel Frampton, and Antony L. Hosking. 2012. Barriers reconsidered, friendlier still! 
In Proceedings of the ACM SIGPLAN International Symposium on Memory Management (ISMM\u201912). 37--48."},{"volume-title":"Proceedings of the ACM Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA\u201911)","author":"Yang Xi","key":"e_1_2_1_70_1","unstructured":"Xi Yang , Stephen M. Blackburn , Daniel Frampton , Jennifer B. Sartor , and Kathryn S . McKinley. 2011. Why nothing matters: The impact of zeroing . In Proceedings of the ACM Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA\u201911) . 307--324. Xi Yang, Stephen M. Blackburn, Daniel Frampton, Jennifer B. Sartor, and Kathryn S. McKinley. 2011. Why nothing matters: The impact of zeroing. In Proceedings of the ACM Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA\u201911). 307--324."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00036"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/1640089.1640116"}],"container-title":["ACM Transactions on Architecture and Code 
Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3431803","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3431803","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:46Z","timestamp":1750195486000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3431803"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,20]]},"references-count":72,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,3,31]]}},"alternative-id":["10.1145\/3431803"],"URL":"https:\/\/doi.org\/10.1145\/3431803","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2021,1,20]]},"assertion":[{"value":"2020-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}