{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:10:29Z","timestamp":1750306229821,"version":"3.41.0"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2016,11,1]],"date-time":"2016-11-01T00:00:00Z","timestamp":1477958400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61433019, U1435217, 61232003, 61502514, 61202121, 61402503, 61402501, 61120106005 and 61303073"],"award-info":[{"award-number":["61433019, U1435217, 61232003, 61502514, 61202121, 61402503, 61402501, 61120106005 and 61303073"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100013286","name":"Research Fund for the Doctoral Program of Higher Education of China","doi-asserted-by":"crossref","award":["20114307120013"],"award-info":[{"award-number":["20114307120013"]}],"id":[{"id":"10.13039\/501100013286","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National High Technology Research"},{"name":"Development 863 Program of China","award":["2015AA015305"],"award-info":[{"award-number":["2015AA015305"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Emerg. Technol. Comput. Syst."],"published-print":{"date-parts":[[2017,4,30]]},"abstract":"<jats:p>To address the high energy consumption issue of SRAM on GPUs, emerging Spin-Transfer Torque (STT-RAM) memory technology has been intensively studied to build GPU register files for better energy-efficiency, thanks to its benefits of low leakage power, high density, and good scalability. 
However, STT-RAM suffers from the read disturbance issue, which stems from the fact that the voltage difference between read current and write current becomes smaller as technology scales. The read disturbance leads to high error rates for read operations, which cannot be effectively protected by the SEC-DED ECC on large-capacity register files of GPUs.<\/jats:p>\n          <jats:p>\n            Prior schemes (e.g., read-restore) to mitigate the read disturbance usually incur either non-trivial performance loss or excessive energy overhead, thus not applicable for the GPU register file design that aims to achieve both high performance and energy-efficiency. To combat the read disturbance, we propose a novel software-hardware co-designed solution (i.e.,\n            <jats:italic>Red-Shield<\/jats:italic>\n            ), which consists of three optimizations to overcome the limitations of the existing solutions. First, we identify dead reads at compiling stage and augment instructions to avoid unnecessary restores. Second, we employ a small read buffer to accommodate register reads with high-access locality to further reduce restores. Third, we propose an adaptive restore mechanism to selectively pick the suitable restore scheme, according to the busy status of corresponding register banks. 
Experimental results show that our proposed design can effectively mitigate the performance loss and energy overhead caused by restore operations while still maintaining the reliability of reads.\n          <\/jats:p>","DOI":"10.1145\/2996191","type":"journal-article","created":{"date-parts":[[2016,11,1]],"date-time":"2016-11-01T13:45:50Z","timestamp":1478007950000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Shielding STT-RAM Based Register Files on GPUs against Read Disturbance"],"prefix":"10.1145","volume":"13","author":[{"given":"Hang","family":"Zhang","sequence":"first","affiliation":[{"name":"Key State Laboratory of High Performance Computing, College of Computer, National University of Defense Technology, Changsha, China"}]},{"given":"Xuhao","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha, China"}]},{"given":"Nong","family":"Xiao","sequence":"additional","affiliation":[{"name":"Key State Laboratory of High Performance Computing, College of Computer, National University of Defense Technology 8 School of Data and Computer Science, Sun Yat-sen University, Changsha, China"}]},{"given":"Lei","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha, China"}]},{"given":"Fang","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha, China"}]},{"given":"Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha, China"}]},{"given":"Zhiguang","family":"Chen","sequence":"additional","affiliation":[{"name":"Key State Laboratory of High Performance Computing, College of Computer, National University of Defense Technology, Changsha, 
China"}]}],"member":"320","published-online":{"date-parts":[[2016,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/523931"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2009.4919648"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2010.5650274"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2012.2224256"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2185930"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2010.158"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155675"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522331"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485952"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485964"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744785"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CICC.2008.4672056"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2015.7059054"},{"key":"e_1_2_1_15_1","unstructured":"NVIDIA. 2012. GPU Computing SDK. (2012). https:\/\/developer.nvidia.com."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2014.6835966"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2011.5763257"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555349.1555372"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2015.71"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/2132325.2132408"},{"key":"e_1_2_1_21_1","unstructured":"John A. Stratton Christopher Rodrigues I.-Jui Sung Nady Obeid Li-Wen Chang Nasser Anssari Geng Daniel Liu and Wen-Mei W. Hwu. 2012. 
Parboil: A revised benchmark suite for scientific and commercial throughput computing. IMPACT Technical Report."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2009.4798259"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155659"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333660.2333673"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sse.2010.11.032"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2700230"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2744769.2744908"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMAG.2009.2024325"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056056"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897937.2897989"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMAG.2012.2203589"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555759"}],"container-title":["ACM Journal on Emerging Technologies in Computing 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2996191","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2996191","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:10Z","timestamp":1750220590000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2996191"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,11]]},"references-count":31,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,4,30]]}},"alternative-id":["10.1145\/2996191"],"URL":"https:\/\/doi.org\/10.1145\/2996191","relation":{},"ISSN":["1550-4832","1550-4840"],"issn-type":[{"type":"print","value":"1550-4832"},{"type":"electronic","value":"1550-4840"}],"subject":[],"published":{"date-parts":[[2016,11]]},"assertion":[{"value":"2016-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}