{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:08:18Z","timestamp":1750306098591,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":88,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,10,14]],"date-time":"2017-10-14T00:00:00Z","timestamp":1507939200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,10,14]]},"DOI":"10.1145\/3123939.3123944","type":"proceedings-article","created":{"date-parts":[[2017,11,20]],"date-time":"2017-11-20T14:31:12Z","timestamp":1511188272000},"page":"532-545","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["BVF"],"prefix":"10.1145","author":[{"given":"Ang","family":"Li","sequence":"first","affiliation":[{"name":"Pacific Northwest National Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenfeng","family":"Zhao","sequence":"additional","affiliation":[{"name":"University of Minnesota"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuaiwen Leon","family":"Song","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,10,14]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Computer Architecture: A Quantitative Approach","author":"Patterson David A","year":"2011","unstructured":"David A Patterson . Computer Architecture: A Quantitative Approach . Elsevier , 2011 . David A Patterson. Computer Architecture: A Quantitative Approach. Elsevier, 2011."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815963"},{"key":"e_1_3_2_1_3_1","volume-title":"Life after Dennard and how I learned to love the picojoule. Keynote at MICRO","author":"Keckler S","year":"2011","unstructured":"S Keckler . Life after Dennard and how I learned to love the picojoule. Keynote at MICRO , 2011 . S Keckler. Life after Dennard and how I learned to love the picojoule. Keynote at MICRO, 2011."},{"key":"e_1_3_2_1_4_1","volume-title":"Exascale computing study: Technology challenges in achieving exascale systems","author":"Bergman Keren","year":"2008","unstructured":"Keren Bergman , Shekhar Borkar , Dan Campbell , William Carlson , William Dally , Monty Denneau , Paul Franzon , William Harrod , Kerry Hill , Jon Hiller , Exascale computing study: Technology challenges in achieving exascale systems . Defense Advanced Research Projects Agency Information Processing Techniques Office (DARPA IPTO) , Tech . Rep, 15, 2008 . Keren Bergman, Shekhar Borkar, Dan Campbell, William Carlson, William Dally, Monty Denneau, Paul Franzon, William Harrod, Kerry Hill, Jon Hiller, et al. Exascale computing study: Technology challenges in achieving exascale systems. Defense Advanced Research Projects Agency Information Processing Techniques Office (DARPA IPTO), Tech. Rep, 15, 2008."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063421"},{"key":"e_1_3_2_1_6_1","volume-title":"From Petascale to Exascale: R&D Challenges for HPC Simulation Environments ASC Exascale Workshop","author":"White Andy","year":"2011","unstructured":"Andy White . Exascale challenges: Applications, technologies, and co-design . In From Petascale to Exascale: R&D Challenges for HPC Simulation Environments ASC Exascale Workshop , 2011 . Andy White. Exascale challenges: Applications, technologies, and co-design. In From Petascale to Exascale: R&D Challenges for HPC Simulation Environments ASC Exascale Workshop, 2011."},{"key":"e_1_3_2_1_7_1","first-page":"22","volume-title":"HP Laboratories","author":"Muralimanohar Naveen","year":"2009","unstructured":"Naveen Muralimanohar , Rajeev Balasubramonian , and Norman P Jouppi . CACTI 6.0 : A tool to model large caches . HP Laboratories , pages 22 -- 31 , 2009 . Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P Jouppi. CACTI 6.0: A tool to model large caches. HP Laboratories, pages 22--31, 2009."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.22"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485948"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.54"},{"key":"e_1_3_2_1_11_1","volume-title":"IEEE 32nd International Conference on Computer Design, ICCD. IEEE","author":"Carballo Juan-Antonio","year":"2014","unstructured":"Juan-Antonio Carballo , Wei-Ting Jonas Chan , Paolo A Gargini , Andrew B Kahng , and Siddhartha Nath . ITRS 2.0 : Toward a re-framing of the Semiconductor Technology Roadmap . In IEEE 32nd International Conference on Computer Design, ICCD. IEEE , 2014 . Juan-Antonio Carballo, Wei-Ting Jonas Chan, Paolo A Gargini, Andrew B Kahng, and Siddhartha Nath. ITRS 2.0: Toward a re-framing of the Semiconductor Technology Roadmap. In IEEE 32nd International Conference on Computer Design, ICCD. IEEE, 2014."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2967938.2967951"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056052"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2007.917509"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISVLSI.2014.94"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.913744"},{"key":"e_1_3_2_1_17_1","volume-title":"IEEE International Reliability Physics Symposium. IEEE","author":"Tega Naoki","year":"2008","unstructured":"Naoki Tega , Hiroshi Miki , Masanao Yamaoka , Hitoshi Kume , Toshiyuki Mine , Takeshi Ishida , Yuki Mori , Renichi Yamada , and Kazuyoshi Torii . Impact of threshold voltage fluctuation due to random telegraph noise on scaled-down SRAM . In IEEE International Reliability Physics Symposium. IEEE , 2008 . Naoki Tega, Hiroshi Miki, Masanao Yamaoka, Hitoshi Kume, Toshiyuki Mine, Takeshi Ishida, Yuki Mori, Renichi Yamada, and Kazuyoshi Torii. Impact of threshold voltage fluctuation due to random telegraph noise on scaled-down SRAM. In IEEE International Reliability Physics Symposium. IEEE, 2008."},{"key":"e_1_3_2_1_18_1","volume-title":"Eight transistor (8T) write assist static random access memory (SRAM) cell","author":"Yang J.","year":"2015","unstructured":"J. Yang , H.K. Lin , J. Shen , Y. Li , and H. Chen . Eight transistor (8T) write assist static random access memory (SRAM) cell , 2015 . US Patent 9,183,922. J. Yang, H.K. Lin, J. Shen, Y. Li, and H. Chen. Eight transistor (8T) write assist static random access memory (SRAM) cell, 2015. US Patent 9,183,922."},{"key":"e_1_3_2_1_19_1","volume-title":"Alternating wordline connection in 8t cells for improving resiliency to multi-bit ser upsets","author":"Wuu J.J.","year":"2013","unstructured":"J.J. Wuu , D.R. Weiss , K.E. WILCOX, A.W. SCHAEFER, and K.V. UNDERHILL. Alternating wordline connection in 8t cells for improving resiliency to multi-bit ser upsets , 2013 . WO Patent App. PCT\/US 2012\/050,542. J.J. Wuu, D.R. Weiss, K.E. WILCOX, A.W. SCHAEFER, and K.V. UNDERHILL. Alternating wordline connection in 8t cells for improving resiliency to multi-bit ser upsets, 2013. WO Patent App. PCT\/US2012\/050,542."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/.2005.1469239"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000093"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750417"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2206781.2206846"},{"key":"e_1_3_2_1_24_1","volume-title":"IEEE Custom Integrated Circuits Conference, CICC. IEEE","author":"Mori Haruki","year":"2015","unstructured":"Haruki Mori , T Nakagawa , Y Kitahara , Y Kawamoto , K Takagi , S Yoshimoto , S Izumi , K Nii , H Kawaguchi , and M Yoshimoto . A 298-fJ\/writecycle 650-fJ\/readcycle 8T three-port SRAM in 28-nm FD-SOI process technology for image processor . In IEEE Custom Integrated Circuits Conference, CICC. IEEE , 2015 . Haruki Mori, T Nakagawa, Y Kitahara, Y Kawamoto, K Takagi, S Yoshimoto, S Izumi, K Nii, H Kawaguchi, and M Yoshimoto. A 298-fJ\/writecycle 650-fJ\/readcycle 8T three-port SRAM in 28-nm FD-SOI process technology for image processor. In IEEE Custom Integrated Circuits Conference, CICC. IEEE, 2015."},{"key":"e_1_3_2_1_25_1","volume-title":"Pseudo dual-port SRAM and a shared memory switch using multiple memory banks and a sideband memory","author":"Dama Jonathan","year":"2013","unstructured":"Jonathan Dama and Andrew Lines . Pseudo dual-port SRAM and a shared memory switch using multiple memory banks and a sideband memory , 2013 . US Patent 8,370,557. Jonathan Dama and Andrew Lines. Pseudo dual-port SRAM and a shared memory switch using multiple memory banks and a sideband memory, 2013. US Patent 8,370,557."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2013.2280310"},{"key":"e_1_3_2_1_28_1","volume-title":"HPL-2008-20","author":"Thoziyoor Shyamkumar","year":"2008","unstructured":"Shyamkumar Thoziyoor , Naveen Muralimanohar , Jung Ho Ahn, and Norman P Jouppi. CACTI 5.1. Technical report , HPL-2008-20 , HP Labs , 2008 . Shyamkumar Thoziyoor, Naveen Muralimanohar, Jung Ho Ahn, and Norman P Jouppi. CACTI 5.1. Technical report, HPL-2008-20, HP Labs, 2008."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2004.826581"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485964"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.50"},{"key":"e_1_3_2_1_32_1","volume-title":"Wu-chun Feng. Measuring and Modeling On-Chip Interconnect Power on Real Hardware. In International Symposium on Workload Characterization, IISWC. IEEE","author":"Adhinarayanan Vignesh","year":"2016","unstructured":"Vignesh Adhinarayanan , Indrani Paul , Joseph L. Greathouse , Wei Huang , Ashutosh Pattnaik , and Wu-chun Feng. Measuring and Modeling On-Chip Interconnect Power on Real Hardware. In International Symposium on Workload Characterization, IISWC. IEEE , 2016 . Vignesh Adhinarayanan, Indrani Paul, Joseph L. Greathouse, Wei Huang, Ashutosh Pattnaik, and Wu-chun Feng. Measuring and Modeling On-Chip Interconnect Power on Real Hardware. In International Symposium on Workload Characterization, IISWC. IEEE, 2016."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/956417.956579"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540729"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446064"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/92.365453"},{"key":"e_1_3_2_1_37_1","volume-title":"IEEE International Symposium on Circuits and Systems, ISCAS. IEEE","author":"Gu Ji","year":"2009","unstructured":"Ji Gu and Hui Guo . A segmental bus-invert coding method for instruction memory data bus power efficiency . In IEEE International Symposium on Circuits and Systems, ISCAS. IEEE , 2009 . Ji Gu and Hui Guo. A segmental bus-invert coding method for instruction memory data bus power efficiency. In IEEE International Symposium on Circuits and Systems, ISCAS. IEEE, 2009."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.18"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485934"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446063"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370870"},{"key":"e_1_3_2_1_42_1","volume-title":"CUDA SDK Code Samples","author":"NVIDIA.","year":"2015","unstructured":"NVIDIA. CUDA SDK Code Samples , 2015 . NVIDIA. CUDA SDK Code Samples, 2015."},{"key":"e_1_3_2_1_43_1","series-title":"Lecture Notes on the Status of IEEE, 754(94720--1776):11","volume-title":"IEEE standard 754 for binary floating-point arithmetic","author":"Kahan William","year":"1996","unstructured":"William Kahan . IEEE standard 754 for binary floating-point arithmetic . Lecture Notes on the Status of IEEE, 754(94720--1776):11 , 1996 . William Kahan. IEEE standard 754 for binary floating-point arithmetic. Lecture Notes on the Status of IEEE, 754(94720--1776):11, 1996."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2013.6509694"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33182-4_5"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/527072.822610"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2008.4771804"},{"key":"e_1_3_2_1_49_1","unstructured":"NVIDIA. NVIDIA Tesla P100: The Most Advanced Datacenter Accelerator Ever Built Featuring Pascal GP100 the World's Fastest GPU. http:\/\/images.nvidia.com\/content\/pdf\/tesla\/whitepaper\/pascal-architecture-whitepaper.pdf 2016.  NVIDIA. NVIDIA Tesla P100: The Most Advanced Datacenter Accelerator Ever Built Featuring Pascal GP100 the World's Fastest GPU. http:\/\/images.nvidia.com\/content\/pdf\/tesla\/whitepaper\/pascal-architecture-whitepaper.pdf 2016."},{"key":"e_1_3_2_1_50_1","unstructured":"NVIDIA. Parallel Thread Execution ISA. http:\/\/docs.nvidia.com\/cuda\/parallel-thread-execution.  NVIDIA. Parallel Thread Execution ISA. http:\/\/docs.nvidia.com\/cuda\/parallel-thread-execution."},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360154"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.5555\/956417.956561"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.6"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2009.29"},{"key":"e_1_3_2_1_55_1","volume-title":"Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks. arXiv preprint arXiv:1705.01626","author":"Rhu Minsoo","year":"2017","unstructured":"Minsoo Rhu , Mike O'Connor , Niladrish Chatterjee , Jeff Pool , and Stephen W Keckler . Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks. arXiv preprint arXiv:1705.01626 , 2017 . Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, and Stephen W Keckler. Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks. arXiv preprint arXiv:1705.01626, 2017."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2465022"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522330"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370864"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750399"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1950.tb00463.x"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037709"},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807606"},{"key":"e_1_3_2_1_63_1","unstructured":"Cadence. Virtuoso Analog Design Environment.  Cadence. Virtuoso Analog Design Environment."},{"key":"e_1_3_2_1_64_1","unstructured":"Cadence. Virtuoso Multi-Mode Simulation with Spectre Platform.  Cadence. Virtuoso Multi-Mode Simulation with Spectre Platform."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2009.4919648"},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_3_2_1_67_1","volume-title":"Geng Daniel Liu, and Wen-mei W Hwu. Parboil: A revised benchmark suite for scientific and commercial throughput computing","author":"Stratton John A","year":"2012","unstructured":"John A Stratton , Christopher Rodrigues , I- Jui Sung , Nady Obeid , Li-Wen Chang , Nasser Anssari , Geng Daniel Liu, and Wen-mei W Hwu. Parboil: A revised benchmark suite for scientific and commercial throughput computing . Center for Reliable and High-Performance Computing , 127, 2012 . John A Stratton, Christopher Rodrigues, I-Jui Sung, Nady Obeid, Li-Wen Chang, Nasser Anssari, Geng Daniel Liu, and Wen-mei W Hwu. Parboil: A revised benchmark suite for scientific and commercial throughput computing. Center for Reliable and High-Performance Computing, 127, 2012."},{"key":"e_1_3_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/1735688.1735702"},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2009.4919639"},{"key":"e_1_3_2_1_70_1","volume-title":"Innovative Parallel Computing, InPar","author":"Grauer-Gray Scott","year":"2012","unstructured":"Scott Grauer-Gray , Lifan Xu , Robert Searles , Sudhee Ayalasomayajula , and John Cavazos . Auto-tuning a high-level language targeted to GPU codes . In Innovative Parallel Computing, InPar . IEEE , 2012 . Scott Grauer-Gray, Lifan Xu, Robert Searles, Sudhee Ayalasomayajula, and John Cavazos. Auto-tuning a high-level language targeted to GPU codes. In Innovative Parallel Computing, InPar. IEEE, 2012."},{"key":"e_1_3_2_1_71_1","volume-title":"Excerpt from: Silicon Processing for the VLSI Era-vol. 4","author":"Wolf Stanley","year":"2004","unstructured":"Stanley Wolf . Excerpt from: Silicon Processing for the VLSI Era-vol. 4 . 2004 . Stanley Wolf. Excerpt from: Silicon Processing for the VLSI Era-vol. 4. 2004."},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155656"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2011.24"},{"key":"e_1_3_2_1_74_1","volume-title":"Symposium on VLSI Circuits. IEEE","author":"Chun Ki Chul","year":"2009","unstructured":"Ki Chul Chun , Pulkit Jain , Jung Hwa Lee , and Chris H Kim . A sub-0.9 V logic-compatible embedded DRAM with boosted 3T gain cell, regulated bit-line write scheme and PVT-tracking read reference bias . In Symposium on VLSI Circuits. IEEE , 2009 . Ki Chul Chun, Pulkit Jain, Jung Hwa Lee, and Chris H Kim. A sub-0.9 V logic-compatible embedded DRAM with boosted 3T gain cell, regulated bit-line write scheme and PVT-tracking read reference bias. In Symposium on VLSI Circuits. IEEE, 2009."},{"key":"e_1_3_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000094"},{"key":"e_1_3_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485952"},{"key":"e_1_3_2_1_77_1","volume-title":"IEEE International Solid-State Circuits Conference-Digest of Technical Papers. IEEE","author":"Somasekhar Dinesh","year":"2008","unstructured":"Dinesh Somasekhar , Yibin Ye , Paolo Aseron , Shih-Lien Lu , Muhammad Khellah , Jason Howard , Greg Ruhl , Tanay Karnik , Shekhar Y Borkar , Vivek De , Hz 2Mb 2T gain-cell memory macro with 128GB\/s bandwidth in a 65nm logic process . In IEEE International Solid-State Circuits Conference-Digest of Technical Papers. IEEE , 2008 . Dinesh Somasekhar, Yibin Ye, Paolo Aseron, Shih-Lien Lu, Muhammad Khellah, Jason Howard, Greg Ruhl, Tanay Karnik, Shekhar Y Borkar, Vivek De, et al. 2GHz 2Mb 2T gain-cell memory macro with 128GB\/s bandwidth in a 65nm logic process. In IEEE International Solid-State Circuits Conference-Digest of Technical Papers. IEEE, 2008."},{"key":"e_1_3_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126931"},{"key":"e_1_3_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/VLSIC.2006.1705371"},{"key":"e_1_3_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522337"},{"key":"e_1_3_2_1_81_1","volume-title":"Murali Annavaram. Pilot Register File: Energy Efficient Partitioned Register File for GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE","author":"Abdel-Majeed Mohammad","year":"2017","unstructured":"Mohammad Abdel-Majeed , Alireza Shafaei , Hyeran Jeon , Massoud Pedram , and Murali Annavaram. Pilot Register File: Energy Efficient Partitioned Register File for GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE , 2017 . Mohammad Abdel-Majeed, Alireza Shafaei, Hyeran Jeon, Massoud Pedram, and Murali Annavaram. Pilot Register File: Energy Efficient Partitioned Register File for GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE, 2017."},{"key":"e_1_3_2_1_82_1","volume-title":"Nam Sung Kim. G-Scalar: Cost-Effective Generalized Scalar Execution Architecture for Power-Efficient GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE","author":"Liu Zhenhong","year":"2017","unstructured":"Zhenhong Liu , Syed Gilani , Murali Annavaram , and Nam Sung Kim. G-Scalar: Cost-Effective Generalized Scalar Execution Architecture for Power-Efficient GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE , 2017 . Zhenhong Liu, Syed Gilani, Murali Annavaram, and Nam Sung Kim. G-Scalar: Cost-Effective Generalized Scalar Execution Architecture for Power-Efficient GPUs. In 23rd IEEE International Symposium on High Performance Computer Architecture, HPCA. IEEE, 2017."},{"key":"e_1_3_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540717"},{"key":"e_1_3_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/2786572.2786596"},{"key":"e_1_3_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.68"},{"key":"e_1_3_2_1_86_1","first-page":"980","author":"Whitepaper NVIDIA.","year":"2014","unstructured":"NVIDIA. Whitepaper : NVIDIA GeForce GTX 980 , 2014 . NVIDIA. Whitepaper: NVIDIA GeForce GTX 980, 2014.","journal-title":"NVIDIA GeForce GTX"},{"key":"e_1_3_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815998"},{"key":"e_1_3_2_1_88_1","volume-title":"22nd International Symposium on High Performance Computer Architecture, HPCA. IEEE","author":"Thomas Renji","year":"2016","unstructured":"Renji Thomas , Kristin Barber , Naser Sedaghati , Li Zhou , and Radu Teodorescu . Core Tunneling : Variation-aware voltage noise mitigation in GPUs . In 22nd International Symposium on High Performance Computer Architecture, HPCA. IEEE , 2016 . Renji Thomas, Kristin Barber, Naser Sedaghati, Li Zhou, and Radu Teodorescu. Core Tunneling: Variation-aware voltage noise mitigation in GPUs. In 22nd International Symposium on High Performance Computer Architecture, HPCA. IEEE, 2016."},{"key":"e_1_3_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750404"}],"event":{"name":"MICRO-50: The 50th Annual IEEE\/ACM International Symposium on Microarchitecture","sponsor":["SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing","IEEE-CS\\DATC IEEE Computer Society"],"location":"Cambridge Massachusetts","acronym":"MICRO-50"},"container-title":["Proceedings of the 50th Annual IEEE\/ACM International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3123939.3123944","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3123939.3123944","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:30Z","timestamp":1750217430000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3123939.3123944"}},"subtitle":["enabling significant on-chip power savings via bit-value-favor for throughput processors"],"short-title":[],"issued":{"date-parts":[[2017,10,14]]},"references-count":88,"alternative-id":["10.1145\/3123939.3123944","10.1145\/3123939"],"URL":"https:\/\/doi.org\/10.1145\/3123939.3123944","relation":{},"subject":[],"published":{"date-parts":[[2017,10,14]]},"assertion":[{"value":"2017-10-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}