{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T05:18:47Z","timestamp":1672291127754},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2007,3]]},"abstract":"<jats:p>\n            As transistors keep shrinking and on-chip caches keep growing, static power dissipation resulting from leakage of caches takes an increasing fraction of total power in processors. Several techniques have already been proposed to reduce leakage power by turning off unused cache lines. However, they all have to pay the price of performance degradation. This paper presents a cache architecture, the\n            <jats:italic>snug set-associative<\/jats:italic>\n            (\n            <jats:italic>SSA<\/jats:italic>\n            ) cache, that cuts most of the static power dissipation of caches without incurring performance penalties. The SSA cache reduces leakage power by implementing the\n            <jats:italic>minimum set-associative<\/jats:italic>\n            scheme, which only activates the minimal number of ways in each cache set, while the performance losses caused by this scheme are compensated by the\n            <jats:italic>base-offset load\/store queues<\/jats:italic>\n            . The rationale for combining these two techniques is locality: as the contents of the cache blocks in the current working set are repeatedly accessed, the same addresses are computed again and again. The SSA cache architecture can be applied to data and instruction caches to reduce leakage power without incurring performance penalties. Experimental results show that SSA can cut static power consumption of the L1 data cache by 93%, on average, for SPECint2000 benchmarks, while the execution times are reduced by 5%. 
Similarly, SSA can cut leakage dissipation of the L1 instruction cache by 92%, on average, and improve performance by over 3%. Furthermore, when SSA is adopted for both L1 data and instruction caches, the normalized leakage of L1 data and instruction caches is lowered to 8%, on average, while still accomplishing a 2% reduction in execution times.\n          <\/jats:p>","DOI":"10.1145\/1216544.1216549","type":"journal-article","created":{"date-parts":[[2007,4,5]],"date-time":"2007-04-05T19:20:08Z","timestamp":1175800808000},"page":"6","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Snug set-associative caches"],"prefix":"10.1145","volume":"4","author":[{"given":"Yuan-Shin","family":"Hwang","sequence":"first","affiliation":[{"name":"National Taiwan Ocean University, Keelung, Taiwan"}]},{"given":"Jia-Jhe","family":"Li","sequence":"additional","affiliation":[{"name":"National Tsing Hua University, Hsinchu, Taiwan"}]}],"member":"320","published-online":{"date-parts":[[2007,3]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/320080.320119"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1023833.1023852"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982917"},{"key":"e_1_2_1_4_1","volume-title":"The First Watson Conference on Interaction between Architecture, Circuits, and Compilers (p &equals; ac2 Conference).","author":"Baugh L."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.782564"},{"key":"e_1_2_1_6_1","unstructured":"Burger D. C. Goodman J. R. and K\u00e4gi A. 1995. The declining effectiveness of dynamic caching for general-purpose microprocessors. Technical report 1261 University of Wisconsin-Madison Computer Science Department."},
{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360148"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 30th annual ACM\/IEEE international symposium on Microarchitecture. IEEE Computer Society, Washington, D.C. 292--302","author":"Dean J."},{"key":"e_1_2_1_9_1","first-page":"42","article-title":"Transistor elements for 30nm physical gate lengths and beyond","volume":"6","author":"Doyle B.","year":"2002","journal-title":"Intel Technology Journal"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 29th Annual International Symposium on Computer Architecture. IEEE Computer Society, Washington, D.C. 148--157","author":"Flautner K."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of 2001 International Conference on Computer Design. IEEE Computer Society, Washington, D.C. 276--283","author":"Hanson H."},{"key":"e_1_2_1_12_1","volume-title":"Computer Architecture: A Quantitative Approach","author":"Hennessy J. L.","year":"2003","edition":"3"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/871506.871606"},{"key":"e_1_2_1_14_1","unstructured":"Intel Corporation. 2001. Intel XScale Microarchitecture. Intel Corporation Santa Clara CA."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/379240.379268"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1077603.1077617"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.755465"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 35th annual ACM\/IEEE International Symposium on Microarchitecture. IEEE Computer Society Press, Washington, D.C. 219--230","author":"Kim N. 
S."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSL.2003.821550"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013254"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013273"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.23"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/871506.871569"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 36th Annual ACM\/IEEE International Symposium on Microarchitecture. IEEE Computer Society, Washington, D.C. 411--422","author":"Park I."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2004.10020"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 34th annual ACM\/IEEE International Symposium on Microarchitecture. IEEE Computer Society, Washington, D.C. 54--65","author":"Powell M. D."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 15th Annual International Symposium on Computer architecture. IEEE Computer Society Press, Washington, D.C. 290--298","author":"Prybylski S."},{"key":"e_1_2_1_28_1","volume-title":"SPEC CPU2000 v1.1.","author":"Standard Performance Evaluation Corporation","year":"2000"},{"key":"e_1_2_1_29_1","unstructured":"Tanenbaum A. S. 1999. Structured Computer Organization 4th ed. Prentice-Hall Englewood Cliffs NJ."},{"key":"e_1_2_1_30_1","unstructured":"Transmeta Corporation. 2004. Crusoe Processor Model TM5700\/TM5900 Data Book."},{"key":"e_1_2_1_31_1","volume-title":"C. SPEC 2000 binaries. 
http:\/\/www.eecs.umich.edu\/~chriswea\/benchmarks\/spec","author":"Weaver","year":"2000"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013270"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/369028.369059"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859635"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013272"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/860176.860181"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1216544.1216549","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T20:51:35Z","timestamp":1672260695000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1216544.1216549"}},"subtitle":["Reducing leakage power of instruction and data caches with no performance penalties"],"short-title":[],"issued":{"date-parts":[[2007,3]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,3]]}},"alternative-id":["10.1145\/1216544.1216549"],"URL":"https:\/\/doi.org\/10.1145\/1216544.1216549","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,3]]},"assertion":[{"value":"2007-03-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}