{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:17:33Z","timestamp":1750306653352,"version":"3.41.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2014,6,1]],"date-time":"2014-06-01T00:00:00Z","timestamp":1401580800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"(Spanish Gov. and European ERDF)","award":["TIN2010-21291-C02-01, DPI2011-25892, and TIN2012-34557"],"award-info":[{"award-number":["TIN2010-21291-C02-01, DPI2011-25892, and TIN2012-34557"]}]},{"name":"Consolider CSD2007-00050 (Spanish Gov.)"},{"name":"gaZ: T48 research group (Arag\u00f3n Gov. and European ESF)"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2014,6]]},"abstract":"<jats:p>Cache working-set adaptation is key as embedded systems move to multiprocessor and Simultaneous Multithreaded Architectures (SMT) because interthread pollution harms system performance and battery life. Light-Power NUCA (LP-NUCA) is a working-set adaptive cache that depends on temporal-locality to save energy. This work identifies the sources of energy waste in LP-NUCAs: parallel access to the tag and data arrays of the tiles and low locality phases with useless block migration. To counteract both issues, we prove that switching to serial access reduces energy without harming performance and propose a machine learning Adaptive Drop Rate (ADR) controller that minimizes the amount of replacement and migration when locality is low.<\/jats:p>\n          <jats:p>This work demonstrates that these techniques efficiently adapt the cache drop and access policies to save energy. They reduce LP-NUCA consumption 22.7% for 1SMT. 
With interthread cache contention in 2SMT, the savings rise to 29%. Versus a conventional organization, energy--delay improves 20.8% and 25% for 1- and 2SMT benchmarks, and, in 65% of the 2SMT mixes, gains are larger than 20%.<\/jats:p>","DOI":"10.1145\/2632217","type":"journal-article","created":{"date-parts":[[2014,7,1]],"date-time":"2014-07-01T14:23:02Z","timestamp":1404224582000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Revisiting LP-NUCA Energy Consumption"],"prefix":"10.1145","volume":"11","author":[{"given":"Dar\u00edo Su\u00e1rez","family":"Gracia","sequence":"first","affiliation":[{"name":"Qualcomm Research Silicon Valley, CA, USA"}]},{"given":"Alexandra","family":"Ferrer\u00f3n","sequence":"additional","affiliation":[{"name":"Universidad de Zaragoza and HiPEAC, Zaragoza, Spain"}]},{"given":"Luis Montesano Del","family":"Campo","sequence":"additional","affiliation":[{"name":"Universidad de Zaragoza, Zaragoza, Spain"}]},{"given":"Teresa Monreal","family":"Arnal","sequence":"additional","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya and HiPEAC, Barcelona, Spain"}]},{"given":"V\u00edctor Vi\u00f1als","family":"Y\u00fafera","sequence":"additional","affiliation":[{"name":"Universidad de Zaragoza and HiPEAC, Zaragoza, Spain"}]}],"member":"320","published-online":{"date-parts":[[2014,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2086696.2086698"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/320080.320119"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360153"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.10"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1941487.1941507"},{"volume-title":"Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitectures. 
IEEE Computer Society, 55","author":"Chishti Zeshan","key":"e_1_2_1_6_1","unstructured":"Zeshan Chishti, Michael D. Powell, and T. N. Vijaykumar. 2003. Distance associativity for high-performance energy-efficient non-uniform cache architectures. In Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitectures. IEEE Computer Society, 55."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1482619.1482620"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1287\/ijoc.1.3.190"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.25"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 1st JILP Workshop on Computer Architecture Competitions: Cache Replacement Championship. 1--4.","author":"Gao Hongliang","year":"2010","unstructured":"Hongliang Gao and Chris Wilkerson. 2010. Dueling segmented LRU replacement algorithm with adaptive bypassing. In Proceedings of the 1st JILP Workshop on Computer Architecture Competitions: Cache Replacement Championship. 1--4."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the Workshop on Multithreaded Execution, Architecture, and Compilation, 1--8.","author":"Garc\u00eda Montse","year":"2000","unstructured":"
Montse Garc\u00eda, Jos\u00e9 Gonz\u00e1lez, and Antonio Gonz\u00e1lez. 2000. Data caches for multithreaded processors. In Proceedings of the Workshop on Multithreaded Execution, Architecture, and Compilation, 1--8."},{"key":"e_1_2_1_12_1","volume-title":"Energy per instruction trends in Intel\u00ae microprocessors. Technology@","author":"Grochowski Ed","year":"2006","unstructured":"Ed Grochowski and Murali Annavaram. 2006. Energy per instruction trends in Intel\u00ae microprocessors. Technology@Intel Magazine 4, 3 (2006), 1--8."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of Workshop on Modeling, Benchmarking and Simulation, 1--8.","author":"Hamerly Greg","year":"2005","unstructured":"Greg Hamerly, Erez Perelman, Jeremy Lau, and Brad Calder. 2005. SimPoint 3.0: Faster and more flexible program analysis. In Proceedings of Workshop on Modeling, Benchmarking and Simulation, 1--8."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555779"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/545215.545239"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088154"},{"key":"e_1_2_1_20_1","volume-title":"Intel\u00ae Xeon\u00ae processor C5500\/C3500 Series. Datasheet--Volume 1. (February","author":"Embedded Intel","year":"2010","unstructured":"Intel Embedded. 2010. Intel\u00ae Xeon\u00ae processor C5500\/C3500 Series. Datasheet--Volume 1. (February 2010). 
http:\/\/edc.intel.com\/Link.aspx&excl;id=3179."},{"key":"e_1_2_1_21_1","unstructured":"Intel Software 2011. Bull Mountain: Software Implementation Guide. Intel Software."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815971"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/325164.325162"},{"volume-title":"Proceedings of International Symposium on Parallel Distributed Processing, 1--12","author":"Kedzierski K.","key":"e_1_2_1_24_1","unstructured":"K. Kedzierski, M. Moreto, F. J. Cazorla, and M. Valero. 2010. Adapting cache partitioning algorithms to pseudo-LRU replacement policies. In Proceedings of International Symposium on Parallel Distributed Processing, 1--12."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/74925.74941"},{"volume-title":"Proceedings of the IBM Austin Center for Advanced Studies Workshop, 1--12","author":"Kim Changkyu","key":"e_1_2_1_26_1","unstructured":"Changkyu Kim, Doug Burger, and Stephen W. Keckler. 2002. An adaptive cache structure for future high-performance systems. In Proceedings of the IBM Austin Center for Advanced Studies Workshop, 1--12."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.752659"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013251"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1762146.1762160"},{"key":"e_1_2_1_30_1","unstructured":"LSI Corporation. 2010. 
PowerPC\u2122 Processor (476FP) Embedded Core Product Brief. Available at http:\/\/www.lsi.com\/DistributionSystem\/AssetDocument\/PPC476FP-PB-v7.pdf."},{"key":"#cr-split#-e_1_2_1_31_1.1","unstructured":"MIPS Technologies. 2010. MIPS32\u00ae1004K\u2122"},{"key":"#cr-split#-e_1_2_1_31_1.2","unstructured":"Coherent Processing System (CPS). (2010). MIPS Technologies. 2010. MIPS32\u00ae1004K\u2122"},{"key":"#cr-split#-e_1_2_1_31_1.3","unstructured":"Coherent Processing System (CPS). (2010)."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2007.910967"},{"key":"e_1_2_1_34_1","volume-title":"Workshop on Multithreaded Execution, Architecture, and Compilation, 1--8.","author":"Nemirovsky Mario","year":"1998","unstructured":"Mario Nemirovsky and Wayne Yamamoto. 1998. Quantitative study of data caches on a multistreamed architecture. In Workshop on Multithreaded Execution, Architecture, and Compilation, 1--8."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-70550-5_3"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250713"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339685"},{"key":"e_1_2_1_38_1","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell S. J.","year":"2009","unstructured":"S. J. Russell and P. Norvig. 2009. Artificial Intelligence: A Modern Approach (3rd ed.). 
Prentice Hall.","edition":"3"},{"key":"e_1_2_1_39_1","volume-title":"Tullsen","author":"Sarkar Subhradyuti","year":"2011","unstructured":"Subhradyuti Sarkar and Dean M. Tullsen. 2011. Data layout for cache performance on a multithreaded arch. In Transactions on High-Performance Embedded Architectures and Compilers III. 43--68."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370868"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/1370998.1371004"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2011.2158249"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/SAMOS.2011.6045443"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.224449"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/232973.232993"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859635"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2632217","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2632217","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:56:13Z","timestamp":1750229773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2632217"}},"subtitle":["Cache Access Policies and Adaptive Block 
Dropping"],"short-title":[],"issued":{"date-parts":[[2014,6]]},"references-count":45,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2014,6]]}},"alternative-id":["10.1145\/2632217"],"URL":"https:\/\/doi.org\/10.1145\/2632217","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2014,6]]},"assertion":[{"value":"2013-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}