{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T13:15:16Z","timestamp":1758892516454,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3s","license":[{"start":{"date-parts":[[2014,3,1]],"date-time":"2014-03-01T00:00:00Z","timestamp":1393632000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2014,3]]},"abstract":"<jats:p>Improvements in semiconductor nanotechnology made chip multiprocessors the reference architecture for high-performance microprocessors. CMPs usually adopt large Last-Level Caches (LLC) shared among cores and private L1 caches, whose performances depend on the wire-delay dominated response time of LLC. NUCA (NonUniform Cache Architecture) caches represent a viable solution for tolerating wire-delay effects. In this article, we present Re-NUCA, a NUCA cache that exploits replication of blocks inside the LLC to avoid performance limitations of D-NUCA caches due to conflicting access to shared data. Results show that a Re-NUCA LLC permits to improve performances of more than 5% on average, and up to 15% for applications that strongly suffer from conflicting access to shared data, while reducing network traffic and power consumption with respect to D-NUCA caches. 
Besides, it outperforms different S-NUCA schemes optimized with victim replication.<\/jats:p>","DOI":"10.1145\/2566568","type":"journal-article","created":{"date-parts":[[2014,3,25]],"date-time":"2014-03-25T13:34:12Z","timestamp":1395754452000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Exploiting replication to improve performances of NUCA-based CMP systems"],"prefix":"10.1145","volume":"13","author":[{"given":"Pierfrancesco","family":"Foglia","sequence":"first","affiliation":[{"name":"Universit\u00e0 di Pisa, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marco","family":"Solinas","sequence":"additional","affiliation":[{"name":"Universit\u00e0 di Siena, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2014,3,28]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cad.2012.01.009"},{"volume-title":"Proceedings of the 15th International Symposium on High Performance Computer Architecture (HPCA'09)","author":"Awasthi M.","key":"e_1_2_1_2_1","unstructured":"M. Awasthi, K. Sudan, R. Balasubramonian, and J. B. Carter. 2009. Dynamic hardware-assisted software controlled page placement to manage capacity allocation and sharing with larger caches. 
In Proceedings of the 15th International Symposium on High Performance Computer Architecture (HPCA'09)."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327171.1327184"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-cdt.2008.0078"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSD.2008.52"},{"volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'09)","author":"Bardine A.","key":"e_1_2_1_6_1","unstructured":"A. Bardine, M. Comparetti, P. Foglia, G. Gabrielli, and C. A. Prete. 2009. A power-efficient migration mechanism for d-nuca caches. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE'09). European Design and Automation Association, 598--601."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJHPSA.2010.034542"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2012.2231949"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/SBAC-PAD.2010.20"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.21"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.10"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2006.17"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2009.4798258"},{"volume-title":"Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO'03)","author":"Chishti A.","key":"e_1_2_1_15_1","unstructured":"A. Chishti, M. D. Powell, and T. N. Vijaykumar. 2003. 
Distance associativity for high-performance energy-efficient non-uniform cache architectures. In Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO'03)."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1080695.1070001"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.31"},{"key":"e_1_2_1_18_1","unstructured":"M. Comparetti and P. Foglia. 2013. A workload independent energy reduction strategy for d-nuca caches. J. Supercomput. To appear."},{"key":"e_1_2_1_19_1","volume-title":"Interconnection Networks: An Engineering Approach","author":"Duato J.","year":"2003","unstructured":"J. Duato, S. Yalamanchili, and N. Lionel. 2003. Interconnection Networks: An Engineering Approach. Morgan Kaufmann Publishers, San Francisco, CA."},{"volume-title":"Proceedings of the 3rd Workshop on Embedded Systems for Real-Time Multimedia. 41--46","author":"Foglia P.","key":"e_1_2_1_20_1","unstructured":"P. Foglia, D. Mangano, and C. A. Prete. 2005. A nuca model for embedded systems cache design. In Proceedings of the 3rd Workshop on Embedded Systems for Real-Time Multimedia. 
41--46."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/SBAC-PAD.2009.12"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSD.2009.153"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSD.2010.41"},{"key":"e_1_2_1_24_1","unstructured":"GEMS. 2008. Wisconsin multifacet gems simulator. http:\/\/www.cs.wisc.edu\/gems\/."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.378997"},{"key":"e_1_2_1_26_1","unstructured":"D. Greenhill and J. Alabado. 2005. Power savings in the ultrasparc t1 processor. Sun Microsystem Whitepaper."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1378533.1378535"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92990-1_26"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555779"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.920580"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088154"},{"volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'09)","author":"Kahng A. B.","key":"e_1_2_1_32_1","unstructured":"A. B. Kahng, B. Li, L. Peh, and K. Samadi. 2009. ORION 2.0: A fast and accurate noc power and area model for early-stage design space exploration. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE'09). 
European Design and Automation Association, 423--428."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605420"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2005.35"},{"key":"e_1_2_1_35_1","unstructured":"K. Krewell. 1997. UltraSparc iv mirrors predecessors. Microprocessor rep. 1--3."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1384529.1375515"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2005.34"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1535\/itj.1002.02"},{"volume-title":"Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture. 1--10","author":"Merino J.","key":"e_1_2_1_39_1","unstructured":"J. Merino, V. Puente, and J. A. Gregorio. 2010. ESP-NUCA: A low-cost adaptive non-uniform cache architecture. In Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture. 1--10."},{"key":"e_1_2_1_40_1","unstructured":"MICRON. 2010. 1 gb ddr2 sdram module datasheet. http:\/\/www.micron.com."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.30"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/237090.237140"},{"key":"e_1_2_1_43_1","unstructured":"PTM. 2007. Predictive technology model (ptm). http:\/\/www.eas.asu.edu\/~ptm\/."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/1148882.1148884"},{"key":"e_1_2_1_45_1","unstructured":"VIRTUTEC. 2010. 
Virtutec simics. http:\/\/www.virtutech.com."},{"key":"e_1_2_1_46_1","unstructured":"N. Weste and D. Harris. 2010. CMOS VLSI Design: A Circuits and Systems Perspective 4th Ed. Addison-Wesley Publishing."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.223990"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.53"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2566568","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2566568","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T07:01:00Z","timestamp":1750230060000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2566568"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,3]]},"references-count":48,"journal-issue":{"issue":"3s","published-print":{"date-parts":[[2014,3]]}},"alternative-id":["10.1145\/2566568"],"URL":"https:\/\/doi.org\/10.1145\/2566568","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2014,3]]},"assertion":[{"value":"2012-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2014-03-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}