{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:17:28Z","timestamp":1763468248182,"version":"3.41.0"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2015,1,9]],"date-time":"2015-01-09T00:00:00Z","timestamp":1420761600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2015,1,9]]},"abstract":"<jats:p>Many modern high-performance processors prefetch blocks into the on-chip cache. Prefetched blocks can potentially pollute the cache by evicting more useful blocks. In this work, we observe that both accurate and inaccurate prefetches lead to cache pollution, and propose a comprehensive mechanism to mitigate prefetcher-caused cache pollution.<\/jats:p>\n          <jats:p>First, we observe that over 95% of useful prefetches in a wide variety of applications are not reused after the first demand hit (in secondary caches). Based on this observation, our first mechanism simply demotes a prefetched block to the lowest priority on a demand hit. Second, to address pollution caused by inaccurate prefetches, we propose a self-tuning prefetch accuracy predictor to predict if a prefetch is accurate or inaccurate. Only predicted-accurate prefetches are inserted into the cache with a high priority.<\/jats:p>\n          <jats:p>Evaluations show that our final mechanism, which combines these two ideas, significantly improves performance compared to both the baseline LRU policy and two state-of-the-art approaches to mitigating prefetcher-caused cache pollution (up to 49%, and 6% on average for 157 two-core multiprogrammed workloads). 
The performance improvement is consistent across a wide variety of system configurations.<\/jats:p>","DOI":"10.1145\/2677956","type":"journal-article","created":{"date-parts":[[2015,1,12]],"date-time":"2015-01-12T20:02:10Z","timestamp":1421092930000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":35,"title":["Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks"],"prefix":"10.1145","volume":"11","author":[{"given":"Vivek","family":"Seshadri","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh PA"}]},{"given":"Samihan","family":"Yedkar","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh PA"}]},{"given":"Hongyi","family":"Xin","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh PA"}]},{"given":"Onur","family":"Mutlu","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh PA"}]},{"given":"Phillip B.","family":"Gibbons","sequence":"additional","affiliation":[{"name":"Intel Pittsburgh, Pittsburgh PA"}]},{"given":"Michael A.","family":"Kozuch","sequence":"additional","affiliation":[{"name":"Intel Pittsburgh, Pittsburgh PA"}]},{"given":"Todd C.","family":"Mowry","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh PA"}]}],"member":"320","published-online":{"date-parts":[[2015,1,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2007.346200"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540735"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/777412.777431"},{"key":"e_1_2_1_4_1","volume-title":"Retrieved","author":"AMD.","year":"2012","unstructured":"AMD. 2012 . AMD Phenom II processor model . Retrieved November 11, 2014 from http:\/\/www.amd.com\/en-us\/products\/processors\/desktop\/phenom-ii. (2012). AMD. 
2012. AMD Phenom II processor model. Retrieved November 11, 2014 from http:\/\/www.amd.com\/en-us\/products\/processors\/desktop\/phenom-ii. (2012)."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.381947"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.36"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/223587.223608"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669164"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.395402"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669150"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.43"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000081"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669154"},{"key":"e_1_2_1_14_1","volume-title":"Patt","author":"Ebrahimi Eiman","year":"2009","unstructured":"Eiman Ebrahimi , Onur Mutlu , and Yale N . Patt . 2009 a. Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems. In HPCA. Eiman Ebrahimi, Onur Mutlu, and Yale N. Patt. 2009a. Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems. In HPCA."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2008.44"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339660"},{"key":"e_1_2_1_17_1","unstructured":"Zhigang Hu Stefanos Kaxiras and Margaret Martonosi. 2002. Timekeeping in the memory system: predicting and optimizing memory behavior. In ISCA.   Zhigang Hu Stefanos Kaxiras and Margaret Martonosi. 2002. Timekeeping in the memory system: predicting and optimizing memory behavior. In ISCA."},{"key":"e_1_2_1_18_1","unstructured":"Intel. 2006. Inside Intel Core microarchitecture and smart memory access. Intel White Paper.  Intel. 2006. 
Inside Intel Core microarchitecture and smart memory access. Intel White Paper."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.52"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815971"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2010.38"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Georgios Keramidas Pavlos Petoumenos and Stefanos Kaxiras. 2007. Cache replacement based on reuse-distance prediction. In ICCD.  Georgios Keramidas Pavlos Petoumenos and Stefanos Kaxiras. 2007. Cache replacement based on reuse-distance prediction. In ICCD.","DOI":"10.1109\/ICCD.2007.4601909"},{"key":"e_1_2_1_24_1","volume-title":"Jimenez","author":"Khan Samira","year":"2014","unstructured":"Samira Khan , Alaa R. Alameldeen , Chris Wilkerson , Onur Mutlu , and Daniel A . Jimenez . 2014 . Improving cache performance using read-write partitioning. In HPCA. Samira Khan, Alaa R. Alameldeen, Chris Wilkerson, Onur Mutlu, and Daniel A. Jimenez. 2014. Improving cache performance using read-write partitioning. In HPCA."},{"key":"e_1_2_1_25_1","volume-title":"Jimenez","author":"Khan Samira Manabi","year":"2010","unstructured":"Samira Manabi Khan , Yingying Tian , and Daniel A . Jimenez . 2010 . Sampling dead block prediction for last-level caches. In MICRO. Samira Manabi Khan, Yingying Tian, and Daniel A. Jimenez. 2010. Sampling dead block prediction for last-level caches. In MICRO."},{"key":"e_1_2_1_26_1","volume-title":"ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In HPCA.","author":"Kim Yoongu","year":"2010","unstructured":"Yoongu Kim , Dongsu Han , Onur Mutlu , and Mor Harchol-Balter . 2010 a. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In HPCA. Yoongu Kim, Dongsu Han, Onur Mutlu, and Mor Harchol-Balter. 2010a. 
ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In HPCA."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.51"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/379240.379259"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.516.0639"},{"key":"e_1_2_1_30_1","volume-title":"Patt","author":"Lee Chang Joo","year":"2008","unstructured":"Chang Joo Lee , Onur Mutlu , Veynu Narasiman , and Yale N . Patt . 2008 . Prefetch-aware DRAM controllers. In MICRO. Chang Joo Lee, Onur Mutlu, Veynu Narasiman, and Yale N. Patt. 2008. Prefetch-aware DRAM controllers. In MICRO."},{"key":"e_1_2_1_31_1","volume-title":"Patt","author":"Lee Chang Joo","year":"2009","unstructured":"Chang Joo Lee , Veynu Narasiman , Onur Mutlu , and Yale N . Patt . 2009 . Improving memory bank-level parallelism in the presence of prefetching. In MICRO. Chang Joo Lee, Veynu Narasiman, Onur Mutlu, and Yale N. Patt. 2009. Improving memory bank-level parallelism in the presence of prefetching. In MICRO."},{"key":"e_1_2_1_32_1","unstructured":"Wei-Fen Lin Steven K. Reinhardt and Doug Burger. 2001a. Reducing DRAM latencies with an integrated memory hierarchy design. In HPCA.   Wei-Fen Lin Steven K. Reinhardt and Doug Burger. 2001a. Reducing DRAM latencies with an integrated memory hierarchy design. In HPCA."},{"key":"e_1_2_1_33_1","volume-title":"Puzak","author":"Lin Wei-Fen","year":"2001","unstructured":"Wei-Fen Lin , Steven K. Reinhardt , Doug Burger , and Thomas R . Puzak . 2001 b. Filtering superfluous prefetches using density vectors. In ICCD. Wei-Fen Lin, Steven K. Reinhardt, Doug Burger, and Thomas R. Puzak. 2001b. Filtering superfluous prefetches using density vectors. In ICCD."},{"key":"e_1_2_1_34_1","unstructured":"Kun Luo Jayanth Gummaraju and Manoj Franklin. 2001. Balancing throughput and fairness in SMT processors. In ISPASS.  Kun Luo Jayanth Gummaraju and Manoj Franklin. 2001. 
Balancing throughput and fairness in SMT processors. In ISPASS."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2004.4"},{"key":"e_1_2_1_36_1","unstructured":"Oracle. 2011. Oracle\u2019s Sparc T4 server architecture. Oracle White Paper.  Oracle. 2011. Oracle\u2019s Sparc T4 server architecture. Oracle White Paper."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/224056.224064"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370870"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250709"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2006.5"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.14"},{"key":"e_1_2_1_42_1","volume-title":"Retrieved","author":"Seshadri Vivek","year":"2014","unstructured":"Vivek Seshadri . 2014 . Source code for Mem-Sim . Retrieved November 11, 2014 from www.ece.cmu.edu\/~safari\/tools.html. (2014). Vivek Seshadri. 2014. Source code for Mem-Sim. Retrieved November 11, 2014 from www.ece.cmu.edu\/~safari\/tools.html. (2014)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370868"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/165123.165152"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605403"},{"key":"e_1_2_1_46_1","unstructured":"Jaewoong Sim Jaekyu Lee Moinuddin K. Qureshi and Hyesoon Kim. 2012. FLEXclusion: balancing cache capacity and on-chip bandwidth via flexible exclusion. In ISCA.   Jaewoong Sim Jaekyu Lee Moinuddin K. Qureshi and Hyesoon Kim. 2012. FLEXclusion: balancing cache capacity and on-chip bandwidth via flexible exclusion. 
In ISCA."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379244"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2007.346185"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/L-CA.2011.1"},{"key":"e_1_2_1_50_1","volume-title":"Retrieved","author":"VIA.","year":"2005","unstructured":"VIA. 2005 . VIA C7 Processor . Retrieved November 11, 2014 from http:\/\/www.via.com.tw\/en\/products\/processors\/c7\/. (2005). VIA. 2005. VIA C7 Processor. Retrieved November 11, 2014 from http:\/\/www.via.com.tw\/en\/products\/processors\/c7\/. (2005)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155672"},{"key":"e_1_2_1_52_1","doi-asserted-by":"crossref","unstructured":"Xiaotong Zhuang and Hsien-Hsin S. Lee. 2003. A hardware-based cache pollution filtering mechanism for aggressive prefetches. In ICPP.  Xiaotong Zhuang and Hsien-Hsin S. Lee. 2003. A hardware-based cache pollution filtering mechanism for aggressive prefetches. 
In ICPP.","DOI":"10.1109\/ICPP.2003.1240591"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2677956","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2677956","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T07:19:44Z","timestamp":1750231184000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2677956"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,1,9]]},"references-count":51,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2015,1,9]]}},"alternative-id":["10.1145\/2677956"],"URL":"https:\/\/doi.org\/10.1145\/2677956","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2015,1,9]]},"assertion":[{"value":"2014-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-01-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}