{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:46:50Z","timestamp":1750308410185,"version":"3.41.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,9,4]],"date-time":"2018-09-04T00:00:00Z","timestamp":1536019200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"European Research Council"},{"name":"MECCA project","award":["340328"],"award-info":[{"award-number":["340328"]}]},{"name":"Swedish National Infrastructure for Computing","award":["C3SE"],"award-info":[{"award-number":["C3SE"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2018,9,30]]},"abstract":"<jats:p>\n            Task-parallel programs inefficiently utilize the cache hierarchy due to the presence of dead blocks in caches. Dead blocks may occupy cache space in multiple cache levels for a long time without providing any utility until they are finally evicted. Existing dead-block prediction schemes take decisions\n            <jats:italic>locally<\/jats:italic>\n            for each cache level and do not efficiently manage the entire cache hierarchy. This article introduces\n            <jats:italic>runtime-orchestrated global dead-block management<\/jats:italic>\n            , in which static and dynamic information about tasks available to the runtime system is used to effectively detect and manage dead blocks across the cache hierarchy. In the proposed global management schemes, static information (e.g., when tasks start\/finish, and what data regions tasks produce\/consume) is combined with dynamic information to detect when\/where blocks become dead. 
When memory regions are deemed dead at some cache level(s), all the associated cache blocks are evicted from the corresponding level(s). We extend the cache controllers at both private and shared cache levels to use the aforementioned information to evict dead blocks. The article does an extensive evaluation of both inclusive and non-inclusive cache hierarchies and shows that the proposed global schemes outperform existing local dead-block management schemes.\n          <\/jats:p>","DOI":"10.1145\/3234337","type":"journal-article","created":{"date-parts":[[2018,9,4]],"date-time":"2018-09-04T12:37:30Z","timestamp":1536064650000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Global Dead-Block Management for Task-Parallel Programs"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9783-8357","authenticated-orcid":false,"given":"Madhavan","family":"Manivannan","sequence":"first","affiliation":[{"name":"Chalmers University of Technology, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miquel","family":"Peric\u00e1s","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vassilis","family":"Papaefstathiou","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Per","family":"Stenstr\u00f6m","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, 
Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,9,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1061267.1061271"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2015.26"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750411"},{"volume-title":"Proceedings of the European Conference on Parallel Processing (Euro-Par\u201902)","author":"Beyls Kristof","key":"e_1_2_1_4_1"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2004.09.004"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/301970.301974"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370832"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063384.2063454"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370860"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1248377.1248396"},{"volume-title":"Proceedings of the European Conference on Parallel Processing (Euro-Par\u201917)","year":"2017","author":"Dimi\u0107 
Vladimir","key":"e_1_2_1_11_1"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626411000151"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/277650.277725"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.17"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/603095.603119"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.52"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815971"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2003.1240592"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.817393"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/977091.977117"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/379240.379268"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2007.4601909"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.24"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2007.70816"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-27866-5_73"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/379240.379259"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2008.4771793"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2013.64"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446101"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.71"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.30"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2013.6704665"},{"key":"e_1_2_1_33_1","unstructured":"OpenMP Architecture Review Board. 2005. 
OpenMP API Version 2.5."},{"key":"e_1_2_1_34_1","unstructured":"OpenMP Architecture Review Board. 2013. OpenMP API Version 4.0."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807625"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2465443"},{"volume-title":"Proceedings of the IEEE International Conference on Cluster Computing (Cluster\u201908)","author":"Perez J. M.","key":"e_1_2_1_37_1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597674"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250709"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/277830.277941"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2628071.2628083"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/INTERACT.2005.7"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.16"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541228.2555290"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.14529\/jsfi140102"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2011.7"},{"volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT\u201902)","author":"Wang Zhenlin","key":"e_1_2_1_47_1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/107971.107981"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155671"},{"volume-title":"Proceedings of the Cache Replacement Championship (CRC\u201917)","year":"2017","author":"Young Vinson","key":"e_1_2_1_50_1"},{"volume-title":"Proceedings of the Annual Workshop on Duplicating, Deconstructing, and Debunking (WDDD\u201907)","year":"2007","author":"Zahran 
Mohamed","key":"e_1_2_1_51_1"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/1787275.1787315"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3234337","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3234337","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:49:05Z","timestamp":1750268945000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3234337"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,9,4]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,9,30]]}},"alternative-id":["10.1145\/3234337"],"URL":"https:\/\/doi.org\/10.1145\/3234337","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2018,9,4]]},"assertion":[{"value":"2017-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-09-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}