{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:46:18Z","timestamp":1772725578147,"version":"3.50.1"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2017,12,5]],"date-time":"2017-12-05T00:00:00Z","timestamp":1512432000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2017,12,31]]},"abstract":"<jats:p>Compression techniques at the last-level cache and the DRAM play an important role in improving system performance by increasing their effective capacities. A compressed block in DRAM also reduces the transfer time over the memory bus to the caches, reducing the latency of a LLC cache miss. Usually, compression is achieved by exploiting data patterns present within a block. But applications can exhibit data locality that spread across multiple consecutive data blocks. We observe that there is significant opportunity available for compressing multiple consecutive data blocks into one single block, both at the LLC and DRAM. Our studies using 21 SPEC CPU applications show that, at the LLC, around 25% (on average) of the cache blocks can be compressed into one single cache block when grouped together in groups of 2 to 8 blocks. In DRAM, more than 30% of the columns residing in a single DRAM page can be compressed into one DRAM column, when grouped together in groups of 2 to 6. Motivated by these observations, we propose a mechanism, namely, MBZip, that compresses multiple data blocks into one single block (called a zipped block), both at the LLC and DRAM. At the cache, MBZip includes a simple tag structure to index into these zipped cache blocks and the indexing does not incur any redirectional delay. At the DRAM, MBZip does not need any changes to the address computation logic and works seamlessly with the conventional\/existing logic. MBZip is a synergistic mechanism that coordinates these zipped blocks at the LLC and DRAM. Further, we also explore silent writes at the DRAM and show that certain writes need not access the memory when blocks are zipped. MBZip improves the system performance by 21.9%, with a maximum of 90.3% on a 4-core system.<\/jats:p>","DOI":"10.1145\/3151033","type":"journal-article","created":{"date-parts":[[2017,12,6]],"date-time":"2017-12-06T21:23:15Z","timestamp":1512595395000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["MBZip"],"prefix":"10.1145","volume":"14","author":[{"given":"Raghavendra","family":"Kanakagiri","sequence":"first","affiliation":[{"name":"Indian Institute of Technology, Madras"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5971-9456","authenticated-orcid":false,"given":"Biswabandan","family":"Panda","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology, Kanpur"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Madhu","family":"Mutyam","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology, Madras"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,12,5]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"Retrieved","author":"Intel","year":"2017"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.452.0287"},{"key":"e_1_2_2_4_1","volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture (ISCA\u201904)","author":"Alaa"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830823"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/2665671.2665696"},{"key":"e_1_2_2_7_1","volume-title":"Proceedings of the International Symposium on High Performance Computer Architecture (HPCA\u201914)","author":"Shafiee A."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2872362.2872377"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2009.2020989"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2150976.2151007"},{"key":"e_1_2_2_12_1","volume-title":"IEEE 34th International Conference on Computer Design (ICCD\u201916)","author":"Deb A."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.6"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2008.44"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.36"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.4"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.869367"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1186736.1186737"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.37"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807659"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2045364.2045383"},{"key":"e_1_2_2_22_1","volume-title":"IEEE International Parallel and Distributed Processing Symposium (IPDPS\u201917)","author":"Lal S."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2016.2619348"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339678"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360133"},{"key":"e_1_2_2_26_1","volume-title":"22nd Asia and South Pacific Design Automation Conference (ASP-DAC\u201917)","author":"Liu L."},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2896377.2901498"},{"key":"e_1_2_2_28_1","volume-title":"Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS\u201901)","author":"Luo K."},{"key":"e_1_2_2_29_1","volume-title":"Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA\u201912)","author":"Manikantan R"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830828"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3050440"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750377"},{"key":"e_1_2_2_33_1","volume-title":"49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916)","author":"Panda B."},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2999538"},{"key":"e_1_2_2_35_1","volume-title":"IEEE International Symposium on High Performance Computer Architecture (HPCA\u201916)","author":"Pekhimenko G."},{"key":"e_1_2_2_36_1","volume-title":"IEEE 21st International Symposium on High Performance Computer Architecture (HPCA\u201915)","author":"Pekhimenko G."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540724"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370870"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000073"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.41"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2976740"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540715"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370864"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379244"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597655"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750399"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/360128.360154"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080243"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2808233"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3151033","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3151033","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:13:39Z","timestamp":1750212819000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3151033"}},"subtitle":["Multiblock Data Compression"],"short-title":[],"issued":{"date-parts":[[2017,12,5]]},"references-count":48,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,12,31]]}},"alternative-id":["10.1145\/3151033"],"URL":"https:\/\/doi.org\/10.1145\/3151033","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,12,5]]},"assertion":[{"value":"2016-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-12-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}