{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T03:08:11Z","timestamp":1771038491127,"version":"3.50.1"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2012,11,1]],"date-time":"2012-11-01T00:00:00Z","timestamp":1351728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Storage"],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:p>Replicating data off site is critical for disaster recovery reasons, but the current approach of transferring tapes is cumbersome and error prone. Replicating across a wide area network (WAN) is a promising alternative, but fast network connections are expensive or impractical in many remote locations, so improved compression is needed to make WAN replication truly practical. We present a new technique for replicating backup datasets across a WAN that not only eliminates duplicate regions of files (deduplication) but also compresses similar regions of files with delta compression, which is available as a feature of EMC Data Domain systems.<\/jats:p>\n          <jats:p>Our main contribution is an architecture that adds stream-informed delta compression to already existing deduplication systems and eliminates the need for new, persistent indexes. Unlike techniques based on knowing a file's version or that use a memory cache, our approach achieves delta compression across all data replicated to a server at any time in the past. 
From a detailed analysis of datasets and statistics from hundreds of customers using our product, we achieve an additional 2X compression from delta compression beyond deduplication and local compression, which enables customers to replicate data that would otherwise fail to complete within their backup window.<\/jats:p>","DOI":"10.1145\/2385603.2385606","type":"journal-article","created":{"date-parts":[[2012,12,5]],"date-time":"2012-12-05T18:55:48Z","timestamp":1354733748000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":99,"title":["WAN-optimized replication of backup datasets using stream-informed delta compression"],"prefix":"10.1145","volume":"8","author":[{"given":"Philip","family":"Shilane","sequence":"first","affiliation":[{"name":"Backup Recovery Systems Division, EMC Corporation"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mark","family":"Huang","sequence":"additional","affiliation":[{"name":"Backup Recovery Systems Division, EMC Corporation"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Grant","family":"Wallace","sequence":"additional","affiliation":[{"name":"Backup Recovery Systems Division, EMC Corporation"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Windsor","family":"Hsu","sequence":"additional","affiliation":[{"name":"Backup Recovery Systems Division, EMC Corporation"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2012,12,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1534530.1534539"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 17th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems.","author":"Bhagwat 
D."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1210596.1210599"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223855"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the Compression and Complexity of Sequences. 21","author":"Broder A.","year":"1997"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/647819.736184"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276781"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/266220.266223"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM '99)","author":"Chan M. C."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the International Conference on Information Technology: Coding and Computing. 778","author":"Chen Y."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the USENIX Annual Technical Conference.","author":"Debnath B."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 9th USENIX Conference on File and Storage Technologies.","author":"Dong W."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the USENIX Annual Technical Conference. 113--126","author":"Douglis F."},{"key":"e_1_2_1_14_1","unstructured":"EMC Corporation. 2010. Data Domain Boost Software. http:\/\/www.datadomain.com\/products\/dd-boost.html."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 5th USENIX Conference on File and Storage Technologies.","author":"Eshghi K."},{"key":"e_1_2_1_16_1","unstructured":"Gailly J. L. and Adler M. 2003. The GZIP compressor. http:\/\/www.gzip.org."},
{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the USENIX Annual Technical Conference.","author":"Guo F."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/279310.279321"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 4th USENIX Conference on File and Storage Technologies.","author":"Jain N."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the USENIX Annual Technical Conference. 59--72","author":"Kulkarni P."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 7th USENIX Conference on File and Storage Technologies. 111--123","author":"Lillibridge M."},{"key":"e_1_2_1_22_1","unstructured":"MacDonald J. 2000. File system support for delta compression. M.S. thesis Department of Electrical Engineering and Computer Science University of California Berkeley."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the USENIX Winter Technical Conference. 1--10","author":"Manber U.","year":"1994"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2010.263"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/263105.263162"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/502034.502052"},{"key":"e_1_2_1_27_1","first-page":"1","article-title":"Supporting practical content-addressable caching with CZIP compression","volume":"14","author":"Park K.","year":"2007","journal-title":"Proceedings of the USENIX Annual Technical Conference."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2010.5650369"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 1st USENIX Conference on File and Storage Technologies.","author":"Patterson H."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the USENIX Annual Technical Conference. 
73--86","author":"Policroniades C."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 1st USENIX Conference on File and Storage Technologies.","author":"Quinlan S."},{"key":"e_1_2_1_32_1","unstructured":"Rabin M. O. 1981. Fingerprinting by random polynomials. Tech. rep. Center for Research in Computing Technology."},{"key":"e_1_2_1_33_1","unstructured":"Riverbed Technology. 2011. Riverbed Steelhead Product Family. http:\/\/www.riverbed.com\/us\/assets\/media\/documents\/data_sheets\/DataSheet-Riverbed-FamilyProduct.pdf."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 10th USENIX Conference on File and Storage Technologies.","author":"Shilane P."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 4th USENIX Conference on Hot Topics in Storage and File Systems.","author":"Shilane P."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/347059.347408"},{"key":"e_1_2_1_37_1","doi-asserted-by":"crossref","unstructured":"Suel T. and Memon N. 2002. Algorithms for delta compression and remote file synchronization. In Lossless Compression Handbook K. Sayood Ed. Academic Press San Diego CA.","DOI":"10.1016\/B978-012620861-0\/50014-0"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 20th International Conference on Data Engineering.","author":"Suel T."},{"key":"e_1_2_1_39_1","volume-title":"Zdelta: An efficient delta compression tool. Tech. 
rep., Department of Computer and Information Science","author":"Trendafilov D.","year":"2002"},{"key":"e_1_2_1_40_1","unstructured":"Tridgell A. 2000. Efficient algorithms for sorting and synchronization. Ph.D. thesis Australian National University."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 10th USENIX Conference on File and Storage Technologies.","author":"Wallace G."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the USENIX Annual Technical Conference.","author":"Xia W."},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 21st Symposium on Mass Storage Systems.","author":"You L."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1970348.1970351"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 6th USENIX Conference on File and Storage Technologies. 269--282","author":"Zhu B."}],"container-title":["ACM Transactions on 
Storage"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2385603.2385606","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2385603.2385606","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:34:02Z","timestamp":1750239242000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2385603.2385606"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":45,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1145\/2385603.2385606"],"URL":"https:\/\/doi.org\/10.1145\/2385603.2385606","relation":{},"ISSN":["1553-3077","1553-3093"],"issn-type":[{"value":"1553-3077","type":"print"},{"value":"1553-3093","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,11]]},"assertion":[{"value":"2012-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-12-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}