{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T03:26:51Z","timestamp":1764732411769,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2009,5,4]],"date-time":"2009-05-04T00:00:00Z","timestamp":1241395200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2009,5,4]]},"DOI":"10.1145\/1534530.1534539","type":"proceedings-article","created":{"date-parts":[[2009,5,5]],"date-time":"2009-05-05T14:40:53Z","timestamp":1241534453000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":75,"title":["The design of a similarity based deduplication system"],"prefix":"10.1145","author":[{"given":"Lior","family":"Aronovich","sequence":"first","affiliation":[{"name":"IBM Corp."}]},{"given":"Ron","family":"Asher","sequence":"additional","affiliation":[{"name":"IBM Corp."}]},{"given":"Eitan","family":"Bachmat","sequence":"additional","affiliation":[{"name":"Ben-Gurion U."}]},{"given":"Haim","family":"Bitner","sequence":"additional","affiliation":[{"name":"Marvell Corp."}]},{"given":"Michael","family":"Hirsch","sequence":"additional","affiliation":[{"name":"IBM Corp."}]},{"given":"Shmuel T.","family":"Klein","sequence":"additional","affiliation":[{"name":"Bar-Ilan 
U."}]}],"member":"320","published-online":{"date-parts":[[2009,5,4]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0196-6774(03)00097-X"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1210596.1210599"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/359842.359859"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223855"},{"volume-title":"Identifying and Filtering Near-Duplicate Documents. CPM 2000: 1--10","author":"Broder","key":"e_1_3_2_1_6_1","unstructured":"A. Z. Broder. Identifying and Filtering Near-Duplicate Documents. CPM 2000: 1--10"},{"key":"e_1_3_2_1_7_1","volume-title":"Some applications of Rabin's fingerprinting method","author":"Broder","year":"1993","unstructured":"A. Z. Broder. Some applications of Rabin's fingerprinting method. In R. Capocelli, A. De Santis, and U. Vaccaro, editors, Sequences II: Methods in Communications, Security, and Computer Science, 143--152. Springer-Verlag, 1993."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/829502.830043"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276781"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/338219.338246"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/283554.283370"},{"key":"e_1_3_2_1_12_1","first-page":"113","volume":"7","author":"Fischer M. J.","year":"1974","unstructured":"Fischer M. J., Paterson M. S., String matching and other products, in Complexity of Computation, R. M. 
Karp (editor), SIAM-AMS Proc. 7 (1974) 113--125.","journal-title":"SIAM-AMS Proc."},{"key":"e_1_3_2_1_13_1","volume-title":"The Enterprise Strategy Group (ESG) lab validation report for IBM TS7650G ProtecTier","author":"Garret C.","year":"2008","unstructured":"B. Garret and C. Bouffard, The Enterprise Strategy Group (ESG) lab validation report for IBM TS7650G ProtecTier, 2008. Available at www.diligent.com."},{"key":"e_1_3_2_1_14_1","first-page":"191","volume-title":"Proceedings of the Second USENIX Workshop on Electronic Commerce","author":"Heintze","year":"1996","unstructured":"N. Heintze. Scalable Document Fingerprinting. Proceedings of the Second USENIX Workshop on Electronic Commerce, pages 191--200, 1996."},{"key":"e_1_3_2_1_15_1","volume-title":"Proceedings of USENIX File And Storage Systems conference (FAST)","author":"Jain M.","year":"2005","unstructured":"N. Jain, M. Dahlin, and R. Tewari. TAPER: Tiered Approach for Eliminating Redundancy in Replica Synchronization. Proceedings of USENIX File And Storage Systems conference (FAST), 2005."},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of USENIX File And Storage Systems conference (FAST)","author":"Lillibridge K.","year":"2009","unstructured":"M. Lillibridge, K. Eshghi, D. Bhagwat, V. Deolalikar, G. Trezise, and P. 
Camble, Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality, Proceedings of USENIX File And Storage Systems conference (FAST), 2009."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.312.0249"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1137\/0206024"},{"key":"e_1_3_2_1_19_1","first-page":"59","volume-title":"J. M. Tracey: Redundancy Elimination Within Large Collections of Files. Proceedings of USENIX Annual Technical Conference","author":"Kulkarni F.","year":"2004","unstructured":"P. Kulkarni, F. Douglis, J. D. LaVoie, J. M. Tracey: Redundancy Elimination Within Large Collections of Files. Proceedings of USENIX Annual Technical Conference, pages 59--72, 2004."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/0196-6774(89)90010-2"},{"key":"e_1_3_2_1_22_1","unstructured":"Moulton G. H., Whitehill S. B., Hash file system and method for use in a commonality factoring system, U.S. Pat. No. 6,704,730."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/375360.375365"},{"key":"e_1_3_2_1_24_1","volume-title":"Proc. of the USENIX Conf. on File And Storage Technologies (FAST)","author":"Quinlan S.","year":"2002","unstructured":"S. Quinlan and S. 
Dorward, Venti: A New Approach to Archival Storage. In Proc. of the USENIX Conf. on File And Storage Technologies (FAST), 2002."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/226931.226961"},{"key":"e_1_3_2_1_26_1","volume-title":"Proc. of Workshop on Web Databases (WebDB'98)","author":"Shivakumar H.","year":"1998","unstructured":"N. Shivakumar and H. Garcia-Molina. Finding near-replicas of documents on the web. In Proc. of Workshop on Web Databases (WebDB'98), March 1998."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01206331"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/SWAT.1973.13"},{"volume-title":"Proc. of FAST 08, the 6th USENIX Conf. on File and Storage Technologies, 279--292","author":"Zhu K.","key":"e_1_3_2_1_29_1","unstructured":"B. Zhu, K. Li and H. Patterson, Avoiding the Disk Bottleneck in the Data Domain Deduplication File System, Proc. of FAST 08, the 6th USENIX Conf. on File and Storage Technologies, 279--292."},{"key":"e_1_3_2_1_30_1","first-page":"23","author":"Ziv A.","year":"1977","unstructured":"J. Ziv and A. Lempel. A universal algorithm for sequential data compression, IEEE Trans. Inform. Theory, vol. IT-23, pp. 337--343, May 1977.","journal-title":"Inform. 
Theory"}],"event":{"name":"SYSTOR '09: Proceedings of the 2009 Israeli Experimental Systems Conference","sponsor":["Hebrew University of Jerusalem","Mellanox Technologies","IBM"],"location":"Haifa Israel","acronym":"SYSTOR '09"},"container-title":["Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1534530.1534539","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1534530.1534539","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:29:44Z","timestamp":1750253384000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1534530.1534539"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,5,4]]},"references-count":29,"alternative-id":["10.1145\/1534530.1534539","10.1145\/1534530"],"URL":"https:\/\/doi.org\/10.1145\/1534530.1534539","relation":{},"subject":[],"published":{"date-parts":[[2009,5,4]]},"assertion":[{"value":"2009-05-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}