{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,15]],"date-time":"2025-11-15T04:56:39Z","timestamp":1763182599923,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":23,"publisher":"ACM","license":[{"start":{"date-parts":[[2012,10,14]],"date-time":"2012-10-14T00:00:00Z","timestamp":1350172800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000781","name":"European Research Council","doi-asserted-by":"publisher","award":["204742"],"award-info":[{"award-number":["204742"]}],"id":[{"id":"10.13039\/501100000781","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2012,10,14]]},"DOI":"10.1145\/2391229.2391246","type":"proceedings-article","created":{"date-parts":[[2012,11,13]],"date-time":"2012-11-13T15:04:07Z","timestamp":1352819047000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":33,"title":["Probabilistic deduplication for cluster-based storage systems"],"prefix":"10.1145","author":[{"given":"Davide","family":"Frey","sequence":"first","affiliation":[{"name":"INRIA, Rennes, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anne-Marie","family":"Kermarrec","sequence":"additional","affiliation":[{"name":"INRIA, Rennes, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Konstantinos","family":"Kloudas","sequence":"additional","affiliation":[{"name":"INRIA, Rennes, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2012,10,14]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"http:\/\/dumps.wikimedia.org\/enwiki\/.  
"},{"key":"e_1_3_2_1_2_1","unstructured":"https:\/\/www.grid5000.fr\/."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1060289.1060291"},{"key":"e_1_3_2_1_4_1","volume-title":"Parallel Deduplication for Chunk-based File Backup. In MASCOTS","author":"Bhagwat Deepavali","year":"2009","unstructured":"Deepavali Bhagwat, Kave Eshghi, Darrell D. E. Long, and Mark Lillibridge. Extreme Binning: Scalable, Parallel Deduplication for Chunk-based File Backup. In MASCOTS, 2009."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223855"},{"key":"e_1_3_2_1_6_1","volume-title":"FAST","author":"Dong Wei","year":"2011","unstructured":"Wei Dong, Fred Douglis, Kai Li, Hugo Patterson, Sazzala Reddy, and Philip Shilane. Tradeoffs in Scalable Data Routing for Deduplication Clusters. In FAST, 2011."},{"key":"e_1_3_2_1_7_1","volume-title":"FAST","author":"Dubnicki Cezary","year":"2009","unstructured":"Cezary Dubnicki, Leszek Gryz, Lukasz Heldt, Michal Kaczmarczyk, Wojciech Kilian, Przemyslaw Strzelczak, Jerzy Szczepkowski, Cristian Ungureanu, and Michal Welnicki. HYDRAstor: a Scalable Secondary Storage. 
In FAST, 2009."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-39658-1_55"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(85)90041-8"},{"key":"e_1_3_2_1_10_1","volume-title":"The Diverse and Exploding Digital Universe: An Updated Forecast of Worldwide Information Growth Through","author":"Gantz J. F.","year":"2011","unstructured":"J. F. Gantz, C. Chute, A. Manfrediz, S. Minton, D. Reinsel, W. Schlichting, and A. Toncheva. The Diverse and Exploding Digital Universe: An Updated Forecast of Worldwide Information Growth Through 2011. Technical report, An IDC White Paper - sponsored by EMC, March 2008."},{"key":"e_1_3_2_1_11_1","volume-title":"USENIX ATC","author":"Guo Fanglu","year":"2011","unstructured":"Fanglu Guo and Petros Efstathopoulos. Building a High-performance Deduplication Systems. In USENIX ATC, 2011."},{"key":"e_1_3_2_1_12_1","volume-title":"FAST","author":"Kruus Erik","year":"2010","unstructured":"Erik Kruus, Cristian Ungureanu, and Cezary Dubnicki. Bimodal Content Defined Chunking for Backup Streams. In FAST, 2010."},{"key":"e_1_3_2_1_13_1","volume-title":"USENIX ATC","author":"Kulkarni Purushottam","year":"2004","unstructured":"Purushottam Kulkarni, Fred Douglis, Jason LaVoie, and John M. Tracey. 
Redundancy elimination within large collections of files. In USENIX ATC, 2004."},{"key":"e_1_3_2_1_14_1","volume-title":"Inline Deduplication Using Sampling and Locality. In FAST","author":"Lillibridge Mark","year":"2009","unstructured":"Mark Lillibridge, Kave Eshghi, Deepavali Bhagwat, Vinay Deolalikar, Greg Trezise, and Peter Camble. Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality. In FAST, 2009."},{"key":"e_1_3_2_1_15_1","volume-title":"Bolosky. A Study of Practical Deduplication. In FAST","author":"Dutch","year":"2011","unstructured":"Dutch T. Meyer and William J. Bolosky. A Study of Practical Deduplication. In FAST, 2011."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183643"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/502034.502052"},{"key":"e_1_3_2_1_18_1","volume-title":"USENIX ATC","author":"Policroniades C.","year":"2004","unstructured":"C. Policroniades and I. Pratt. Alternatives for Detecting Redundancy in Storage Systems Data. In USENIX ATC, 2004."},{"key":"e_1_3_2_1_19_1","volume-title":"NSDI","author":"Pucha Himabindu","year":"2007","unstructured":"Himabindu Pucha, David G. Andersen, and Michael Kaminsky. Exploiting Similarity for Multi-Source Downloads Using File Handprints. 
In NSDI, 2007."},{"key":"e_1_3_2_1_20_1","volume-title":"USENIX ATC","author":"Rhea Sean","year":"2008","unstructured":"Sean Rhea, Russ Cox, and Alex Pesterev. Fast, inexpensive content-addressed storage in foundation. In USENIX ATC, 2008."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/1833515.1833727"},{"key":"e_1_3_2_1_22_1","volume-title":"FAST","author":"Ungureanu Cristian","year":"2010","unstructured":"Cristian Ungureanu, Benjamin Atkin, Akshat Aranya, Salil Gokhale, Stephen Rago, Grzegorz Calkowski, Cezary Dubnicki, and Aniruddha Bohra. HydraFS: a High-Throughput File System for the HYDRAstor Content-Addressable Storage System. In FAST, 2010."},{"key":"e_1_3_2_1_23_1","volume-title":"FAST","author":"Wallace Grant","year":"2012","unstructured":"Grant Wallace, Fred Douglis, Hangwei Qian, Philip Shilane, Stephen Smaldone, Mark Chamness, and Windsor Hsu. Characteristics of Backup Workloads in Production Systems. 
In FAST, 2012."}],"event":{"name":"SOCC '12: ACM Symposium on Cloud Computing","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGOPS ACM Special Interest Group on Operating Systems"],"location":"San Jose California","acronym":"SOCC '12"},"container-title":["Proceedings of the Third ACM Symposium on Cloud Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2391229.2391246","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2391229.2391246","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:34:32Z","timestamp":1750239272000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2391229.2391246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,10,14]]},"references-count":23,"alternative-id":["10.1145\/2391229.2391246","10.1145\/2391229"],"URL":"https:\/\/doi.org\/10.1145\/2391229.2391246","relation":{},"subject":[],"published":{"date-parts":[[2012,10,14]]},"assertion":[{"value":"2012-10-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}