{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T13:56:53Z","timestamp":1776261413542,"version":"3.50.1"},"reference-count":71,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2026,6,30]]},"abstract":"<jats:p>Efficient data reduction techniques, including deduplication and compression, are essential in storage systems, affecting performance and longevity. Existing data deduplication approaches often focus on intra-SSD deduplication, missing opportunities for cross-node deduplication, or have scalability issues when aiming for low latency and high-throughput data reduction on large-scale, distributed SSD arrays. We propose StreamDedup, a distributed stream accelerator implementing a transparent layer of deduplication as a network-attached, middle-tier service between the compute and storage tiers. StreamDedup manages all aspects of data deduplication and compression and can be seamlessly integrated into existing systems. It is RDMA-enabled and highly scalable, enhancing data processing capacities for large-scale storage systems. Our prototype, deployed on FPGAs, demonstrates that StreamDedup achieves a throughput of 12.7 GB\/s on a single node, matching the network bandwidth of disaggregated storage, with a latency of less than 50 \u00b5s. 
Across 10 nodes, StreamDedup shows an almost linear increase in throughput with less than 60 \u00b5s of latency.<\/jats:p>","DOI":"10.1145\/3799896","type":"journal-article","created":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T13:10:47Z","timestamp":1772457047000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["<scp>StreamDedup<\/scp>: Distributed In-line Deduplication for Disaggregated Storage"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-2057-6979","authenticated-orcid":false,"given":"Jiayong","family":"Li","sequence":"first","affiliation":[{"name":"Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6706-0353","authenticated-orcid":false,"given":"Jonas","family":"Dann","sequence":"additional","affiliation":[{"name":"Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-1469-6771","authenticated-orcid":false,"given":"Zhenhao","family":"He","sequence":"additional","affiliation":[{"name":"Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland and NVIDIA Corp, Santa Clara, California, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8151-6444","authenticated-orcid":false,"given":"Gustavo","family":"Alonso","sequence":"additional","affiliation":[{"name":"Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9004-440X","authenticated-orcid":false,"given":"Sai Rahul","family":"Chalamalasetti","sequence":"additional","affiliation":[{"name":"d-Matrix, Santa Clara, California, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9830-8588","authenticated-orcid":false,"given":"Dejan","family":"Milojicic","sequence":"additional","affiliation":[{"name":"Hewlett Packard Enterprise Co, Palo Alto, California, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-5070-4542","authenticated-orcid":false,"given":"Lance","family":"Evans","sequence":"additional","affiliation":[{"name":"Hewlett Packard Enterprise Co, Palo Alto, California, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-3953-1750","authenticated-orcid":false,"given":"Alex","family":"Veprinsky","sequence":"additional","affiliation":[{"name":"Hewlett Packard Enterprise Co, Palo Alto, California, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4901-5476","authenticated-orcid":false,"given":"Runbin","family":"Shi","sequence":"additional","affiliation":[{"name":"Microsoft Corporation, Redmond, Washington, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,4,15]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/1060289.1060291"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358303"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00025"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/1383422.1383443"},{"key":"e_1_3_2_6_2","unstructured":"Amazon. 2024. Amazon Elastic Block Store. 
Retrieved December 8 2024 from https:\/\/aws.amazon.com\/ebs\/"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/2043556.2043571"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/1365815.1365816"},{"key":"e_1_3_2_9_2","first-page":"77","volume-title":"9th USENIX Conference on File and Storage Technologies","author":"Chen Feng","year":"2011","unstructured":"Feng Chen, Tian Luo, and Xiaodong Zhang. 2011. CAFTL: A content-aware flash translation layer enhancing the lifespan of flash memory based solid state drives. In 9th USENIX Conference on File and Storage Technologies. USENIX, 77\u201390."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.4028\/www.scientific.net\/AMR.1042.212"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCC.2015.7405578"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.14778\/2983200.2983203"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2011.44"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465295"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.5555\/2342821.2342847"},{"key":"e_1_3_2_16_2","first-page":"519","volume-title":"18th USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201921)","author":"Gao Yixiao","year":"2021","unstructured":"Yixiao Gao, Qiang Li, Lingbo Tang, Yongqing Xi Zhang, Pengcheng Peng, Wenwen Li, Bo Wu, Yaohui Liu, Shaozong Yan, Lei Feng, et al. 2021. When cloud storage meets RDMA. In 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201921). USENIX Association, 519\u2013533."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-020-03188-z"},{"key":"e_1_3_2_18_2","volume-title":"2011 USENIX Annual Technical Conference (USENIX ATC \u201911)","author":"Guo Fanglu","year":"2011","unstructured":"Fanglu Guo and Petros Efstathopoulos. 2011. Building a high-performance deduplication system. 
In 2011 USENIX Annual Technical Conference (USENIX ATC \u201911). USENIX Association, 1\u201314."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.5555\/1960475.1960482"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCC.and.EUC.2013.286"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2019.2947897"},{"key":"e_1_3_2_22_2","unstructured":"Maximilian Jakob Heer Benjamin Ramhorst Yu Zhu Luhao Liu Zhiyi Hu Jonas Dann and Gustavo Alonso. 2025. RoCE BALBOA: Service-enhanced data center RDMA for SmartNICs. arXiv:2507.20412. Retrieved from https:\/\/arxiv.org\/abs\/2507.20412"},{"key":"e_1_3_2_23_2","first-page":"619","volume-title":"2023 USENIX Annual Technical Conference (USENIX ATC \u201923)","author":"Ji Houxiang","year":"2023","unstructured":"Houxiang Ji, Mark Mansi, Yan Sun, Yifan Yuan, Jinghan Huang, Reese Kuper, Michael M. Swift, and Nam Sung Kim. 2023. STYX: Exploiting SmartNIC capability to reduce datacenter memory tax. In 2023 USENIX Annual Technical Conference (USENIX ATC \u201923). USENIX Association, 619\u2013633."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/1534530.1534540"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3132402.3132420"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2013.6558444"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/2141702.2141705"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2012.6232379"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/2901318.2901337"},{"key":"e_1_3_2_30_2","volume-title":"SNIA IOTTA Trace Repository","author":"Koller Ricardo","year":"2008","unstructured":"Ricardo Koller and Raju Rangaswami. 2008. FIU IODedup traces (SNIA IOTTA trace 402). In SNIA IOTTA Trace Repository. 
Geoff Kuenning (Ed.), Storage Networking Industry Association."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/1837915.1837921"},{"key":"e_1_3_2_32_2","first-page":"991","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI \u201920)","author":"Korolija Dario","year":"2020","unstructured":"Dario Korolija, Timothy Roscoe, and Gustavo Alonso. 2020. Do OS abstractions make sense on FPGAs? In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI \u201920). USENIX Association, 991\u20131010."},{"key":"e_1_3_2_33_2","volume-title":"11th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage \u201919)","author":"Kuhring Lucas","year":"2019","unstructured":"Lucas Kuhring, Eva Garcia, and Zsolt Istv\u00e1n. 2019. Specialize in moderation\u2014Building application-aware storage services using FPGAs in the datacenter. In 11th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage \u201919). USENIX Association, 1\u20138."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3385073"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS53621.2022.00134"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/EuCNC.2019.8801958"},{"key":"e_1_3_2_37_2","first-page":"273","volume-title":"13th USENIX Conference on File and Storage Technologies (FAST \u201915)","author":"Lee Changman","year":"2015","unstructured":"Changman Lee, Dongho Sim, Jooyoung Hwang, and Sangyeun Cho. 2015. F2FS: A new file system for flash storage. In 13th USENIX Conference on File and Storage Technologies (FAST \u201915). USENIX Association, 273\u2013286."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2020.3009347"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2009.103"},{"key":"e_1_3_2_40_2","unstructured":"Jean Loup Gailly and Mark Adler. 2024. 
zlib: A Massively Spiffy Yet Delicately Unobtrusive Compression Library. Retrieved December 8 2024 from https:\/\/github.com\/madler\/zlib"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2012.6232390"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2018.2889329"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCC.2015.2511752"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.5555\/1960475.1960476"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544216.3544238"},{"key":"e_1_3_2_46_2","unstructured":"Microsoft. 2024. Microsoft Azure Boost. Retrieved December 8 2024 from https:\/\/learn.microsoft.com\/en-us\/azure\/azure-boost\/overview"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2018.00106"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/2611778"},{"key":"e_1_3_2_49_2","unstructured":"The OpenSSL Project. 2024. OpenSSL: Cryptography and SSL\/TLS toolkit. Retrieved December 8 2024 from https:\/\/github.com\/openssl\/openssl"},{"key":"e_1_3_2_50_2","first-page":"101","volume-title":"2023 USENIX Annual Technical Conference (USENIX ATC \u201923)","author":"Qiu Jiansheng","year":"2023","unstructured":"Jiansheng Qiu, Yanqi Pan, Wen Xia, Xiaojia Huang, Wenjun Wu, Xiangyu Zou, Shiyi Li, and Yu Hua. 2023. Light-Dedup: A light-weight inline deduplication framework for non-volatile memory file systems. In 2023 USENIX Annual Technical Conference (USENIX ATC \u201923). USENIX Association, 101\u2013116."},{"key":"e_1_3_2_51_2","volume-title":"Conference on File and Storage Technologies (FAST \u201902)","author":"Quinlan Sean","year":"2002","unstructured":"Sean Quinlan and Sean Dorward. 2002. Venti: A new approach to archival data storage. In Conference on File and Storage Technologies (FAST \u201902). 
USENIX Association, 1\u201313."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3731569.3764845"},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","unstructured":"Christian Esteve Rothenberg Carlos A. B. Macapuna Fabio L. Verdi and Mauricio F. Magalhaes. 2010. The deletable bloom filter: A new member of the bloom family. IEEE Communications Letters 14 6 (2010) 557\u2013559.","DOI":"10.1109\/LCOMM.2010.06.100344"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2014.2350984"},{"key":"e_1_3_2_55_2","unstructured":"SpinalHDL. 2024. SpinalCrypto. Retrieved December 8 2024 from https:\/\/github.com\/SpinalHDL\/SpinalCrypto"},{"key":"e_1_3_2_56_2","first-page":"24","volume-title":"Proceedings of the USENIX Conference on File and Storage Technologies (FAST \u201912)","author":"Srinivasan Kiran","year":"2012","unstructured":"Kiran Srinivasan, Timothy Bisson, Garth R. Goodson, and Kaladhar Voruganti. 2012. iDedup: Latency-aware, inline data deduplication for primary storage. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST \u201912). USENIX Association, 24."},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/383059.383071"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/MASCOTS.2016.71"},{"key":"e_1_3_2_59_2","unstructured":"Vasily Tarasov Deepak Kumar Jain Geoffrey H. Kuenning Harvey Mudd College Sonam Mandal Karthikeyani Palanisami Philip Shilane Sagar Trehan and Erez Zadok. 2014. Dmdedup: Device mapper target for data deduplication. 
In 2014 Ottawa Linux Symposium (OLS) 83\u201396."},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/SURV.2011.031611.00024"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2011.5937237"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2017.2774270"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544846"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2016.2571298"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/NAS.2012.46"},{"key":"e_1_3_2_66_2","unstructured":"Xilinx. 2024. Vitis_Libraries. Retrieved December 8 2024 from https:\/\/github.com\/Xilinx\/Vitis_Libraries"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2019.00009"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2015.80"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579371.3589077"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1109\/Grid.2012.21"},{"key":"e_1_3_2_71_2","first-page":"187","volume-title":"19th USENIX Conference on File and Storage Technologies (FAST \u201921)","author":"Zhou You","year":"2021","unstructured":"You Zhou, Qiulin Wu, Fei Wu, Hong Jiang, Jian Zhou, and Changsheng Xie. 2021. Remap-SSD: Safely and efficiently exploiting SSD address remapping to eliminate duplicate writes. In 19th USENIX Conference on File and Storage Technologies (FAST \u201921). 
USENIX Association, 187\u2013202."},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2018.00043"}],"container-title":["ACM Transactions on Reconfigurable Technology and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3799896","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T13:02:18Z","timestamp":1776258138000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3799896"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,15]]},"references-count":71,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,6,30]]}},"alternative-id":["10.1145\/3799896"],"URL":"https:\/\/doi.org\/10.1145\/3799896","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"value":"1936-7406","type":"print"},{"value":"1936-7414","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,15]]},"assertion":[{"value":"2025-07-04","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-16","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-04-15","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}