{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:44:43Z","timestamp":1774982683712,"version":"3.50.1"},"reference-count":92,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,12,18]],"date-time":"2024-12-18T00:00:00Z","timestamp":1734480000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100006374","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62372106"],"award-info":[{"award-number":["62372106"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Nsfocus Information Technology Co., Ltd.","award":["CCF-NSFOCUS202206"],"award-info":[{"award-number":["CCF-NSFOCUS202206"]}]},{"name":"Jiangsu Provincial Scientific Research Center of Applied Mathematics","award":["BK20233002"],"award-info":[{"award-number":["BK20233002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,12,18]]},"abstract":"<jats:p>\n                    In the field of data stream processing, there are two prevalent models, i.e., insertion-only, and turnstile models. Most previous works were proposed for the insertion-only model, which assumes new elements arrive continuously as a stream, and neglects the possibilities of removing existing elements. In this paper, we make a\n                    <jats:italic toggle=\"yes\">bounded deletion<\/jats:italic>\n                    assumption, putting a constraint on the number of deletions allowed. For such a turnstile stream, we focus on a new problem of\n                    <jats:italic toggle=\"yes\">universal measurement<\/jats:italic>\n                    that estimates multiple kinds of statistical metrics simultaneously using limited memory and in an online fashion, including per-element frequency, heavy hitters, frequency moments, and frequency distribution. There are two key challenges for processing a turnstile stream with bounded deletions. Firstly, most previous methods for detecting heavy hitters cannot ensure a bounded detection error when there are deletion events. Secondly, there is still no prior work to estimate the per-element frequency moments under turnstile model, especially in an online fashion. In this paper, we address the former challenge by proposing a Removable Augmented Sketch, and address the latter by a Removable Universal Sketch, enhanced with an Online Moment Estimator. In addition, we improve the accuracy of frequency estimation by a compressed counter design, which can halve the memory cost of a frequency counter and support addition\/minus operations. Our experiments show that our solution outperforms other algorithms by 16%~69% in F1 Score of heavy hitter detection, and improves the throughput of frequency moment estimation by 3.0x10\n                    <jats:sup>4<\/jats:sup>\n                    times.\n                  <\/jats:p>","DOI":"10.1145\/3698799","type":"journal-article","created":{"date-parts":[[2024,12,20]],"date-time":"2024-12-20T16:40:35Z","timestamp":1734712835000},"page":"1-28","source":"Crossref","is-referenced-by-count":3,"title":["A Universal Sketch for Estimating Heavy Hitters and Per-Element Frequency Moments in Data Streams with Bounded Deletions"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4148-2531","authenticated-orcid":false,"given":"Liang","family":"Zheng","sequence":"first","affiliation":[{"name":"Southeast University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8348-6277","authenticated-orcid":false,"given":"Qingjun","family":"Xiao","sequence":"additional","affiliation":[{"name":"Southeast University &amp; Purple Mountain Laboratories, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-3498-3269","authenticated-orcid":false,"given":"Xuyuan","family":"Cai","sequence":"additional","affiliation":[{"name":"The Hong Kong Polytechnic University, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2024,12,20]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2013.6567024"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/303976.303978"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2610510"},{"key":"e_1_2_1_4_1","first-page":"33","article-title":"A method of moments for mixture models and hidden Markov models","volume":"23","author":"Anandkumar Animashree","year":"2012","unstructured":"Animashree Anandkumar, Daniel Hsu, and Sham M Kakade. 2012. A method of moments for mixture models and hidden Markov models. In Proc. of the ACM COLT, Vol. 23. 33.1--33.34.","journal-title":"Proc. of the ACM COLT"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3034786.3034798"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40328-6_5"},{"key":"e_1_2_1_7_1","first-page":"1","article-title":"Revisiting Frequency Moment Estimation in Random Order Streams","volume":"107","author":"Braverman Vladimir","year":"2018","unstructured":"Vladimir Braverman, David Woodruff, and Lin Yang. 2018. Revisiting Frequency Moment Estimation in Random Order Streams. In Proc. of ICALP, Vol. 107. 25:1--25:14.","journal-title":"Proc. of ICALP"},{"key":"e_1_2_1_8_1","unstructured":"CAIDA. 2016. The CAIDA Anonymized Internet Traces. http:\/\/www.caida.org\/data\/overview\/"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(03)00400-6"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3452784"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3384699"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2004.1320018"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3516431.3516433"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.001"},{"key":"e_1_2_1_15_1","first-page":"879","article-title":"Logram: Efficient log parsing using n n-gram dictionaries","volume":"48","author":"Dai Hetong","year":"2020","unstructured":"Hetong Dai, Heng Li, Che-Shao Chen, Weiyi Shang, and Tse-Hsun Chen. 2020. Logram: Efficient log parsing using n n-gram dictionaries. IEEE Transactions on Software Engineering, Vol. 48, 3 (2020), 879--892.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_16_1","first-page":"1","article-title":"SafeBound: A Practical System for Generating Cardinality Bounds","volume":"1","author":"Deeds Kyle B","year":"2023","unstructured":"Kyle B Deeds, Dan Suciu, and Magdalena Balazinska. 2023. SafeBound: A Practical System for Generating Cardinality Bounds. In Proc. of ACM SIGMOD, Vol. 1. 1--26.","journal-title":"Proc. of ACM SIGMOD"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/633025.633056"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/948205.948225"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2674005.2674994"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2023.3278028"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236212"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/3279541"},{"key":"e_1_2_1_23_1","volume-title":"5G security innovation with Cisco. Whitepaper Cisco Public","author":"Geller Michael","year":"2018","unstructured":"Michael Geller and Pramod Nair. 2018. 5G security innovation with Cisco. Whitepaper Cisco Public (2018), 1--29."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452456"},{"key":"e_1_2_1_25_1","first-page":"1","article-title":"Memory-Efficient and Flexible Detection of Heavy Hitters in High-Speed Networks","volume":"1","author":"Huang He","year":"2023","unstructured":"He Huang, Jiakun Yu, Yang Du, Jia Liu, Haipeng Dai, and Yu-E Sun. 2023. Memory-Efficient and Flexible Detection of Heavy Hitters in High-Speed Networks. In Proc. of ACM SIGMOD, Vol. 1. 1--24.","journal-title":"Proc. of ACM SIGMOD"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3452840"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196959.3196986"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comcom.2020.12.016"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3547305.3547306"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1012888.1005709"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1140103.1140295"},{"key":"e_1_2_1_32_1","volume-title":"Proc. of IEEE ISSCS. 1--5.","author":"Lavric Alexandru","year":"2017","unstructured":"Alexandru Lavric and Valentin Popa. 2017. Internet of things and LoRa\u0099 low-power wide-area networks: a survey. In Proc. of IEEE ISSCS. 1--5."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850594"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403208"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3626292.3626302"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589334.3645581"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2751523"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM48880.2022.9796721"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2934872.2934906"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352136"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465313"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2008.11.001"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-155860869-6\/50038-X"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.14778\/3625054.3625065"},{"key":"e_1_2_1_45_1","first-page":"398","article-title":"Efficient computation of frequent and top-k elements in data streams","volume":"3363","author":"Metwally Ahmed","year":"2005","unstructured":"Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. 2005. Efficient computation of frequent and top-k elements in data streams. In Proc. of Springer ICDT, Vol. 3363. 398--412.","journal-title":"Proc. of Springer ICDT"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2612182"},{"key":"e_1_2_1_47_1","volume-title":"Foundations and Trends\u00ae in Theoretical Computer Science","volume":"1","author":"Shanmugavelayutham","year":"2005","unstructured":"Shanmugavelayutham Muthukrishnan et al. 2005. Data streams: Algorithms and applications. Foundations and Trends\u00ae in Theoretical Computer Science, Vol. 1, 2 (2005), 117--236."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-10928-8_34"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-022-00750-4"},{"key":"e_1_2_1_50_1","first-page":"68","article-title":"The california consumer privacy act: Towards a european-style privacy regime in the united states","volume":"23","author":"Pardau Stuart L","year":"2018","unstructured":"Stuart L Pardau. 2018. The california consumer privacy act: Towards a european-style privacy regime in the united states. J. Tech. L. & Pol'y, Vol. 23 (2018), 68.","journal-title":"J. Tech. L. & Pol'y"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00124"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882948"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.softx.2019.100353"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3583780.3615057"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/582095.582099"},{"key":"e_1_2_1_56_1","volume-title":"Proc. of USENIX Security. 4319--4335","author":"Seo HyungBin","year":"2023","unstructured":"HyungBin Seo and MyungKeun Yoon. 2023. Generative intrusion detection and prevention on data stream. In Proc. of USENIX Security. 4319--4335."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3386367.3432729"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543856"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196930"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2019.8737499"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183759"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-57959-7"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.14778\/3447689.3447707"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2022.3197968"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.5555\/1965575"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/78922.78925"},{"key":"e_1_2_1_67_1","first-page":"1","article-title":"Separations for Estimating Large Frequency Moments on Data Streams","volume":"198","author":"Woodruff David P","year":"2021","unstructured":"David P Woodruff and Samson Zhou. 2021. Separations for Estimating Large Frequency Moments on Data Streams. In Proc. of ICALP, Vol. 198. 112:1--112:21.","journal-title":"Proc. of ICALP"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588721"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2022.3216025"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2023.3268980"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2017.2753842"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2023.110097"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM41043.2020.9155454"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/IWQoS54832.2022.9812870"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2017.8057088"},{"key":"e_1_2_1_76_1","volume-title":"Efficient Flow Recording with InheritSketch on Programmable Switches. In 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS). IEEE, 1--11","author":"Xie Guorui","year":"2023","unstructured":"Guorui Xie, Qing Li, Guanglin Duan, Yong Jiang, Zhuyun Qi, Shuo Liu, and Qiaoling Wang. 2023. Efficient Flow Recording with InheritSketch on Programmable Switches. In 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS). IEEE, 1--11."},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599433"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-021-00687-0"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2023.3303924"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3419204"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2019.2923772"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219978"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230544"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2019.2933868"},{"key":"e_1_2_1_85_1","unstructured":"Yelp. 2021. Yelp Open Dataset. https:\/\/www.yelp.com\/dataset\/"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2023.110059"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.14778\/3514061.3514068"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583147"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.14778\/3450980.3450990"},{"key":"e_1_2_1_90_1","doi-asserted-by":"crossref","unstructured":"Liang Zheng Qingjun Xiao and Xuyuan Cai. 2024. A Universal Sketch for Estimating Heavy Hitters and Per-Element Frequency Moments in Data Streams with Bounded Deletions [technical report]. https:\/\/github.com\/usoop\/Removaleb_Universal_Sketch\/blob\/main\/TechnicalReport.pdf.","DOI":"10.1145\/3698799"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183726"},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2018.8485804"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698799","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3698799","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T17:45:20Z","timestamp":1774979120000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698799"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,18]]},"references-count":92,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,18]]}},"alternative-id":["10.1145\/3698799"],"URL":"https:\/\/doi.org\/10.1145\/3698799","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,18]]}}}