{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,12]],"date-time":"2026-01-12T22:11:38Z","timestamp":1768255898396,"version":"3.49.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:p>Super spreader detection in high-speed data streams is crucial for numerous applications. Although many methods have emerged, existing works can hardly concurrently achieve high memory efficiency, support online detection, enable merging data from different measurement points\/periods, and offer invertibility. This makes them unable to satisfy flexible application requirements. This paper proposes RGS-Sketch, a novel sketch designed to address this problem. The core of RGS-Sketch lies in a new mergeable memory sharing design called register group sharing. This design organizes registers into groups as basic memory sharing units, accommodating the high skewness of real-world data streams and offering high memory efficiency. Besides, it enables online detection through the real-time acquisition of a group's state, which also facilitates invertibility. To enhance detection accuracy further, we propose a limited register update strategy. It blocks small flows from updating registers, thereby reducing memory overhead and estimation noises. Extensive experimental results based on four real-world datasets show that RGS-Sketch significantly outperforms the most accurate baselines in accuracy while maintaining a high throughput. Specifically, it improves the F1 scores by up to 0.643 for measurements at a single point\/period and up to 0.472 across multiple points\/periods.<\/jats:p>","DOI":"10.14778\/3717755.3717779","type":"journal-article","created":{"date-parts":[[2025,5,20]],"date-time":"2025-05-20T15:51:49Z","timestamp":1747756309000},"page":"1237-1249","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["RGS-Sketch: An Accurate, Invertible, and Mergeable Sketch for Online Super Spreader Detection in High-Speed Data Streams"],"prefix":"10.14778","volume":"18","author":[{"given":"Boyu","family":"Zhang","sequence":"first","affiliation":[{"name":"Soochow University"}]},{"given":"He","family":"Huang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Data Intelligence and Advanced Computing in Provincial Universities, Soochow University"}]},{"given":"Yu-E","family":"Sun","sequence":"additional","affiliation":[{"name":"Soochow University"}]},{"given":"Guoju","family":"Gao","sequence":"additional","affiliation":[{"name":"Soochow University"}]}],"member":"320","published-online":{"date-parts":[[2025,5,20]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564762"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2021.108239"},{"key":"e_1_2_1_3_1","volume-title":"The CAIDA UCSD Anonymized Internet Traces","author":"CAIDA.","year":"2016","unstructured":"CAIDA. 2016. The CAIDA UCSD Anonymized Internet Traces 2016. https:\/\/catalog.caida.org\/dataset\/passive_2016_pcap. Accessed: 2019-7-28."},{"key":"e_1_2_1_4_1","volume-title":"The CAIDA UCSD Anonymized Internet Traces","author":"CAIDA.","year":"2019","unstructured":"CAIDA. 2019. The CAIDA UCSD Anonymized Internet Traces 2019. https:\/\/catalog.caida.org\/dataset\/passive_2019_pcap. Accessed: 2021-12-27."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2009.5061990"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.001"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNSM.2021.3073597"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM42981.2021.9488425"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM48880.2022.9796702"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the USENIX Security Symposium. 65\u201378","author":"Durumeric Zakir","year":"2014","unstructured":"Zakir Durumeric, Michael Bailey, and J Alex Halderman. 2014. An Internet-Wide view of Internet-Wide scanning. In Proceedings of the USENIX Security Symposium. 65\u201378."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/948205.948225"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2674005.2674994"},{"key":"e_1_2_1_13_1","volume-title":"DMTCS Proceedings","author":"Flajolet Philippe","year":"2007","unstructured":"Philippe Flajolet, Eric Fusy, Olivier Gandouet, and Frederic Meunier. 2007. Hyper-LogLog: the analysis of a near-optimal cardinality estimation algorithm. DMTCS Proceedings (2007), 127\u2013146."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(85)90041-8"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2022.12.009"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452456"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2021.3078725"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3617334"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2020.2975625"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDSC.2021.3111328"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2018.2863942"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2007.305"},{"key":"e_1_2_1_23_1","unstructured":"MICHAEL KECHINOV. 2020. eCommerce behavior data from multi category store. https:\/\/www.kaggle.com\/datasets\/mkechinov\/ecommerce-behavior-data-from-multi-category-store. Accessed: 2024-04-10."},{"key":"e_1_2_1_24_1","unstructured":"MICHAEL KECHINOV. 2020. eCommerce Events History in Cosmetics Shop. https:\/\/www.kaggle.com\/datasets\/mkechinov\/ecommerce-events-history-in-cosmetics-shop. Accessed: 2024-04-10."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/1400123.1400125"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the USENIX symposium on networked systems design and implementation. 311\u2013324","author":"Li Yuliang","year":"2016","unstructured":"Yuliang Li, Rui Miao, Changhoon Kim, and Minlan Yu. 2016. FlowRadar: A better NetFlow for data centers. In Proceedings of the USENIX symposium on networked systems design and implementation. 311\u2013324."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2015.2503269"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2021.3108033"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the International conference on database theory. 398\u2013412","author":"Metwally Ahmed","year":"2005","unstructured":"Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. 2005. Efficient computation of frequent and top-k elements in data streams. In Proceedings of the International conference on database theory. 398\u2013412."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.002"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/637201.637222"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2022.3232098"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2009.12.003"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the Symposium on Operating System Design and Implementation. 45\u201360","author":"Singh Sumeet","year":"2004","unstructured":"Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage. 2004. Automated Worm Fingerprinting. In Proceedings of the Symposium on Operating System Design and Implementation. 45\u201360."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2022.3198738"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the Network and Distributed System Security Symposium. 1\u201318","author":"Venkataraman Shobha","year":"2005","unstructured":"Shobha Venkataraman, Dawn Xiaodong Song, Phillip B Gibbons, and Avrim Blum. 2005. New streaming algorithms for fast detection of superspreaders. In Proceedings of the Network and Distributed System Security Symposium. 1\u201318."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3447689.3447707"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2011.2123094"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/78922.78925"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2745844.2745870"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219978"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2019.2933868"},{"key":"e_1_2_1_43_1","volume-title":"Detecting stealthy spreaders by random aging streaming filters. IEICE transactions on communications 94, 8","author":"Yoon Myung Keun","year":"2011","unstructured":"Myung Keun Yoon and Shigang Chen. 2011. Detecting stealthy spreaders by random aging streaming filters. IEICE transactions on communications 94, 8 (2011), 2274\u20132281."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467353"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3717755.3717779","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,20]],"date-time":"2025-05-20T16:17:55Z","timestamp":1747757875000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3717755.3717779"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10.14778\/3717755.3717779"],"URL":"https:\/\/doi.org\/10.14778\/3717755.3717779","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,12]]},"assertion":[{"value":"2025-05-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}