{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T16:01:44Z","timestamp":1774540904421,"version":"3.50.1"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,10]]},"abstract":"<jats:p>Approximate stream processing has attracted much attention recently. Prior art mostly focuses on characteristics like frequency, cardinality, and quantile. Persistence, as a new characteristic, is getting increasing attention. Unlike frequency, persistence highlights behaviors where an item appears recurrently in many time windows of a data stream. There are two typical problems with persistence - persistence estimation and finding persistent items. In this paper, we propose the On-Off sketch to address both problems. For persistence estimation, using the characteristic that the persistence of an item is increased periodically, we compress increments when multiple items are mapped to the same counter, which significantly reduces the error. Compared with the Count-Min sketch, 1) in theory, we prove that the error of the On-Off sketch is always smaller; 2) in experiments, the On-Off sketch achieves around 6.17 times smaller error and 2.2 times higher throughput. For finding persistent items, we propose a technique to separate persistent and non-persistent items, further improving the accuracy. We show that the space complexity of our On-Off sketch is much better than the state-of-the-art (PIE), and it reduces the error up to 4 orders of magnitude and achieves 2.84 times higher throughput than prior algorithms in experiments.<\/jats:p>","DOI":"10.14778\/3425879.3425884","type":"journal-article","created":{"date-parts":[[2020,11,25]],"date-time":"2020-11-25T02:45:23Z","timestamp":1606272323000},"page":"128-140","source":"Crossref","is-referenced-by-count":62,"title":["On-off sketch"],"prefix":"10.14778","volume":"14","author":[{"given":"Yinda","family":"Zhang","sequence":"first","affiliation":[{"name":"Peking University"}]},{"given":"Jinyang","family":"Li","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Yutian","family":"Lei","sequence":"additional","affiliation":[{"name":"Xiangtan University"}]},{"given":"Tong","family":"Yang","sequence":"additional","affiliation":[{"name":"Peking University"}]},{"given":"Zhetao","family":"Li","sequence":"additional","affiliation":[{"name":"Xiangtan University"}]},{"given":"Gong","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei"}]},{"given":"Bin","family":"Cui","sequence":"additional","affiliation":[{"name":"Peking University"}]}],"member":"320","published-online":{"date-parts":[[2020,11,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183726"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41019-018-0074-4"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219978"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915223"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-018-9834-8"},{"key":"e_1_2_1_6_1","volume-title":"Proc. ICDE","author":"Haida Z.","year":"2017","unstructured":"Z. Haida , H. Zengfeng , W. Zhewei , Z. Wenjie , and L. Xuemin . Tracking matrix approximation over distributed sliding windows . In Proc. ICDE , 2017 . Z. Haida, H. Zengfeng, W. Zhewei, Z. Wenjie, and L. Xuemin. Tracking matrix approximation over distributed sliding windows. In Proc. ICDE, 2017."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-018-9441-4"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41019-018-0066-4"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-24698-5_7"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/859716.859719"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/646255.684566"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882948"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30570-5_27"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183759"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/3099622.3099627"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137652"},{"key":"e_1_2_1_17_1","volume-title":"IN AOFA '07: PROCEEDINGS OF THE 2007 INTERNATIONAL CONFERENCE ON ANALYSIS OF ALGORITHMS","author":"Flajolet Philippe","year":"2007","unstructured":"Philippe Flajolet , \u00c9ric Fusy , Olivier Gandouet , and et al. Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm . In IN AOFA '07: PROCEEDINGS OF THE 2007 INTERNATIONAL CONFERENCE ON ANALYSIS OF ALGORITHMS , 2007 . Philippe Flajolet, \u00c9ric Fusy, Olivier Gandouet, and et al. Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm. In IN AOFA '07: PROCEEDINGS OF THE 2007 INTERNATIONAL CONFERENCE ON ANALYSIS OF ALGORITHMS, 2007."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/948205.948225"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(85)90041-8"},{"key":"e_1_2_1_20_1","volume-title":"Foundations of Computer Science","author":"Karnin Zohar","year":"2016","unstructured":"Zohar Karnin , Kevin Lang , and Edo Liberty . Optimal quantile approximation in streams . In Foundations of Computer Science , 2016 . Zohar Karnin, Kevin Lang, and Edo Liberty. Optimal quantile approximation in streams. In Foundations of Computer Science, 2016."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465312"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375670"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00126"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1452520.1452538"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04342-0_17"},{"key":"e_1_2_1_26_1","unstructured":"Advanced persistent threat. http:\/\/www.usenix.org\/event\/lisa09\/tech\/slides\/daly.pdf.  Advanced persistent threat. http:\/\/www.usenix.org\/event\/lisa09\/tech\/slides\/daly.pdf."},{"issue":"4","key":"e_1_2_1_27_1","first-page":"5054","article-title":"Advanced persistent threat attack detection: an overview","volume":"4","author":"Ghafir Ibrahim","year":"2014","unstructured":"Ibrahim Ghafir and Vaclav Prenosil . Advanced persistent threat attack detection: an overview . Int J Adv Comput Netw Secur , 4 ( 4 ): 5054 , 2014 . Ibrahim Ghafir and Vaclav Prenosil. Advanced persistent threat attack detection: an overview. Int J Adv Comput Netw Secur, 4(4):5054, 2014.","journal-title":"Int J Adv Comput Netw Secur"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1353-4858(11)70086-1"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/597917.597922"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICNP.2014.33"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/11600930_5"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2002259.2002294"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3025111.3025112"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1561\/0100000060"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_2_1_36_1","unstructured":"Intel sse2 documentation. https:\/\/software.intel.com\/en-us\/node\/683883.  Intel sse2 documentation. https:\/\/software.intel.com\/en-us\/node\/683883."},{"key":"e_1_2_1_37_1","unstructured":"Source code related to On-Off sketch. https:\/\/github.com\/Sketch-Data-Stream\/On-Off-Sketch.  Source code related to On-Off sketch. https:\/\/github.com\/Sketch-Data-Stream\/On-Off-Sketch."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749443"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183737"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687671"},{"key":"e_1_2_1_41_1","unstructured":"Intel advanced vector extensions programming reference. https:\/\/software.intel.com\/file\/36945.  Intel advanced vector extensions programming reference. https:\/\/software.intel.com\/file\/36945."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403144"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/3277355.3277443"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/1603899.1603924"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.576"},{"key":"e_1_2_1_46_1","unstructured":"The data center dataset. http:\/\/pages.cs.wisc.edu\/~tbenson\/IMC10_Data.html.  The data center dataset. http:\/\/pages.cs.wisc.edu\/~tbenson\/IMC10_Data.html."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1879141.1879175"},{"key":"e_1_2_1_48_1","unstructured":"The Network dataset Internet Traces. http:\/\/snap.stanford.edu\/data\/.  The Network dataset Internet Traces. http:\/\/snap.stanford.edu\/data\/."},{"key":"e_1_2_1_49_1","unstructured":"The caida anonymized 2016 internet traces. http:\/\/www.caida.org\/data\/overview\/.  The caida anonymized 2016 internet traces. http:\/\/www.caida.org\/data\/overview\/."},{"key":"e_1_2_1_50_1","unstructured":"Hash website. http:\/\/burtleburtle.net\/bob\/hash\/evahash.html.  Hash website. http:\/\/burtleburtle.net\/bob\/hash\/evahash.html."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3425879.3425884","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:06:39Z","timestamp":1672225599000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3425879.3425884"}},"subtitle":["a fast and accurate sketch on persistence"],"short-title":[],"issued":{"date-parts":[[2020,10]]},"references-count":50,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,10]]}},"alternative-id":["10.14778\/3425879.3425884"],"URL":"https:\/\/doi.org\/10.14778\/3425879.3425884","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2020,10]]}}}