{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T07:51:16Z","timestamp":1770537076205,"version":"3.49.0"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T00:00:00Z","timestamp":1767225600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T00:00:00Z","timestamp":1768003200000},"content-version":"vor","delay-in-days":9,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework Programme","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["The VLDB Journal"],"published-print":{"date-parts":[[2026,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>A key need in different disciplines is to perform analytics over fast-paced data streams, similar in nature to the traditional OLAP analytics in relational databases - i.e., with aggregates and selection predicates. Storing unbounded streams, however, is not a realistic, or desired approach due to the high storage requirements, and the delays introduced when storing massive data. Accordingly, many synopses\/sketches have been proposed that can summarize the stream in small memory (usually sufficiently small to be stored in RAM), such that aggregate queries can be efficiently approximated, without storing the full stream. However, past synopses predominantly focus on summarizing single-attribute streams, and cannot handle selection predicates and constraints on arbitrary subsets of multiple attributes efficiently. In this work, we propose OmniSketch, the first sketch that scales to fast-paced and complex data streams (with many attributes), and supports count aggregates with predicates on multiple attributes, dynamically chosen at query time. OmniSketch supports streams containing both inserts and deletes, under the bounded deletes streaming model. OmniSketch offers probabilistic guarantees, a favorable space-accuracy tradeoff, and a worst-case logarithmic complexity for updating and for query execution. We demonstrate experimentally with both real and synthetic data that OmniSketch outperforms the state-of-the-art and can approximate complex ad-hoc queries within the configured accuracy guarantees, with small memory requirements.<\/jats:p>","DOI":"10.1007\/s00778-025-00960-6","type":"journal-article","created":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T12:47:12Z","timestamp":1768049232000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["OmniSketch: Multi-dimensional update stream analytics with arbitrary predicates"],"prefix":"10.1007","volume":"35","author":[{"given":"Wieger R.","family":"Punter","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0045-1648","authenticated-orcid":false,"given":"Odysseas","family":"Papapetrou","sequence":"additional","affiliation":[]},{"given":"Minos","family":"Garofalakis","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,1,10]]},"reference":[{"issue":"1","key":"960_CR1","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1006\/jcss.1997.1545","volume":"58","author":"N Alon","year":"1999","unstructured":"Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58(1), 137\u2013147 (1999)","journal-title":"J. Comput. Syst. Sci."},{"issue":"11","key":"960_CR2","doi-asserted-by":"publisher","first-page":"3937","DOI":"10.14778\/3749646.3749665","volume":"18","author":"L Becchetti","year":"2025","unstructured":"Becchetti, L., Clementi, A., Gual\u00e1, L., Sciarria, L.P., Straziota, A., Stromieri, M.: Approximate 2-hop neighborhoods on incremental graphs: An efficient lazy approach. Proc. VLDB Endow. 18(11), 3937\u20133950 (2025). https:\/\/doi.org\/10.14778\/3749646.3749665","journal-title":"Proc. VLDB Endow."},{"issue":"7","key":"960_CR3","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1145\/362686.362692","volume":"13","author":"BH Bloom","year":"1970","unstructured":"Bloom, B.H.: Space\/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422\u2013426 (1970). https:\/\/doi.org\/10.1145\/362686.362692","journal-title":"Commun. ACM"},{"key":"960_CR4","doi-asserted-by":"publisher","unstructured":"Broder, A.Z.: On the resemblance and containment of documents. In: Proc. of Compression and Complexity of Sequences, pp. 21\u201329. IEEE (1997). https:\/\/doi.org\/10.1109\/SEQUEN.1997.666900","DOI":"10.1109\/SEQUEN.1997.666900"},{"issue":"11","key":"960_CR5","doi-asserted-by":"publisher","first-page":"2241","DOI":"10.1109\/TKDE.2019.2916858","volume":"32","author":"M Bury","year":"2020","unstructured":"Bury, M., Schwiegelshohn, C., Sorella, M.: Similarity search for dynamic data streams. IEEE TKDE 32(11), 2241\u20132253 (2020). https:\/\/doi.org\/10.1109\/TKDE.2019.2916858","journal-title":"IEEE TKDE"},{"key":"960_CR6","unstructured":"CAIDA: The CAIDA UCSD Anonymized Internet Traces Dataset (accessed October 2023) (2019). https:\/\/www.caida.org\/catalog\/datasets\/passive_dataset\/"},{"key":"960_CR7","doi-asserted-by":"publisher","unstructured":"Clementi, A., Gual\u00e0, L., Pep\u00e8\u00a0Sciarria, L., Straziota, A.: Maintaining k-minhash signatures over fully-dynamic data streams with recovery. In: Proc. of WSDM, p. 79-87 (2025). https:\/\/doi.org\/10.1145\/3701551.3703491","DOI":"10.1145\/3701551.3703491"},{"key":"960_CR8","doi-asserted-by":"publisher","unstructured":"Cohen, E., Cormode, G., Duffield, N.G.: Don\u2019t let the negatives bring you down: sampling from streams of signed updates. In: ACM SIGMETRICS\/PERFORMANCE, pp. 343\u2013354 (2012). https:\/\/doi.org\/10.1145\/2254756.2254798","DOI":"10.1145\/2254756.2254798"},{"key":"960_CR9","doi-asserted-by":"publisher","unstructured":"Cohen, E., Kaplan, H.: Tighter estimation using bottom k sketches. Proc. VLDB Endow. 1(1), 213\u2013224 (2008). https:\/\/doi.org\/10.14778\/1453856.1453884","DOI":"10.14778\/1453856.1453884"},{"issue":"1\u20133","key":"960_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1561\/1900000004","volume":"4","author":"G Cormode","year":"2012","unstructured":"Cormode, G., Garofalakis, M.N., Haas, P.J., Jermaine, C.M.: Synopses for massive data: Samples, Histograms, Wavelets. Sketches. Found. Trends Databases 4(1\u20133), 1\u2013294 (2012). https:\/\/doi.org\/10.1561\/1900000004","journal-title":"Sketches. Found. Trends Databases"},{"issue":"1","key":"960_CR11","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1016\/J.JALGOR.2003.12.001","volume":"55","author":"G Cormode","year":"2005","unstructured":"Cormode, G., Muthukrishnan, S.: An improved data stream summary: the Count-Min sketch and its applications. J. Algorithms 55(1), 58\u201375 (2005). https:\/\/doi.org\/10.1016\/J.JALGOR.2003.12.001","journal-title":"J. Algorithms"},{"key":"960_CR12","doi-asserted-by":"publisher","unstructured":"Flajolet, P., \u00c9ric Fusy, Gandouet, O., Meunier, F.: HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. DMTCS AH, 10 (2007). https:\/\/doi.org\/10.46298\/dmtcs.3545","DOI":"10.46298\/dmtcs.3545"},{"issue":"4","key":"960_CR13","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1007\/S00778-004-0135-3","volume":"13","author":"S Ganguly","year":"2004","unstructured":"Ganguly, S., Garofalakis, M.N., Rastogi, R.: Tracking set-expression cardinalities over continuous update streams. VLDB J. 13(4), 354\u2013369 (2004). https:\/\/doi.org\/10.1007\/S00778-004-0135-3","journal-title":"VLDB J."},{"key":"960_CR14","doi-asserted-by":"publisher","unstructured":"Gilbert, A.C., Kotidis, Y., Muthukrishnan, S., Strauss, M.: How to summarize the universe: Dynamic maintenance of quantiles. In: Proc. VDLB, pp. 454\u2013465 (2002). https:\/\/doi.org\/10.1016\/B978-155860869-6\/50047-0","DOI":"10.1016\/B978-155860869-6\/50047-0"},{"key":"960_CR15","unstructured":"Google: Guava: Google core libraries for java. https:\/\/github.com\/google\/guava (2024). Version used: 33.3.1"},{"key":"960_CR16","doi-asserted-by":"publisher","unstructured":"Jayaram, R., Woodruff, D.P.: Data streams with bounded deletions. In: Proc. PODS, pp. 341\u2013354. ACM (2018). https:\/\/doi.org\/10.1145\/3196959.3196986","DOI":"10.1145\/3196959.3196986"},{"key":"960_CR17","unstructured":"Kotz, D., Henderson, T., Abyzov, I., Yeo, J.: CRAWDAD Dartmouth\/campus (v. 2009-09-09) (2022)"},{"key":"960_CR18","doi-asserted-by":"publisher","unstructured":"Li, P., K\u00f6nig, A.C.: b-Bit minwise hashing. In: Proc. WWW, pp. 671\u2013680. ACM (2010). https:\/\/doi.org\/10.1145\/1772690.1772759","DOI":"10.1145\/1772690.1772759"},{"key":"960_CR19","unstructured":"Li, P., Owen, A.B., Zhang, C.: One permutation hashing for efficient search and learning. CoRR abs\/1208.1259 (2012)"},{"key":"960_CR20","doi-asserted-by":"publisher","unstructured":"Liu, Z., Manousis, A., Vorsanger, G., Sekar, V., Braverman, V.: One sketch to rule them all: Rethinking network flow monitoring with univmon. In: Proc. SIGCOMM, pp. 101\u2013114 (2016). https:\/\/doi.org\/10.1145\/2934872.2934906","DOI":"10.1145\/2934872.2934906"},{"issue":"11","key":"960_CR21","doi-asserted-by":"publisher","first-page":"3249","DOI":"10.14778\/3551793.3551867","volume":"15","author":"A Manousis","year":"2022","unstructured":"Manousis, A., Cheng, Z., Basat, R.B., Liu, Z., Sekar, V.: Enabling efficient and general subpopulation analytics in multidimensional data streams. Proc. VLDB Endow. 15(11), 3249\u20133262 (2022). https:\/\/doi.org\/10.14778\/3551793.3551867","journal-title":"Proc. VLDB Endow."},{"key":"960_CR22","unstructured":"Nambiar, R.O., Poess, M.: The making of TPC-DS. In: Proc. VLDB Endow., VLDB \u201906, p. 1049-1058. VLDB Endowment (2006)"},{"key":"960_CR23","doi-asserted-by":"publisher","unstructured":"Pagh, R., St\u00f6ckel, M., Woodruff, D.P.: Is min-wise hashing optimal for summarizing set intersection? In: Proc. PODS, pp. 109\u2013120. ACM (2014). https:\/\/doi.org\/10.1145\/2594538.2594554","DOI":"10.1145\/2594538.2594554"},{"issue":"3","key":"960_CR24","doi-asserted-by":"publisher","first-page":"319","DOI":"10.14778\/3632093.3632098","volume":"17","author":"WR Punter","year":"2023","unstructured":"Punter, W.R., Papapetrou, O., Garofalakis, M.N.: OmniSketch: efficient multi-dimensional high-velocity stream analytics with arbitrary predicates. Proc. VLDB Endow. 17(3), 319\u2013331 (2023)","journal-title":"Proc. VLDB Endow."},{"key":"960_CR25","volume-title":"Mathematical statistics and data analysis","author":"JA Rice","year":"2007","unstructured":"Rice, J.A.: Mathematical statistics and data analysis, vol. 371. Thomson\/Brooks\/Cole Belmont, CA (2007)"},{"key":"960_CR26","unstructured":"Veldhuizen, T.L.: Triejoin: A simple, worst-case optimal join algorithm. In: ICDT, pp. 96\u2013106. OpenProceedings.org (2014)"},{"key":"960_CR27","doi-asserted-by":"publisher","unstructured":"Virtanen, P., Gommers, R., Oliphant, T.E., Haberland, M.e.a.: Scipy 1.0: fundamental algorithms for scientific computing in python. Nature Methods 17(3), 261\u2013272 (2020). https:\/\/doi.org\/10.1038\/s41592-019-0686-2","DOI":"10.1038\/s41592-019-0686-2"},{"issue":"11","key":"960_CR28","doi-asserted-by":"publisher","first-page":"2908","DOI":"10.14778\/3551793.3551840","volume":"15","author":"F Zhang","year":"2022","unstructured":"Zhang, F., Wang, S.: Effective indexing for dynamic structural graph clustering. Proc. VLDB Endow. 15(11), 2908\u20132920 (2022). https:\/\/doi.org\/10.14778\/3551793.3551840","journal-title":"Proc. VLDB Endow."},{"issue":"6","key":"960_CR29","doi-asserted-by":"publisher","first-page":"1215","DOI":"10.14778\/3514061.3514068","volume":"15","author":"F Zhao","year":"2022","unstructured":"Zhao, F., Agrawal, D., Abbadi, A.E., Metwally, A.: SpaceSaving$$\\pm $$ an optimal algorithm for frequency estimation and frequent items in the bounded deletion model. Proc. VLDB Endow. 15(6), 1215\u20131227 (2022). https:\/\/doi.org\/10.14778\/3514061.3514068","journal-title":"Proc. VLDB Endow."},{"issue":"7","key":"960_CR30","doi-asserted-by":"publisher","first-page":"1215","DOI":"10.14778\/3450980.3450990","volume":"14","author":"F Zhao","year":"2021","unstructured":"Zhao, F., Maiyya, S., Wiener, R., Agrawal, D., Abbadi, A.E.: KLL$$\\pm $$ Approximate quantile sketches over dynamic datasets. Proc. VLDB Endow. 14(7), 1215\u20131227 (2021). https:\/\/doi.org\/10.14778\/3450980.3450990","journal-title":"Proc. VLDB Endow."}],"container-title":["The VLDB Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00778-025-00960-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00778-025-00960-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00778-025-00960-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T07:27:44Z","timestamp":1770449264000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00778-025-00960-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1]]}},"alternative-id":["960"],"URL":"https:\/\/doi.org\/10.1007\/s00778-025-00960-6","relation":{},"ISSN":["1066-8888","0949-877X"],"issn-type":[{"value":"1066-8888","type":"print"},{"value":"0949-877X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1]]},"assertion":[{"value":"24 August 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 November 2025","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 December 2025","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 January 2026","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"9"}}