{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:14:50Z","timestamp":1779174890290,"version":"3.51.4"},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2022,4,11]],"date-time":"2022-04-11T00:00:00Z","timestamp":1649635200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2022,6,30]]},"abstract":"<jats:p>Data sketches are approximate succinct summaries of long data streams. They are widely used for processing massive amounts of data and answering statistical queries about it. Existing libraries producing sketches are very fast, but do not allow parallelism for creating sketches using multiple threads or querying them while they are being built. We present a generic approach to parallelising data sketches efficiently and allowing them to be queried in real time, while bounding the error that such parallelism introduces. Utilising relaxed semantics and the notion of strong linearisability, we prove our algorithm\u2019s correctness and analyse the error it induces in some specific sketches. Our implementation achieves high scalability while keeping the error small. We have contributed one of our concurrent sketches to the open-source data sketches library.<\/jats:p>","DOI":"10.1145\/3512758","type":"journal-article","created":{"date-parts":[[2022,4,11]],"date-time":"2022-04-11T10:19:12Z","timestamp":1649672352000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Fast Concurrent Data Sketches"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9625-0140","authenticated-orcid":false,"given":"Arik","family":"Rinberg","sequence":"first","affiliation":[{"name":"Technion, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7148-0414","authenticated-orcid":false,"given":"Alexander","family":"Spiegelman","sequence":"additional","affiliation":[{"name":"Facebook"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6147-6924","authenticated-orcid":false,"given":"Edward","family":"Bortnikov","sequence":"additional","affiliation":[{"name":"Yahoo! Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2443-9822","authenticated-orcid":false,"given":"Eshcar","family":"Hillel","sequence":"additional","affiliation":[{"name":"Yahoo! Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6417-1250","authenticated-orcid":false,"given":"Idit","family":"Keidar","sequence":"additional","affiliation":[{"name":"Technion, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5871-0377","authenticated-orcid":false,"given":"Lee","family":"Rhodes","sequence":"additional","affiliation":[{"name":"Yahoo! Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4396-2536","authenticated-orcid":false,"given":"Hadar","family":"Serviansky","sequence":"additional","affiliation":[{"name":"Weizmann Institute, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,4,11]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Java Executive Committee. 2011. Java Language Specification: Chapter 17 - Threads and Locks. Retrieved from https:\/\/docs.oracle.com\/javase\/specs\/jls\/se7\/html\/jls-17.html."},{"key":"e_1_3_2_3_2","unstructured":"Lee Rhodes. 2015. Theta Sketch Equations. Retrieved from https:\/\/github.com\/apache\/datasketches-website\/blob\/master\/docs\/pdf\/ThetaSketchEquations.pdf."},{"key":"e_1_3_2_4_2","unstructured":"Mehrdad Honarkhah and Arya Talebzadeh. 2018. HyperLogLog in Presto: A Significantly Faster Way to Handle Cardinality Estimation. Retrieved from https:\/\/code.fb.com\/data-infrastructure\/hyperloglog\/."},{"key":"e_1_3_2_5_2","unstructured":"Apache DataSketches Committee. 2019. Apache DataSketches. Retrieved from https:\/\/datasketches.apache.org\/."},{"key":"e_1_3_2_6_2","unstructured":"Github User hpx7. 2019. ArrayIndexOutOfBoundsException During Serialization. Retrieved from https:\/\/github.com\/DataSketches\/sketches-core\/issues\/178#issuecomment-365673204."},{"key":"e_1_3_2_7_2","unstructured":"Apache DataSketches Committee. 2019. DataSketches: Concurrent Theta Sketch Implementation. Retrieved from https:\/\/datasketches.apache.org\/docs\/Theta\/ConcurrentThetaSketch.html."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"Mihai Budiu Parikshit Gopalan Lalith Suresh Udi Wieder Han Kruiger and Marcos K. Aguilera. 2019. Hillview: A Big Data Spreadsheet. Retrieved from https:\/\/research.vmware.com\/projects\/hillview.","DOI":"10.14778\/3342263.3342279"},{"key":"e_1_3_2_9_2","unstructured":"Lee Rhodes. 2019. SketchesArgumentException: Key Not Found and No Empty Slot in Table. Retrieved from https:\/\/groups.google.com\/d\/msg\/sketches-user\/S1PEAneLmhk\/dI8RbN6iBAAJ."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-17653-1_29"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2213556.2213562"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3210377.3210411"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/2858788.2688523"},{"key":"e_1_3_2_14_2","volume-title":"Proceedings of the 33rd International Symposium on Distributed Computing (DISC\u201919)","author":"Attiya Hagit","year":"2019","unstructured":"Hagit Attiya and Constantin Enea. 2019. Putting strong linearizability in context: Preserving hyperproperties in programs that use concurrent objects. In Proceedings of the 33rd International Symposium on Distributed Computing (DISC\u201919). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.5555\/646978.711822"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/1379022.1375591"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/2594538.2594546"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3084693.3104030"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.001"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/1921659.1921667"},{"key":"e_1_3_2_21_2","article-title":"Order statistics","volume":"9","author":"David Herbert Aron","year":"2004","unstructured":"Herbert Aron David and Haikady Navada Nagaraja. 2004. Order statistics. Encyc. Statist. Sci. 9 (2004).","journal-title":"Encyc. Statist. Sci."},{"key":"e_1_3_2_22_2","article-title":"How We Scaled HyperLogLog: Three Real-World Optimizations","unstructured":"Druid. 2010. How We Scaled HyperLogLog: Three Real-World Optimizations. Retrieved from http:\/\/druid.io\/blog\/2014\/02\/18\/hyperloglog-optimizations-for-real-world-systems.html.","journal-title":"Retrieved from http:\/\/druid.io\/blog\/2014\/02\/18\/hyperloglog-optimizations-for-real-world-systems.html"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2010.09.021"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.46298\/dmtcs.3545"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/1993636.1993687"},{"key":"e_1_3_2_26_2","first-page":"317","volume-title":"ACM SIGPLAN Notices","author":"Henzinger Thomas A.","year":"2013","unstructured":"Thomas A. Henzinger, Christoph M. Kirsch, Hannes Payer, Ali Sezgin, and Ana Sokolova. 2013. Quantitative relaxation of concurrent data structures. In ACM SIGPLAN Notices, Vol. 48. ACM, 317\u2013328."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452456"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2016.17"},{"key":"e_1_3_2_29_2","volume-title":"Distributed Algorithms","author":"Lynch Nancy A.","year":"1996","unstructured":"Nancy A. Lynch. 1996. Distributed Algorithms. Elsevier."},{"key":"e_1_3_2_30_2","article-title":"Multiqueues: Simpler, faster, and better relaxed concurrent priority queues","author":"Rihani Hamza","year":"2014","unstructured":"Hamza Rihani, Peter Sanders, and Roman Dementiev. 2014. Multiqueues: Simpler, faster, and better relaxed concurrent priority queues. arXiv preprint arXiv:1411.1209 (2014).","journal-title":"arXiv preprint arXiv:1411.1209"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.DISC.2020.2"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196930"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-45174-8_29"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3147.3165"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230544"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512758","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3512758","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:29Z","timestamp":1750183769000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512758"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,11]]},"references-count":34,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,6,30]]}},"alternative-id":["10.1145\/3512758"],"URL":"https:\/\/doi.org\/10.1145\/3512758","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"value":"2329-4949","type":"print"},{"value":"2329-4957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,11]]},"assertion":[{"value":"2020-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-04-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}