{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:17:12Z","timestamp":1779175032245,"version":"3.51.4"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Range filters are compact probabilistic data structures that answer approximate range emptiness queries. They are used in many domains, e.g., in key-value stores, to quickly rule out the existence of keys in a given query range and avoid having to search for them in storage. However, all existing range filters exhibit at least one of the following shortcomings: (1) they do not provide robust false positive rate and performance guarantees, (2) they do not support variable-length keys and query ranges, and (3) they do not allow dynamic operations such as insertions, deletions, or expansions.<\/jats:p>\n          <jats:p>We introduce Diva, the first range filter to address all the above challenges simultaneously. Diva learns the dataset's distribution by sampling keys and storing them in a cache-efficient trie. It compresses the keys in-between samples by removing their longest common prefix and truncating their suffixes while leaving enough bits in the middle (i.e., an infix) to allow differentiating between the keys in the sorted order. It stores infixes in constant time dynamic data blocks, which it splits to handle insertions and expansions. It processes a range query by traversing the trie and checking for the inclusion of infixes in the target query range.<\/jats:p>\n          <jats:p>We show, theoretically and empirically, that Diva achieves a false positive rate on par with the state of the art on real-world datasets while supporting dynamicity and variable-length queries and keys.<\/jats:p>","DOI":"10.14778\/3749646.3749664","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"3923-3936","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Diva: Dynamic Range Filter for Var-Length Keys and Queries"],"prefix":"10.14778","volume":"18","author":[{"given":"Navid","family":"Eslami","sequence":"first","affiliation":[{"name":"University of Toronto, Toronto, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ioana O.","family":"Bercea","sequence":"additional","affiliation":[{"name":"KTH Royal Institute of Technology, Stockholm, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niv","family":"Dayan","sequence":"additional","affiliation":[{"name":"University of Toronto, Toronto, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2025. EnWiki. Retrieved May 27 2025 from https:\/\/dumps.wikimedia.org\/enwiki\/latest\/"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556556"},{"key":"e_1_2_1_3_1","unstructured":"Apache. 2024. Apache Cassandra. Retrieved October 6 2024 from https:\/\/cassandra.apache.org\/_\/index.html"},{"key":"e_1_2_1_4_1","unstructured":"Apache. 2024. Apache Druid. Retrieved October 6 2024 from https:\/\/druid.apache.org\/"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.3129"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/301250.301323"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.2002.1822"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.SWAT.2024.9"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196896"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1080\/15427951.2004.10129096"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFCS.1985.48"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/3659437.3659447"},{"key":"e_1_2_1_13_1","volume-title":"BigQuery Enterprise Data Warehouse. Retrieved","author":"Cloud Google","year":"2024","unstructured":"Google Cloud. 2024. BigQuery Enterprise Data Warehouse. Retrieved October 6, 2024 from https:\/\/cloud.google.com\/bigquery\/"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/356770.356776"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC'20). USENIX Association, USA, Article 4, 15 pages.","author":"Conway Alex","year":"2020","unstructured":"Alex Conway, Abhishek Gupta, Vijay Chidambaran, Martin Farach-Colton, Rick Spillane, Amy Tai, and Rob Johnson. 2020. SplinterDB: closing the bandwidth gap for NVMe key-value stores. In Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC'20). USENIX Association, USA, Article 4, 15 pages."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_17_1","volume-title":"Grafite: Taming Adversarial Queries with Optimal Range Filters. arXiv:2311.15380 [cs.DS]","author":"Costa Marco","year":"2023","unstructured":"Marco Costa, Paolo Ferragina, and Giorgio Vinciguerra. 2023. Grafite: Taming Adversarial Queries with Optimal Range Filters. arXiv:2311.15380 [cs.DS]"},{"key":"e_1_2_1_18_1","volume-title":"Aleph Filter: To Infinity in Constant Time. arXiv:2404.04703 [cs.DB] https:\/\/arxiv.org\/abs\/2404.04703","author":"Dayan Niv","year":"2024","unstructured":"Niv Dayan, Ioana Bercea, and Rasmus Pagh. 2024. Aleph Filter: To Infinity in Constant Time. arXiv:2404.04703 [cs.DB] https:\/\/arxiv.org\/abs\/2404.04703"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589285"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457273"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3483840"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/321812.321820"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3698820"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-024-00873-w"},{"key":"e_1_2_1_25_1","volume-title":"On the Number of Bits Required to Implement an Associative Memory","author":"Fano R.M.","unstructured":"R.M. Fano. 1971. On the Number of Bits Required to Implement an Associative Memory. MIT Project MAC Computer Structures Group. https:\/\/books.google.ca\/books?id=07DeGwAACAAJ"},{"key":"e_1_2_1_26_1","volume-title":"MariaDB Server: The Innovative Open Source Database. Retrieved","author":"Foundation DB","year":"2024","unstructured":"MariaDB Foundation. 2024. MariaDB Server: The Innovative Open Source Database. Retrieved October 6, 2024 from https:\/\/mariadb.org\/"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/2722129.2722181"},{"key":"e_1_2_1_28_1","volume-title":"SOSD: A Benchmark for Learned Indexes. NeurIPS Workshop on Machine Learning for Systems","author":"Kipf Andreas","year":"2019","unstructured":"Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann. 2019. SOSD: A Benchmark for Learned Indexes. NeurIPS Workshop on Machine Learning for Systems (2019)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526167"},{"key":"e_1_2_1_30_1","volume-title":"Sorting And Searching","author":"Knuth Donald","unstructured":"Donald Knuth. 1973. The Art Of Computer Programming, vol. 3: Sorting And Searching. Addison-Wesley. 391\u2013392 pages."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196909"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544812"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389731"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2168836.2168855"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3421424.3421425"},{"key":"e_1_2_1_36_1","volume-title":"Retrieved","author":"DB.","year":"2024","unstructured":"MongoDB. 2024. WiredTiger Storage Engine. Retrieved January 15, 2024 from https:\/\/www.mongodb.com\/docs\/manual\/core\/wiredtiger\/"},{"key":"e_1_2_1_37_1","unstructured":"Bernhard M\u00f6\u00dfner Christian Riegger Arthur Bernhardt and Ilia Petrov. 2022. bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing. arXiv:2207.04789 [cs.DB]"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s002360050048"},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Rasmus Pagh Gil Segev and Udi Wieder. 2013. How to Approximate A Set Without Knowing Its Size In Advance. arXiv:1304.1188 [cs.DS] https:\/\/arxiv.org\/abs\/1304.1188","DOI":"10.1109\/FOCS.2013.17"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2013.17"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035963"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3626246.3654681"},{"key":"e_1_2_1_43_1","volume-title":"Database Management Systems (3 ed.)","author":"Ramakrishnan Raghu","unstructured":"Raghu Ramakrishnan and Johannes Gehrke. 2002. Database Management Systems (3 ed.). McGraw-Hill, Inc., USA."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453914"},{"key":"e_1_2_1_45_1","volume-title":"The Snowflake AI Data Cloud. Retrieved","year":"2024","unstructured":"Snowflake. 2024. The Snowflake AI Data Cloud. Retrieved October 6, 2024 from https:\/\/www.snowflake.com\/en\/"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529347"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00158"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(83)90075-3"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303955"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/355602.361319"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196931"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749664","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T03:03:02Z","timestamp":1757041382000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3749646.3749664"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":51,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749664"],"URL":"https:\/\/doi.org\/10.14778\/3749646.3749664","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}