{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T16:40:12Z","timestamp":1758904812850,"version":"3.44.0"},"reference-count":61,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"name":"anjing U35 Talent Cultivation Program","award":["No. U (2024) 001"],"award-info":[{"award-number":["No. U (2024) 001"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["NO.62272223, 62402212, U24B20153"],"award-info":[{"award-number":["NO.62272223, 62402212, U24B20153"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004608","name":"Natural Science Foundation of Jiangsu Province","doi-asserted-by":"publisher","award":["BK20241245"],"award-info":[{"award-number":["BK20241245"]}],"id":[{"id":"10.13039\/501100004608","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,9,22]]},"abstract":"<jats:p>Range filters can check whether a queried range is non-empty within a key set, with no false negatives and a low false positive rate. However, existing range filters fail to address recurring false positives in skewed or adversarial queries. In this paper, we propose Hourglass, an adaptive range filter that defends against recurring false positives through lightweight hybrid encoding and semi-sorted adaptivity. Hourglass partitions keys into prefixes, stored in a semi-sorted cuckoo filter, and suffixes, encoded using hybrid encoding schemes based on their sparsity. By preserving the order of fingerprints, the semi-sorted cuckoo filter improves space efficiency. Additionally, Hourglass introduces a new adaptivity strategy that updates fingerprints without violating the semi-sorting order. Further, Hourglass introduces a correlation-aware space allocation model to optimize space across varying key-query correlation degrees. The evaluations show that Hourglass outperforms state-of-the-art range filters under adversarial workloads, achieving a 9.8-35.4X lower false positive rate. Moreover, they demonstrate that Hourglass delivers robust performance on both synthetic and real-world datasets, as well as under varying key-query correlation degrees.<\/jats:p>","DOI":"10.1145\/3749169","type":"journal-article","created":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T17:17:03Z","timestamp":1758647823000},"page":"1-26","source":"Crossref","is-referenced-by-count":0,"title":["Hourglass: An Adaptive Range Filter with Lightweight Hybrid Encoding"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-2537-3522","authenticated-orcid":false,"given":"Feifan","family":"Liu","sequence":"first","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1565-9997","authenticated-orcid":false,"given":"Rong","family":"Gu","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5764-960X","authenticated-orcid":false,"given":"Meng","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0545-8187","authenticated-orcid":false,"given":"Haipeng","family":"Dai","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-4593-4120","authenticated-orcid":false,"given":"Baohan","family":"Wang","sequence":"additional","affiliation":[{"name":"China Mobile (Suzhou) Software Technology Co., Ltd, Suzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-0892-6026","authenticated-orcid":false,"given":"Jie","family":"Tao","sequence":"additional","affiliation":[{"name":"China Mobile (Suzhou) Software Technology Co., Ltd, Suzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0422-5285","authenticated-orcid":false,"given":"Dian","family":"Shen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Southeast University, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,23]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556556"},{"key":"e_1_2_1_2_1","first-page":"209","article-title":"Evaluating spatial-keyword queries on streaming data. In SIGSPATIAL","author":"Almaslukh Abdulaziz","year":"2018","unstructured":"Abdulaziz Almaslukh and Amr Magdy. 2018. Evaluating spatial-keyword queries on streaming data. In SIGSPATIAL, . ACM, 209-218.","journal-title":"ACM"},{"key":"e_1_2_1_3_1","first-page":"182","article-title":"Bloom Filters, Adaptivity, and the Dictionary Problem. In FOCS","author":"Bender Michael A.","year":"2018","unstructured":"Michael A. Bender, Martin Farach-Colton, Mayank Goswami, Rob Johnson, Samuel McCauley, and Shikha Singh. 2018. Bloom Filters, Adaptivity, and the Dictionary Problem. In FOCS, IEEE Computer Society, 182-193.","journal-title":"IEEE Computer Society"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_2_1_5_1","first-page":"126","article-title":"Web Caching and Zipf-like Distributions: Evidence and Implications. In INFOCOM","author":"Breslau Lee","year":"1999","unstructured":"Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. 1999. Web Caching and Zipf-like Distributions: Evidence and Implications. In INFOCOM, IEEE Computer Society, 126-134.","journal-title":"IEEE Computer Society"},{"key":"e_1_2_1_6_1","first-page":"2304","article-title":"Weighted Bloom filter. In ISIT","author":"Bruck Jehoshua","year":"2006","unstructured":"Jehoshua Bruck, Jie Gao, and Anxiao Jiang. 2006. Weighted Bloom filter. In ISIT, IEEE, 2304-2308.","journal-title":"IEEE"},{"key":"e_1_2_1_7_1","first-page":"89","article-title":"LogKV: Exploiting key-value stores for event log processing. In CIDR","author":"Cao Zhao","year":"2013","unstructured":"Zhao Cao, Shimin Chen, Feifei Li, Min Wang, and X Sean Wang. 2013. LogKV: Exploiting key-value stores for event log processing. In CIDR, ACM, 89-94.","journal-title":"ACM"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2325"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.246"},{"key":"e_1_2_1_10_1","unstructured":"Guanduo Chen. 2024. Oasis. https:\/\/github.com\/Woooooow-Pro\/Oasis-RangeFilter."},{"key":"e_1_2_1_11_1","first-page":"1911","article-title":"Oasis","volume":"17","author":"Chen Guanduo","year":"2024","unstructured":"Guanduo Chen, Meng Li, Siqiang Luo, and Zhenying He. 2024. Oasis: An Optimal Disjoint Segmented Learned Range Filter. Proc. VLDB Endow., Vol. 17, 8 (2024), 1911-1924.","journal-title":"An Optimal Disjoint Segmented Learned Range Filter. Proc. VLDB Endow."},{"key":"e_1_2_1_12_1","first-page":"1087","article-title":"Realtime Data Processing at Facebook. In SIGMOD","author":"Chen Guoqiang Jerry","year":"2016","unstructured":"Guoqiang Jerry Chen, Janet L. Wiener, Shridhar Iyer, Anshul Jaiswal, Ran Lei, Nikhil Simha, Wei Wang, Kevin Wilfong, Tim Williamson, and Serhat Yilmaz. 2016. Realtime Data Processing at Facebook. In SIGMOD, ACM, 1087-1098.","journal-title":"ACM"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1137\/070710111"},{"key":"e_1_2_1_14_1","unstructured":"Marco Costa. 2024. Grafite. https:\/\/github.com\/marcocosta97\/grafite."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639258"},{"key":"e_1_2_1_16_1","first-page":"259","article-title":"CSRQ","volume":"16","author":"Dai Hua","year":"2016","unstructured":"Hua Dai, Qingqun Ye, Geng Yang, Jia Xu, and Ruiliang He. 2016. CSRQ: Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks. Sensors, Vol. 16, 2 (2016), 259-265.","journal-title":"Communication-Efficient Secure Range Queries in Two-Tiered Sensor Networks. Sensors"},{"key":"e_1_2_1_17_1","first-page":"11700","article-title":"Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier with Application to Real-Time Information Filtering on the Web. In NeurIPS, Curran Associates","author":"Dai Zhenwei","year":"2020","unstructured":"Zhenwei Dai and Anshumali Shrivastava. 2020. Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier with Application to Real-Time Information Filtering on the Web. In NeurIPS, Curran Associates, Inc., 11700-11710.","journal-title":"Inc."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/3436905.3436919"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/321812.321820"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3698820"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2674005.2674994"},{"volume-title":"On the number of bits required to implement an associative memory","author":"Fano Robert Mario","key":"e_1_2_1_22_1","unstructured":"Robert Mario Fano. 1971. On the number of bits required to implement an associative memory. Massachusetts Institute of Technology, Project MAC."},{"key":"e_1_2_1_23_1","volume-title":"Kasper Green Larsen, and Rasmus Pagh.","author":"Goswami Mayank","year":"2015","unstructured":"Mayank Goswami, Allan Gr\u00f8nlund J\u00f8rgensen, Kasper Green Larsen, and Rasmus Pagh. 2015. Approximate Range Emptiness in Constant Time and Optimal Space. In SODA, SIAM, 769-775."},{"key":"e_1_2_1_24_1","first-page":"13","article-title":"Real-Time Range Query Approximation by Means of Adaptive Quad Streaming. In SENSORNETS","author":"Keller Simon","year":"2021","unstructured":"Simon Keller and Rainer Mueller. 2021. Real-Time Range Query Approximation by Means of Adaptive Quad Streaming. In SENSORNETS, SCITEPRESS, 13-24.","journal-title":"SCITEPRESS"},{"key":"e_1_2_1_25_1","unstructured":"Eric R. Knorr. 2022. Proteus. https:\/\/github.com\/Eric-R-Knorr\/Proteus."},{"key":"e_1_2_1_26_1","first-page":"1670","article-title":"Proteus: A Self-Designing Range Filter. In SIGMOD","author":"Knorr Eric R.","year":"2022","unstructured":"Eric R. Knorr, Baptiste Lemaire, Andrew Lim, Siqiang Luo, Huanchen Zhang, Stratos Idreos, and Michael Mitzenmacher. 2022. Proteus: A Self-Designing Range Filter. In SIGMOD, ACM, 1670-1684.","journal-title":"ACM"},{"volume-title":"WADS","author":"Kopelowitz Tsvi","key":"e_1_2_1_27_1","unstructured":"Tsvi Kopelowitz, Samuel McCauley, and Ely Porat. 2021. Support Optimality and Adaptive Cuckoo Filters. In WADS, Springer, 556-570."},{"key":"e_1_2_1_28_1","first-page":"1","article-title":"Telescoping Filter: A Practical Adaptive Filter. In ESA","volume":"60","author":"Lee David J.","year":"2021","unstructured":"David J. Lee, Samuel McCauley, Shikha Singh, and Max Stein. 2021. Telescoping Filter: A Practical Adaptive Filter. In ESA, Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik, 60:1-60:18.","journal-title":"Schloss Dagstuhl - Leibniz-Zentrum f\u00fcr Informatik"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3709736"},{"key":"e_1_2_1_30_1","first-page":"2759","article-title":"Seesaw Counting Filter: An Efficient Guardian for Vulnerable Negative Keys During Dynamic Filtering. In WWW","author":"Li Meng","year":"2022","unstructured":"Meng Li, Deyi Chen, Haipeng Dai, Rongbiao Xie, Siqiang Luo, Rong Gu, Tong Yang, and Guihai Chen. 2022a. Seesaw Counting Filter: An Efficient Guardian for Vulnerable Negative Keys During Dynamic Filtering. In WWW, ACM, 2759-2767.","journal-title":"ACM"},{"key":"e_1_2_1_31_1","first-page":"1940","article-title":"The Reinforcement Cuckoo Filter. In INFOCOM","author":"Li Meng","year":"2024","unstructured":"Meng Li, Wenqi Luo, Haipeng Dai, Huayi Chai, Rong Gu, Xiaoyu Wang, and Guihai Chen. 2024. The Reinforcement Cuckoo Filter. In INFOCOM, IEEE, 1940-1949.","journal-title":"IEEE"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-022-00755-z"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733085.2733100"},{"key":"e_1_2_1_34_1","first-page":"1","article-title":"Algorithmic improvements for fast concurrent cuckoo hashing. In EuroSys","author":"Li Xiaozhou","year":"2014","unstructured":"Xiaozhou Li, David G Andersen, Michael Kaminsky, and Michael J Freedman. 2014a. Algorithmic improvements for fast concurrent cuckoo hashing. In EuroSys, ACM, 1-14.","journal-title":"ACM"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11036-011-0298-2"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389731"},{"volume-title":"ALENEX","author":"Mitzenmacher Michael","key":"e_1_2_1_37_1","unstructured":"Michael Mitzenmacher, Salvatore Pontarelli, and Pedro Reviriego. 2018. Adaptive Cuckoo Filters. In ALENEX, SIAM, 36-47."},{"key":"e_1_2_1_38_1","volume-title":"Proc. ACM Manag. Data","volume":"3","author":"Mo Dingheng","year":"2025","unstructured":"Dingheng Mo, Siqiang Luo, and Stratos Idreos. 2025. How to Grow an LSM-tree? Towards Bridging the Gap Between Theory and Practice. Proc. ACM Manag. Data, Vol. 3, 3 (2025), 173:1-173:25."},{"key":"e_1_2_1_39_1","unstructured":"Bernhard M\u00f6\u00dfner. 2023. BloomRF. https:\/\/github.com\/awitten1\/bloomRF."},{"key":"e_1_2_1_40_1","unstructured":"Bernhard M\u00f6\u00dfner Christian Riegger Arthur Bernhardt and Ilia Petrov. 2023. bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing. In EDBT OpenProceedings.org 131-143."},{"key":"e_1_2_1_41_1","first-page":"775","article-title":"A General-Purpose Counting Filter: Making Every Bit Count. In SIGMOD","author":"Pandey Prashant","year":"2017","unstructured":"Prashant Pandey, Michael A. Bender, Rob Johnson, and Rob Patro. 2017. A General-Purpose Counting Filter: Making Every Bit Count. In SIGMOD, ACM, 775-787.","journal-title":"ACM"},{"key":"e_1_2_1_42_1","first-page":"5271","article-title":"Meta-Learning Neural Bloom Filters. In ICML","author":"Rae Jack W.","year":"2019","unstructured":"Jack W. Rae, Sergey Bartunov, and Timothy P. Lillicrap. 2019. Meta-Learning Neural Bloom Filters. In ICML, PMLR, 5271-5280.","journal-title":"PMLR"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639290"},{"key":"e_1_2_1_44_1","volume-title":"ACM","author":"Su Yunxiang","year":"2023","unstructured":"Yunxiang Su, Wenxuan Ma, and Shaoxu Song. 2023. Learning Autoregressive Model in LSM-Tree based Store. In KDD, ACM, 2061-2071."},{"key":"e_1_2_1_45_1","first-page":"68","article-title":"LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes. In ICDCS","author":"Teng Dejun","year":"2017","unstructured":"Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Siyuan Ma, Yanfeng Zhang, and Xiaodong Zhang. 2017. LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes. In ICDCS, IEEE, 68-79.","journal-title":"IEEE"},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Kapil Vaidya. 2022. SNARF. https:\/\/github.com\/kapilvaidya24\/SNARF.","DOI":"10.14778\/3529337.3529347"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529347"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2024.3403997"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654944"},{"volume-title":"DASFAA","author":"Wang Xiujun","key":"e_1_2_1_50_1","unstructured":"Xiujun Wang, Yusheng Ji, Zhe Dang, Xiao Zheng, and Baohua Zhao. 2015. Improved Weighted Bloom Filter and Space Lower Bound Analysis of Algorithms for Approximated Membership Querying. In DASFAA, Springer, 346-362."},{"key":"e_1_2_1_51_1","first-page":"5849","article-title":"HotKey-LSM: A Hotness-Aware LSM-Tree for Big Data Storage. In BigData","author":"Wang Yi","year":"2020","unstructured":"Yi Wang, Peiquan Jin, and Shouhong Wan. 2020. HotKey-LSM: A Hotness-Aware LSM-Tree for Big Data Storage. In BigData, IEEE, 5849-5851.","journal-title":"IEEE"},{"key":"e_1_2_1_52_1","unstructured":"Ziwei Wang. 2023. REncoder. https:\/\/github.com\/Range-Filter\/REncoder."},{"key":"e_1_2_1_53_1","first-page":"2036","article-title":"REncoder: A Space-Time Efficient Range Filter with Local Encoder. In ICDE","author":"Wang Ziwei","year":"2023","unstructured":"Ziwei Wang, Zheng Zhong, Jiarui Guo, Yuhan Wu, Haoyu Li, Tong Yang, Yaofeng Tu, Huanchen Zhang, and Bin Cui. 2023. REncoder: A Space-Time Efficient Range Filter with Local Encoder. In ICDE, IEEE, 2036-2049.","journal-title":"IEEE"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3677128"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407803"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3677138"},{"key":"e_1_2_1_57_1","unstructured":"Huanchen Zhang. 2018. SuRF. https:\/\/github.com\/efficient\/SuRF."},{"key":"e_1_2_1_58_1","first-page":"323","article-title":"SuRF: Practical Range Query Filtering with Fast Succinct Tries. In SIGMOD","author":"Zhang Huanchen","year":"2018","unstructured":"Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, Kimberly Keeton, and Andrew Pavlo. 2018. SuRF: Practical Range Query Filtering with Fast Succinct Tries. In SIGMOD, ACM, 323-336.","journal-title":"ACM"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/3547305.3547320"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2021.3113994"},{"key":"e_1_2_1_61_1","volume-title":"REMIX: Efficient Range Query for LSM-trees","author":"Zhong Wenshao","year":"2021","unstructured":"Wenshao Zhong, Chen Chen, Xingbo Wu, and Song Jiang. 2021. REMIX: Efficient Range Query for LSM-trees. In FAST. USENIX Association, 51-64."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3749169","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T16:22:59Z","timestamp":1758903779000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3749169"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,22]]},"references-count":61,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,9,22]]}},"alternative-id":["10.1145\/3749169"],"URL":"https:\/\/doi.org\/10.1145\/3749169","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2025,9,22]]}}}