{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T06:26:57Z","timestamp":1778048817417,"version":"3.51.4"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,2,10]],"date-time":"2025-02-10T00:00:00Z","timestamp":1739145600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100006374","name":"NSF","doi-asserted-by":"publisher","award":["210616,2114218"],"award-info":[{"award-number":["210616,2114218"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,2,10]]},"abstract":"<jats:p>\n                    Many key-value stores and database systems use log-structured merge-trees (LSM-trees) as their storage engines because of their excellent write performance. However, the read performance of LSM-trees is suboptimal due to the overlapping sorted runs. Most existing efforts rely on filters to reduce unnecessary I\/Os, but filters fundamentally do not help locate items and often become the bottleneck of the system. We identify that the lack of efficient index is the root cause of subpar read performance in LSM-trees. In this paper, we propose Disco: a compact index for LSM-trees. Disco indexes all the keys in an LSM-tree, so a query does not have to search every run of the LSM-tree. It records compact key representations to minimize the number of key comparisons so as to minimize cache misses and I\/Os for both point and range queries. Disco guarantees that both point queries and seeks issue at most one I\/O to the underlying runs, achieving an I\/O efficiency close to a B\n                    <jats:sup>+<\/jats:sup>\n                    -tree. Disco improves upon REMIX's pioneering multi-run index design with additional compact key representations to help improve read performance. The representations are compact so the cost of persisting Disco to disk is small. Moreover, while a traditional LSM-tree has to choose a more aggressive compaction policy that slows down write performance to have better read performance, a Disco-indexed LSM-tree can employ a write-efficient policy and still have good read performance. Experimental results show that Disco can save I\/Os and improve point and range query performance by up to 220% over RocksDB while maintaining efficient writes.\n                  <\/jats:p>","DOI":"10.1145\/3709683","type":"journal-article","created":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T15:45:06Z","timestamp":1739288706000},"page":"1-27","source":"Crossref","is-referenced-by-count":2,"title":["Disco: A Compact Index for LSM-trees"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2118-0234","authenticated-orcid":false,"given":"Wenshao","family":"Zhong","sequence":"first","affiliation":[{"name":"University of Illinois Chicago, Chicago, IL, USA, &amp; TikTok Inc., Bellevue, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5799-0194","authenticated-orcid":false,"given":"Chen","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Illinois Chicago, Chicago, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1649-2612","authenticated-orcid":false,"given":"Xingbo","family":"Wu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Cambridge, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4848-7503","authenticated-orcid":false,"given":"Jakob","family":"Eriksson","sequence":"additional","affiliation":[{"name":"University of Illinois Chicago, Chicago, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,2,11]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"11th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2019","author":"Ahn Jung-Sang","year":"2019","unstructured":"Jung-Sang Ahn, Mohiuddin Abdul Qader, Woon-Hak Kang, Hieu Nguyen, Guogen Zhang, and Sami Ben-Romdhane. \"Jungle: Towards Dynamically Adjustable Key-Value Store by Combining LSM-Tree and Copy-On-Write B-Tree\". In: 11th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2019, Renton, WA, USA, July 8--9, 2019. 2019."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2254756.2254766"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/320521.320530"},{"issue":"5","key":"e_1_2_2_4_1","article-title":"An Introduction to B\u03b5-trees and Write-Optimization","volume":"40","author":"Bender Michael A.","year":"2015","unstructured":"Michael A. Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C. Kuszmaul, Donald E. Porter, Jun Yuan, and Yang Zhan. \"An Introduction to B\u03b5-trees and Write-Optimization\". In: login Usenix Mag. 40.5 (2015).","journal-title":"login Usenix Mag."},{"key":"e_1_2_2_5_1","unstructured":"Daniel J. Berstein. Crit-bit trees. url: https:\/\/cr.yp.to\/critbit.html."},{"key":"e_1_2_2_6_1","first-page":"521","volume-title":"Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018","author":"Binna Robert","year":"2018","unstructured":"Robert Binna, Eva Zangerle, Martin Pichl, G\u00fcnther Specht, and Viktor Leis. \"HOT: A Height Optimized Trie Index for Main-Memory Database Systems\". In: Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10--15, 2018. 2018, pp. 521--534."},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_2_2_8_1","first-page":"546","volume-title":"Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 12--14, 2003","author":"Brodal Gerth St\u00f8lting","year":"2003","unstructured":"Gerth St\u00f8lting Brodal and Rolf Fagerberg. \"Lower bounds for external memory dictionaries\". In: Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 12--14, 2003, Baltimore, Maryland, USA. 2003, pp. 546--554."},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/3386691.3386712"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3492321.3519555"},{"key":"e_1_2_2_11_1","unstructured":"F\u00e9lix Cloutier. PEXT - Parallel Bits Extract. url: https:\/\/www.felixcloutier.com\/x86\/pext."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588726"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_2_14_1","first-page":"155","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020","author":"Dai Yifan","year":"2020","unstructured":"Yifan Dai, Yien Xu, Aishwarya Ganesan, Ramnatthan Alagappan, Brian Kroth, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. \"From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees\". In: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020, Virtual Event, November 4--6, 2020. 2020, pp. 155--171."},{"key":"e_1_2_2_15_1","unstructured":"Our World In Data. Historical cost of computer memory and storage. url: https:\/\/ourworldindata.org\/grapher\/historical-cost-of-computer-memory-and-storage."},{"key":"e_1_2_2_16_1","first-page":"505","volume-title":"Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018","author":"Dayan Niv","year":"2018","unstructured":"Niv Dayan and Stratos Idreos. \"Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging\". In: Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10--15, 2018. 2018, pp. 505--520."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3319903"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457273"},{"key":"e_1_2_2_19_1","first-page":"33","volume-title":"19th USENIX Conference on File and Storage Technologies, FAST 2021","author":"Dong Siying","year":"2021","unstructured":"Siying Dong, Andrew Kryczka, Yanqin Jin, and Michael Stumm. \"Evolution of Development Priorities in Key-value Stores Serving Large-scale Applications: The RocksDB Experience\". In: 19th USENIX Conference on File and Storage Technologies, FAST 2021, February 23--25, 2021. 2021, pp. 33--49."},{"key":"e_1_2_2_20_1","unstructured":"Facebook. Index Block Format. url: https:\/\/github.com\/facebook\/rocksdb\/wiki\/Index-Block-Format."},{"key":"e_1_2_2_21_1","unstructured":"Facebook. RocksDB. url: https:\/\/github.com\/facebook\/rocksdb."},{"key":"e_1_2_2_22_1","unstructured":"Facebook. RocksDB Tuning Guide. url: https:\/\/github.com\/facebook\/rocksdb\/wiki\/RocksDBTuning-Guide (visited on 04\/01\/2024)."},{"key":"e_1_2_2_23_1","first-page":"1","volume-title":"Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13--17, 1990","author":"Michael","year":"1990","unstructured":"Michael L. Fredman and Dan E. Willard. \"BLASTING through the Information Theoretic Barrier with FUSION TREES\". In: Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13--17, 1990, Baltimore, Maryland, USA. 1990, pp. 1--7."},{"key":"e_1_2_2_24_1","volume-title":"Proceedings of the Fifteenth EuroSys Conference 2020 (EuroSys'20)","author":"Gilad Eran","year":"2020","unstructured":"Eran Gilad, Edward Bortnikov, Anastasia Braginsky, Yonatan Gottesman, Eshcar Hillel, Idit Keidar, Nurit Moscovici, and Rana Shahout. \"EvenDB: optimizing key-value storage for spatial locality\". In: Proceedings of the Fifteenth EuroSys Conference 2020 (EuroSys'20). 2020, 27:1--27:16."},{"key":"e_1_2_2_25_1","unstructured":"Google. LevelDB. url: https:\/\/github.com\/google\/leveldb."},{"key":"e_1_2_2_26_1","unstructured":"Google. leveldb File format. url: https:\/\/github.com\/google\/leveldb\/blob\/main\/doc\/table_format.md."},{"key":"e_1_2_2_27_1","unstructured":"Andy Goth. critbit. url: https:\/\/wiki.tcl-lang.org\/page\/critbit."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457297"},{"key":"e_1_2_2_29_1","unstructured":"Intel. Intel Optane SSD 905P Series 960GB 2.5in PCIe x4 3D XPoint. url: https:\/\/www.intel. com\/content\/www\/us\/en\/products\/sku\/147529\/intel-optane-ssd-905p-series-960gb-2--5inpcie-x4--3d-xpoint\/specifications.html."},{"key":"e_1_2_2_30_1","unstructured":"Intel. Intel Solid-State Drive DC S3500 Series. url: https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/product-specifications\/ssd-dc-s3500-spec.pdf."},{"key":"e_1_2_2_31_1","first-page":"1670","volume-title":"SIGMOD '22: International Conference on Management of Data","author":"Knorr Eric R.","year":"2022","unstructured":"Eric R. Knorr, Baptiste Lemaire, Andrew Lim, Siqiang Luo, Huanchen Zhang, Stratos Idreos, and Michael Mitzenmacher. \"Proteus: A Self-Designing Range Filter\". In: SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022. 2022, pp. 1670--1684."},{"key":"e_1_2_2_32_1","unstructured":"Cockroach Labs. Cockroach Labs the company building CockroachDB. url: https:\/\/www.cockroachlabs.com\/."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1773912.1773922"},{"key":"e_1_2_2_34_1","first-page":"1195","volume-title":"Proc. VLDB Endow. 3.1--2","author":"Li Yinan","year":"2010","unstructured":"Yinan Li, Bingsheng He, Robin Jun Yang, Qiong Luo, and Ke Yi. \"Tree Indexing on Solid State Drives\". In: Proc. VLDB Endow. 3.1--2 (2010), pp. 1195--1206."},{"key":"e_1_2_2_35_1","first-page":"133","volume-title":"14th USENIX Conference on File and Storage Technologies (FAST'16)","author":"Lu Lanyue","year":"2016","unstructured":"Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. \"WiscKey: Separating Keys from Values in SSD-conscious Storage\". In: 14th USENIX Conference on File and Storage Technologies (FAST'16). 2016, pp. 133--148."},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389731"},{"key":"e_1_2_2_37_1","unstructured":"John C. McCallum. SSD and Flash Memory Price Decreasing with Time. url: https:\/\/jcmit.net\/ flash2015.htm."},{"key":"e_1_2_2_38_1","first-page":"23","volume-title":"International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, ADMS@VLDB 2022","author":"Mun Ju Hyoung","year":"2022","unstructured":"Ju Hyoung Mun, Zichen Zhu, Aneesh Raman, and Manos Athanassoulis. \"LSM-Trees Under (Memory) Pressure\". In: International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, ADMS@VLDB 2022, Sydney, Australia, September 5, 2022. 2022, pp. 23--35."},{"key":"e_1_2_2_39_1","unstructured":"Arjun Narayan and Peter Mattis. Why we built cockroachdb on top of rocksdb. url: https: \/\/cockroachlabs.com\/blog\/cockroachdb-on-rocksd\/#fast-scans."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/s002360050048"},{"key":"e_1_2_2_41_1","first-page":"69","volume-title":"20th USENIX Conference on File and Storage Technologies, FAST 2022","author":"Qiao Yifan","year":"2022","unstructured":"Yifan Qiao, Xubin Chen, Ning Zheng, Jiangpeng Li, Yang Liu, and Tong Zhang. \"Closing the B-tree vs. LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression\". In: 20th USENIX Conference on File and Storage Technologies, FAST 2022, Santa Clara, CA, USA, February 22--24, 2022. 2022, pp. 69--82."},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1326542.1326544"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476274"},{"key":"e_1_2_2_44_1","unstructured":"ScyllaDB. ScyllaDB. url: https:\/\/github.com\/scylladb\/scylla."},{"key":"e_1_2_2_45_1","unstructured":"Facebook Open Source. MyRocks | A RocksDB storage engine with MySQL. url: http:\/\/myrocks.io\/."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3542929.3563479"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529347"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654944"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00158"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303955"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196931"},{"key":"e_1_2_2_52_1","volume-title":"10th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2018","author":"Zhang Yueming","year":"2018","unstructured":"Yueming Zhang, Yongkun Li, Fan Guo, Cheng Li, and Yinlong Xu. \"ElasticBF: Fine-grained and Elastic Bloom Filter Towards Efficient Read for LSM-tree-based KV Stores\". In: 10th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2018, Boston, MA, USA, July 9--10, 2018. 2018."},{"key":"e_1_2_2_53_1","first-page":"51","volume-title":"19th USENIX Conference on File and Storage Technologies (FAST 21)","author":"Zhong Wenshao","year":"2021","unstructured":"Wenshao Zhong, Chen Chen, Xingbo Wu, and Song Jiang. \"REMIX: Efficient Range Query for LSM-trees\". In: 19th USENIX Conference on File and Storage Technologies (FAST 21). 2021, pp. 51--64."},{"key":"e_1_2_2_54_1","first-page":"17","volume-title":"Proceedings of the VLDB 2023 PhD Workshop co-located with the 49th International Conference on Very Large Data Bases (VLDB 2023","volume":"3452","author":"Zhu Zichen","year":"2023","unstructured":"Zichen Zhu. \"SHaMBa: Reducing Bloom Filter Overhead in LSM Trees\". In: Proceedings of the VLDB 2023 PhD Workshop co-located with the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28, 2023. Vol. 3452. 2023, pp. 17--20."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709683","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3709683","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:23:17Z","timestamp":1774981397000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3709683"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,10]]},"references-count":54,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,2,10]]}},"alternative-id":["10.1145\/3709683"],"URL":"https:\/\/doi.org\/10.1145\/3709683","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,10]]}}}