{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T09:02:28Z","timestamp":1775638948331,"version":"3.50.1"},"reference-count":68,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,7]]},"abstract":"<jats:p>A B-Tree is the most widely used range index for larger-than-memory data systems. It organizes data in pages (usually 4 KB) that efficiently align with disk IO operations, fully utilizing each IO operation to narrow down the search space. On the other hand, a B-Tree's page-based organization leads to inefficient caching and high write amplification, as it needs to cache the entire page as a whole while often only a small subset of records are hot, and it needs to write the entire page for a single record update.<\/jats:p>\n          <jats:p>\n            The key insight of this paper is to\n            <jats:italic>separate cache pages from disk pages<\/jats:italic>\n            , i.e., a cache page is no longer a pure mirror of its disk content, but instead, it forms a judiciously chosen subset of the disk page that is worth caching, and can absorb both read and write operations in a consistent manner. Based on this insight, we propose Bf-Tree, a modern B-Tree that is\n            <jats:italic>read-write-optimized<\/jats:italic>\n            by building a new variable-length buffer pool to manage such cache pages, called\n            <jats:italic>mini-pages.<\/jats:italic>\n            Bf-Tree uses this in-memory buffer pool to support efficient record-level caching, buffering recent updates, caching range gaps, as well as mirrors of disk pages when needed. We implement a fully featured and modern Bf-Tree in Rust with 13k lines of code, and show that Bf-Tree is 2.5\u00d7 faster than RocksDB (LSM-Tree) for scan operations, 6\u00d7 faster than a B-Tree for write operations, and 2\u00d7 faster than both B-Trees and LSM-Trees for point lookups. We believe these results firmly establish a new standard for database storage engines of the future.\n          <\/jats:p>","DOI":"10.14778\/3681954.3682012","type":"journal-article","created":{"date-parts":[[2024,8,30]],"date-time":"2024-08-30T16:23:36Z","timestamp":1725035016000},"page":"3442-3455","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Bf-Tree: A Modern Read-Write-Optimized Concurrent Larger-Than-Memory Range Index"],"prefix":"10.14778","volume":"17","author":[{"given":"Xiangpeng","family":"Hao","sequence":"first","affiliation":[{"name":"University of Wisconsin-Madison"}]},{"given":"Badrish","family":"Chandramouli","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]}],"member":"320","published-online":{"date-parts":[[2024,8,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3187009.3164147"},{"key":"e_1_2_1_2_1","unstructured":"Jens Axboe. 2019. Efficient IO with io_uring. https:\/\/kernel.dk\/io_uring.pdf."},{"key":"e_1_2_1_3_1","volume-title":"An introduction to B-trees and write-optimization. login","author":"Bender Michael A","year":"2015","unstructured":"Michael A Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C Kuszmaul, Donald E Porter, Jun Yuan, and Yang Zhan. 2015. An introduction to B-trees and write-optimization. login; magazine 40, 5 (2015)."},{"key":"e_1_2_1_4_1","volume-title":"An introduction to B-trees and write-optimization. login","author":"Bender Michael A","year":"2015","unstructured":"Michael A Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C Kuszmaul, Donald E Porter, Jun Yuan, and Yang Zhan. 2015. An introduction to B-trees and write-optimization. login; magazine 40, 5 (2015)."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477132.3483540"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/3386691.3386712"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583143"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196898"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3236227"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1365815.1365816"},{"key":"e_1_2_1_11_1","volume-title":"2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Conway Alexander","year":"2020","unstructured":"Alexander Conway, Abhishek Gupta, Vijay Chidambaram, Martin Farach-Colton, Richard Spillane, Amy Tai, and Rob Johnson. 2020. {SplinterDB}: Closing the bandwidth gap for {NVMe}{Key-Value} stores. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). 49--63."},{"key":"e_1_2_1_12_1","volume-title":"CIDR 2022, Conference on Innovative Data Systems Research.","author":"Crotty Andrew","year":"2022","unstructured":"Andrew Crotty, Viktor Leis, and Andrew Pavlo. 2022. Are You Sure You Want to Use MMAP in Your Database Management System?. In CIDR 2022, Conference on Innovative Data Systems Research."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457273"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2011.44"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556575"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3483840"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the USENIX Annual Technical Conference (ATC). 1--14","author":"Duplyakin Dmitry","year":"2019","unstructured":"Dmitry Duplyakin, Robert Ricci, Aleksander Maricq, Gary Wong, Jonathon Duerig, Eric Eide, Leigh Stoller, Mike Hibler, David Johnson, Kirk Webb, Aditya Akella, Kuangching Wang, Glenn Ricart, Larry Landweber, Chip Elliott, Michael Zink, Emmanuel Cecchet, Snigdhaswin Kar, and Prabodh Mishra. 2019. The Design and Operation of CloudLab. In Proceedings of the USENIX Annual Technical Conference (ATC). 1--14. https:\/\/www.flux.utah.edu\/paper\/duplyakin-atc19"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732967.2732968"},{"key":"e_1_2_1_19_1","volume-title":"Proc. of the bsdcan conference, ottawa, canada.","author":"Evans Jason","year":"2006","unstructured":"Jason Evans. 2006. A scalable concurrent malloc (3) implementation for FreeBSD. In Proc. of the bsdcan conference, ottawa, canada."},{"key":"e_1_2_1_20_1","unstructured":"Facebook. 2021. RocksDB Bloom Filter. https:\/\/github.com\/facebook\/rocksdb\/wiki\/RocksDB-Bloom-Filter. Accessed: 2024-02-01."},{"key":"e_1_2_1_21_1","unstructured":"Facebook. 2023. Block Cache. https:\/\/github.com\/facebook\/rocksdb\/wiki\/Block-Cache. Accessed: 2024-02-01."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554842"},{"key":"e_1_2_1_23_1","unstructured":"Google. 2021. LevelDB - A fast key-value storage library written at Google. https:\/\/github.com\/google\/leveldb. GitHub repository."},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Goetz Graefe et al. 2011. Modern B-tree techniques. Foundations and Trends\u00ae in Databases 3 4 (2011) 203--402.","DOI":"10.1561\/1900000028"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2909476"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/3598581.3598584"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639286"},{"key":"e_1_2_1_28_1","volume-title":"Evaluating Persistent Memory Range Indexes: Part Two [Extended Version]. arXiv preprint arXiv:2201.13047","author":"He Yuliang","year":"2022","unstructured":"Yuliang He, Duo Lu, Kaisong Huang, and Tianzheng Wang. 2022. Evaluating Persistent Memory Range Indexes: Part Two [Extended Version]. arXiv preprint arXiv:2201.13047 (2022)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2019.2936273"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517884"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415535"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1516360.1516365"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the ACM on Programming Languages 2, POPL","author":"Jung Ralf","year":"2017","unstructured":"Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2017. RustBelt: Securing the foundations of the Rust programming language. Proceedings of the ACM on Programming Languages 2, POPL (2017), 1--34."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597635.3598029"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589327"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882905"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611495"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-34175-6_13"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2018.00026"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544812"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2933349.2933352"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/3372716.3372728"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544834"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3533737.3535091"},{"key":"e_1_2_1_46_1","first-page":"1","article-title":"Pea Hash: A Performant Extendible Adaptive Hashing Index","volume":"1","author":"Liu Zhuoxuan","year":"2023","unstructured":"Zhuoxuan Liu and Shimin Chen. 2023. Pea Hash: A Performant Extendible Adaptive Hashing Index. Proceedings of the ACM on Management of Data 1, 1 (2023), 1--25.","journal-title":"Proceedings of the ACM on Management of Data"},{"key":"e_1_2_1_47_1","volume-title":"Umar Farooq Minhas, and Tianzheng Wang","author":"Lu Baotong","year":"2021","unstructured":"Baotong Lu, Jialin Ding, Eric Lo, Umar Farooq Minhas, and Tianzheng Wang. 2021. APEX: A high-performance learned index on persistent memory. arXiv preprint arXiv:2105.00683 (2021)."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3033273"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.14778\/3430915.3430916"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397232"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/128765.128770"},{"key":"e_1_2_1_52_1","volume-title":"PostgreSQL: introduction and concepts","author":"Momjian Bruce","unstructured":"Bruce Momjian. 2001. PostgreSQL: introduction and concepts. Vol. 192. Addison-Wesley New York."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/s002360050048"},{"key":"e_1_2_1_54_1","volume-title":"LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression. In 20th USENIX Conference on File and Storage Technologies (FAST 22)","author":"Qiao Yifan","year":"2022","unstructured":"Yifan Qiao, Xubin Chen, Ning Zheng, Jiangpeng Li, Yang Liu, and Tong Zhang. 2022. Closing the B+-tree vs. LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression. In 20th USENIX Conference on File and Storage Technologies (FAST 22). USENIX Association, Santa Clara, CA, 69--82. https:\/\/www.usenix.org\/conference\/fast22\/presentation\/qiao"},{"key":"e_1_2_1_55_1","unstructured":"RocksDB Team. 2021. RocksDB Secondary Cache. https:\/\/rocksdb.org\/blog\/2021\/05\/27\/rocksdb-secondary-cache.html"},{"key":"e_1_2_1_56_1","unstructured":"Samsung Semiconductor. 2024. PM9A3 (MZQL2960HCJR-00A07) - Datacenter SSD. https:\/\/semiconductor.samsung.com\/us\/ssd\/datacenter-ssd\/pm9a3\/mzql2960hcjr-00a07\/ Accessed: 2024-05-27."},{"key":"e_1_2_1_57_1","volume-title":"Continuous fuzzing with libfuzzer and addresssanitizer. In 2016 IEEE Cybersecurity Development (SecDev)","author":"Serebryany Kosta","unstructured":"Kosta Serebryany. 2016. Continuous fuzzing with libfuzzer and addresssanitizer. In 2016 IEEE Cybersecurity Development (SecDev). IEEE, 157--157."},{"key":"e_1_2_1_58_1","unstructured":"Konstantin Serebryany Derek Bruening Alexander Potapenko and Dmitriy Vyukov. 2012. {AddressSanitizer}: A fast address sanity checker. In 2012 USENIX annual technical conference (USENIX ATC 12). 309--318."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056101"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3517824"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196895"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3121133"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.5555\/3357062.3357089"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/CloudCom.2017.14"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.14778\/3561261.3561270"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.14778\/3514061.3514066"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.14778\/3565816.3565826"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.85"},{"key":"e_1_2_1_69_1","volume-title":"Two is Better Than One: The Case for 2-Tree for Skewed Data Sets. memory 11","author":"Zhou Xinjing","year":"2023","unstructured":"Xinjing Zhou, Xiangyao Yu, Goetz Graefe, and Michael Stonebraker. 2023. Two is Better Than One: The Case for 2-Tree for Skewed Data Sets. memory 11 (2023), 13."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3681954.3682012","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T18:32:44Z","timestamp":1725474764000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3681954.3682012"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7]]},"references-count":68,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,7]]}},"alternative-id":["10.14778\/3681954.3682012"],"URL":"https:\/\/doi.org\/10.14778\/3681954.3682012","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,7]]},"assertion":[{"value":"2024-08-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}