{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T09:24:21Z","timestamp":1770542661613,"version":"3.49.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,5,29]],"date-time":"2024-05-29T00:00:00Z","timestamp":1716940800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100006374","name":"NSF","doi-asserted-by":"publisher","award":["2337806"],"award-info":[{"award-number":["2337806"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,5,29]]},"abstract":"<jats:p>B-trees are widely recognized as one of the most important index structures in database systems, providing efficient query processing capabilities. Over the past few decades, many techniques have been developed to enhance the efficiency of B-trees from various perspectives. Among them, B-tree compression is an important technique introduced as early as the 1970s to improve both space efficiency and query performance. Since then, several B-tree compression techniques have been developed. However, to our surprise, we have found that these B-tree compression techniques were never compared against each other in prior works. Consequently, many important questions remain unanswered, such as whether B-tree compression is truly effective or not. If it is effective, under what scenarios and which B-tree compression methods should be employed? In this paper, we conduct the first experimental evaluation of seven widely used B-tree compression techniques using both synthetic and real datasets. Based on our evaluation, we present lessons and insights that can be leveraged to guide system design decisions in modern databases regarding the use of B-tree compression.<\/jats:p>","DOI":"10.1145\/3654972","type":"journal-article","created":{"date-parts":[[2024,5,30]],"date-time":"2024-05-30T09:44:53Z","timestamp":1717062293000},"page":"1-25","source":"Crossref","is-referenced-by-count":5,"title":["Revisiting B-tree Compression: An Experimental Study"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-2171-5036","authenticated-orcid":false,"given":"Chuqing","family":"Gao","sequence":"first","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3932-6011","authenticated-orcid":false,"given":"Shreya","family":"Ballijepalli","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3039-1175","authenticated-orcid":false,"given":"Jianguo","family":"Wang","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,5,30]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"2007. WEBSPAM-UK2007 Dataset. https:\/\/chato.cl\/webspam\/datasets\/uk2007\/"},{"key":"e_1_2_2_2_1","unstructured":"2008. SNAP Memetracker Dataset. https:\/\/www.kaggle.com\/datasets\/snap\/snap-memetracker"},{"key":"e_1_2_2_3_1","unstructured":"2022. The Default Page Size Change of SQLite 3.12.0. https:\/\/www.sqlite.org\/pgszchng2016.html"},{"key":"e_1_2_2_4_1","unstructured":"2022. Source Code of WiredTiger's B-Tree Implementation. https:\/\/github.com\/wiredtiger\/wiredtiger\/tree\/develop\/ src\/btree"},{"key":"e_1_2_2_5_1","unstructured":"2023. CREATE INDEX Statement in SAP HANA (https:\/\/help.sap.com\/docs\/SAP_HANA_PLATFORM\/ 4fe29514fd584807ac9f2a04f6754767\/20d44b4175191014a940afff4b47c7ea.html)."},{"key":"e_1_2_2_6_1","unstructured":"2023. Database Page Layout in PostgreSQL 16. https:\/\/www.postgresql.org\/docs\/current\/storage-page-layout.html"},{"key":"e_1_2_2_7_1","unstructured":"2023. MyISAM Source Code in MySQL. https:\/\/github.com\/mysql\/mysql-server\/tree\/ a246bad76b9271cb4333634e954040a970222e0a\/storage\/myisam"},{"key":"e_1_2_2_8_1","unstructured":"2023. MySQL Reference Manual. https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/key-space.html#: :text=Prefix% 20compression%20is%20used%20on when%20you%20create%20the%20table."},{"key":"e_1_2_2_9_1","unstructured":"2023. TPC-H Benchmark. https:\/\/www.tpc.org\/tpch\/"},{"key":"e_1_2_2_10_1","unstructured":"2023. WiredTiger Documentation. https:\/\/source.wiredtiger.com\/11.1.0\/file_formats.html#file_formats_compression"},{"key":"e_1_2_2_11_1","doi-asserted-by":"crossref","unstructured":"Daniel J. Abadi Samuel Madden and Miguel Ferreira. 2006. Integrating compression and execution in column-oriented database systems. In SIGMOD. 671--682.","DOI":"10.1145\/1142473.1142548"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780050031"},{"key":"e_1_2_2_13_1","doi-asserted-by":"crossref","unstructured":"G. Antoshenkov D. Lomet and J. Murray. 1996. Order Preserving String Compression. In ICDE. 655--663.","DOI":"10.1109\/ICDE.1996.492216"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00288683"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/320521.320530"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687573"},{"key":"e_1_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Carsten Binnig Stefan Hildenbrand and Franz F\u00e4rber. 2009. Dictionary-based Order-preserving String Compression for Main Memory Column Stores. In SIGMOD. 283--296.","DOI":"10.1145\/1559845.1559877"},{"key":"e_1_2_2_18_1","doi-asserted-by":"crossref","unstructured":"Philip Bohannon Peter McIlroy and Rajeev Rastogi. 2001. Main-Memory Index Structures with Fixed-Size Partial Keys. In SIGMOD. 163--174.","DOI":"10.1145\/375663.375681"},{"key":"e_1_2_2_19_1","unstructured":"Lars Breddemann. 2020. What is CPB-Tree in SAP HANA? https:\/\/www.lbreddemann.org\/what-is-cpb-tree-in-saphana\/"},{"key":"e_1_2_2_20_1","volume-title":"Bowman","author":"Bumbulis Peter","year":"2002","unstructured":"Peter Bumbulis and Ivan T. Bowman. 2002. A Compact B-tree. In SIGMOD. 533--541."},{"key":"e_1_2_2_21_1","volume-title":"Converged Index: The Secret Sauce Behind Rockset's Fast Queries. https:\/\/rockset.com\/blog\/ converged-indexing-the-secret-sauce-behind-rocksets-fast-queries\/","author":"Canadi Igor","year":"2019","unstructured":"Igor Canadi. 2019. Converged Index: The Secret Sauce Behind Rockset's Fast Queries. https:\/\/rockset.com\/blog\/ converged-indexing-the-secret-sauce-behind-rocksets-fast-queries\/"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/3648160.3648180"},{"key":"e_1_2_2_23_1","doi-asserted-by":"crossref","unstructured":"Zhiyuan Chen Johannes Gehrke and Flip Korn. 2001. Query Optimization In Compressed Database Systems. In SIGMOD. 271--282.","DOI":"10.1145\/375663.375692"},{"key":"e_1_2_2_24_1","unstructured":"Gregg Christman. 2022. Compression Features Included with Oracle Database Enterprise Edition. https:\/\/blogs.oracle. com\/dbstorage\/post\/compression-features-included-with-oracle-database-enterprise-edition"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/356770.356776"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463710"},{"key":"e_1_2_2_27_1","volume-title":"Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David B. Lomet, and Tim Kraska.","author":"Ding Jialin","year":"2020","unstructured":"Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David B. Lomet, and Tim Kraska. 2020. ALEX: An Updatable Adaptive Learned Index. In SIGMOD. 969--984."},{"key":"e_1_2_2_28_1","volume-title":"A Survey of B-tree Locking Techniques. TODS 35, 3","author":"Graefe Goetz","year":"2010","unstructured":"Goetz Graefe. 2010. A Survey of B-tree Locking Techniques. TODS 35, 3 (2010), 16:1--16:26."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000028"},{"key":"e_1_2_2_30_1","doi-asserted-by":"crossref","unstructured":"Goetz Graefe and Per-\u00c5ke Larson. 2001. B-Tree Indexes and CPU Caches. In ICDE. 349--358.","DOI":"10.1109\/ICDE.2001.914847"},{"key":"e_1_2_2_31_1","volume-title":"Patel","author":"Hankins Richard A.","year":"2003","unstructured":"Richard A. Hankins and Jignesh M. Patel. 2003. Effect of Node Size on the Performance of Cache-conscious B-trees. In SIGMETRICS. 283--294."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454211"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807206"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2016.07.004"},{"key":"e_1_2_2_35_1","volume-title":"Jeffrey Dean, and Neoklis Polyzotis.","author":"Kraska Tim","year":"2018","unstructured":"Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In SIGMOD. 489--504."},{"key":"e_1_2_2_36_1","doi-asserted-by":"crossref","unstructured":"Harald Lang Alexander Beischl Viktor Leis Peter A. Boncz Thomas Neumann and Alfons Kemper. 2020. Tree-Encoded Bitmaps. In SIGMOD. 937--967.","DOI":"10.1145\/3318464.3380588"},{"key":"e_1_2_2_37_1","doi-asserted-by":"crossref","unstructured":"Viktor Leis Alfons Kemper and Thomas Neumann. 2013. The Adaptive Radix Tree: ARTful Indexing for Main-memory Databases. In ICDE. 38--49.","DOI":"10.1109\/ICDE.2013.6544812"},{"key":"e_1_2_2_38_1","doi-asserted-by":"crossref","unstructured":"Justin J. Levandoski David B. Lomet and Sudipta Sengupta. 2013. The Bw-Tree: A B-tree for New Hardware Platforms. In ICDE. 302--313.","DOI":"10.1109\/ICDE.2013.6544834"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476305"},{"key":"e_1_2_2_40_1","volume-title":"Elmore","author":"Liu Chunwei","year":"2019","unstructured":"Chunwei Liu, McKade Umbenhower, Hao Jiang, Pranav Subramaniam, Jihong Ma, and Aaron J. Elmore. 2019. Mostly Order Preserving Dictionaries. In ICDE. 1214--1225."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/603867.603878"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824078"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3090163.3090164"},{"key":"e_1_2_2_44_1","doi-asserted-by":"crossref","unstructured":"Jianguo Wang Chunbin Lin Yannis Papakonstantinou and Steven Swanson. 2017. An Experimental Study of Bitmap Compression vs. Inverted List Compression. In SIGMOD. 993--1008.","DOI":"10.1145\/3035918.3064007"},{"key":"e_1_2_2_45_1","volume-title":"Andersen","author":"Wang Ziqi","year":"2018","unstructured":"Ziqi Wang, Andrew Pavlo, Hyeontaek Lim, Viktor Leis, Huanchen Zhang, Michael Kaminsky, and David G. Andersen. 2018. Building a Bw-Tree Takes More Than Just Buzz Words. In SIGMOD. 473--488."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1132863.1132864"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611502"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352124"},{"key":"e_1_2_2_49_1","doi-asserted-by":"crossref","unstructured":"Feng Zhang Weitao Wan Chenyang Zhang Jidong Zhai Yunpeng Chai Haixiang Li and Xiaoyong Du. 2022. CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases. In SIGMOD. 1655--1669.","DOI":"10.1145\/3514221.3526130"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-020-00636-3"},{"key":"e_1_2_2_51_1","doi-asserted-by":"crossref","unstructured":"Huanchen Zhang Xiaoxuan Liu David G. Andersen Michael Kaminsky Kimberly Keeton and Andrew Pavlo. 2020. Order-Preserving Key Compression for In-Memory Search Trees. In SIGMOD. 1601--1615.","DOI":"10.1145\/3318464.3380583"},{"key":"e_1_2_2_52_1","volume-title":"Carsten Binnig, Rodrigo Fonseca, and Tim Kraska.","author":"Ziegler Tobias","year":"2019","unstructured":"Tobias Ziegler, Sumukha Tumkur Vani, Carsten Binnig, Rodrigo Fonseca, and Tim Kraska. 2019. Designing Distributed Tree-based Index Structures for Fast RDMA-capable Networks. In SIGMOD. 741--758."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3654972","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3654972","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T14:37:18Z","timestamp":1755787038000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3654972"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,29]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,5,29]]}},"alternative-id":["10.1145\/3654972"],"URL":"https:\/\/doi.org\/10.1145\/3654972","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,29]]}}}