{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,11]],"date-time":"2026-06-11T16:08:00Z","timestamp":1781194080891,"version":"3.54.1"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:p>Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art \"traditional\" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.<\/jats:p>","DOI":"10.14778\/3421424.3421425","type":"journal-article","created":{"date-parts":[[2020,10,28]],"date-time":"2020-10-28T01:15:11Z","timestamp":1603847711000},"page":"1-13","source":"Crossref","is-referenced-by-count":127,"title":["Benchmarking learned indexes"],"prefix":"10.14778","volume":"14","author":[{"given":"Ryan","family":"Marcus","sequence":"first","affiliation":[{"name":"MIT CSAIL \/ Intel Labs"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Andreas","family":"Kipf","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alexander","family":"van Renen","sequence":"additional","affiliation":[{"name":"TUM"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mihail","family":"Stoian","sequence":"additional","affiliation":[{"name":"TUM"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sanchit","family":"Misra","sequence":"additional","affiliation":[{"name":"Intel Labs"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alfons","family":"Kemper","sequence":"additional","affiliation":[{"name":"TUM"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Thomas","family":"Neumann","sequence":"additional","affiliation":[{"name":"TUM"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tim","family":"Kraska","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2020,10,27]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"C++ lower_bound http:\/\/cplusplus.com\/reference\/algorithm\/lower_bound\/.  C++ lower_bound http:\/\/cplusplus.com\/reference\/algorithm\/lower_bound\/."},{"key":"e_1_2_1_2_1","unstructured":"RobinMap https:\/\/github.com\/Tessil\/robin-map.  RobinMap https:\/\/github.com\/Tessil\/robin-map."},{"key":"e_1_2_1_3_1","unstructured":"RocksDB https:\/\/rocksdb.org\/.  RocksDB https:\/\/rocksdb.org\/."},{"key":"e_1_2_1_4_1","unstructured":"Searching on sorted data benchmark https:\/\/learned.systems\/sosd.  Searching on sorted data benchmark https:\/\/learned.systems\/sosd."},{"key":"e_1_2_1_5_1","unstructured":"SIMD Cuckoo Hash https:\/\/github.com\/stanford-futuredata\/index-baselines.  SIMD Cuckoo Hash https:\/\/github.com\/stanford-futuredata\/index-baselines."},{"key":"e_1_2_1_6_1","unstructured":"STX B+ Tree https:\/\/panthema.net\/2007\/stx-btree\/.  STX B+ Tree https:\/\/panthema.net\/2007\/stx-btree\/."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002974.2002975"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(76)90071-5"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196896"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389711"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-43883-8_2"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/3389133.3389135"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3319860"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1140402.1140409"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807206"},{"key":"e_1_2_1_16_1","volume-title":"ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19","author":"Kipf A.","year":"2019","unstructured":"A. Kipf , R. Marcus , A. van Renen , M. Stoian , A. Kemper , T. Kraska , and T. Neumann . SOSD: A Benchmark for Learned Indexes . In ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19 , Dec. 2019 . A. Kipf, R. Marcus, A. van Renen, M. Stoian, A. Kemper, T. Kraska, and T. Neumann. SOSD: A Benchmark for Learned Indexes. In ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19, Dec. 2019."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3401071.3401659"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196909"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544812"},{"issue":"1","key":"e_1_2_1_20_1","first-page":"393","article-title":"LSM-based storage techniques: A survey","volume":"29","author":"Luo C.","year":"2020","unstructured":"C. Luo and M. J. Carey . LSM-based storage techniques: A survey . PVLDB , 29 ( 1 ): 393 -- 418 , Jan. 2020 . C. Luo and M. J. Carey. LSM-based storage techniques: A survey. PVLDB, 29(1):393--418, Jan. 2020.","journal-title":"PVLDB"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3384706"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/364995.365000"},{"key":"e_1_2_1_23_1","volume-title":"ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19","author":"Nathan V.","year":"2019","unstructured":"V. Nathan , J. Ding , M. Alizadeh , and T. Kraska . Learning Multi-dimensional Indexing . In ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19 , Dec. 2019 . V. Nathan, J. Ding, M. Alizadeh, and T. Kraska. Learning Multi-dimensional Indexing. In ML for Systems at NeurIPS, MLForSystems @ NeurIPS '19, Dec. 2019."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-70504-8_12"},{"key":"e_1_2_1_25_1","unstructured":"Peter Bailis Kai Sheng Tai Pratiksha Thaker and Matei Zaharia. Don't Throw Out Your Algorithms Book Just Yet: Classical Data Structures That Can Outperform Learned Indexes (blog post) https:\/\/dawn.cs.stanford.edu\/2018\/01\/11\/index-baselines\/ 2018.  Peter Bailis Kai Sheng Tai Pratiksha Thaker and Matei Zaharia. Don't Throw Out Your Algorithms Book Just Yet: Classical Data Structures That Can Outperform Learned Indexes (blog post) https:\/\/dawn.cs.stanford.edu\/2018\/01\/11\/index-baselines\/ 2018."},{"key":"e_1_2_1_26_1","unstructured":"Peter Boncz and Thomas Neumann. The Case for B-Tree Index Structures (blog post) http:\/\/databasearchitects.blogspot.com\/2017\/12\/the-case-for-b-tree-index-structures.html 2017.  Peter Boncz and Thomas Neumann. The Case for B-Tree Index Structures (blog post) http:\/\/databasearchitects.blogspot.com\/2017\/12\/the-case-for-b-tree-index-structures.html 2017."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850585"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236205"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3300075"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303955"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-014-0355-0"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196931"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3421424.3421425","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:18:16Z","timestamp":1672226296000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3421424.3421425"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["10.14778\/3421424.3421425"],"URL":"https:\/\/doi.org\/10.14778\/3421424.3421425","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2020,9]]}}}