{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T08:54:07Z","timestamp":1775638447559,"version":"3.50.1"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2015,11]]},"abstract":"<jats:p>Hashing is a solved problem. It allows us to get constant time access for lookups. Hashing is also simple. It is safe to use an arbitrary method as a black box and expect good performance, and optimizations to hashing can only improve it by a negligible delta. Why are all of the previous statements plain wrong? That is what this paper is about. In this paper we thoroughly study hashing for integer keys and carefully analyze the most common hashing methods in a five-dimensional requirements space: (1) data-distribution, (2) load factor, (3) dataset size, (4) read\/write-ratio, and (5) un\/successful-ratio. Each point in that design space may potentially suggest a different hashing scheme, and additionally also a different hash function. We show that a right or wrong decision in picking the right hashing scheme and hash function combination may lead to significant difference in performance. To substantiate this claim, we carefully analyze two additional dimensions: (6) five representative hashing schemes (which includes an improved variant of Robin Hood hashing), (7) four important classes of hash functions widely used today. That is, we consider 20 different combinations in total. Finally, we also provide a glimpse about the effect of table memory layout and the use of SIMD instructions. Our study clearly indicates that picking the right combination may have considerable impact on insert and lookup performance, as well as memory footprint. A major conclusion of our work is that hashing should be considered a white box before blindly using it in applications, such as query processing. Finally, we also provide a strong guideline about when to use which hashing method.<\/jats:p>","DOI":"10.14778\/2850583.2850585","type":"journal-article","created":{"date-parts":[[2016,2,1]],"date-time":"2016-02-01T14:10:31Z","timestamp":1454335831000},"page":"96-107","source":"Crossref","is-referenced-by-count":52,"title":["A seven-dimensional analysis of hashing methods and its implications on query processing"],"prefix":"10.14778","volume":"9","author":[{"given":"Stefan","family":"Richter","sequence":"first","affiliation":[{"name":"Saarland University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Victor","family":"Alvarez","sequence":"additional","affiliation":[{"name":"TU Braunschweig"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jens","family":"Dittrich","sequence":"additional","affiliation":[{"name":"Saarland University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113370"},{"key":"e_1_2_1_2_1","unstructured":"A. Appleby. Murmurhash3 64-bit finalizer. Version 19\/02\/15. https:\/\/code.google.com\/p\/smhasher\/wiki\/MurmurHash3.  A. Appleby. Murmurhash3 64-bit finalizer. Version 19\/02\/15. https:\/\/code.google.com\/p\/smhasher\/wiki\/MurmurHash3."},{"key":"e_1_2_1_3_1","volume-title":"Main-memory hash joins on modern processor architectures","author":"Balkesen C.","year":"2014","unstructured":"C. Balkesen , J. Teubner , G. Alonso , and M. Ozsu . Main-memory hash joins on modern processor architectures . IEEE TKDE , 2014 . C. Balkesen, J. Teubner, G. Alonso, and M. Ozsu. Main-memory hash joins on modern processor architectures. IEEE TKDE, 2014."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735496.2735499"},{"key":"e_1_2_1_6_1","volume-title":"Introduction to Algorithms","author":"Cormen T. H.","year":"1990","unstructured":"T. H. Cormen , C. E. Leiserson , and R. L. Rivest . Introduction to Algorithms . MIT Press , Cambridge, MA, USA , 1990 . T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, USA, 1990."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/646511.695324"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.1997.0873"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-70575-8_32"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1496770.1496857"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2151171.2151174"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-004-1195-x"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807206"},{"key":"e_1_2_1_14_1","volume-title":"Notes on \"open\" addressing. Unpublished Memorandum","author":"Knuth D.","year":"1963","unstructured":"D. Knuth . Notes on \"open\" addressing. Unpublished Memorandum , 1963 . D. Knuth. Notes on \"open\" addressing. Unpublished Memorandum, 1963."},{"key":"e_1_2_1_15_1","volume-title":"Addison Wesley","author":"Knuth D. E.","year":"1998","unstructured":"D. E. Knuth . The Art of Computer Programming, Volume 3: (2nd Ed.) Sorting and Searching . Addison Wesley , 1998 . D. E. Knuth. The Art of Computer Programming, Volume 3: (2nd Ed.) Sorting and Searching. Addison Wesley, 1998."},{"key":"e_1_2_1_16_1","volume-title":"LNCS","author":"Lang H.","year":"2015","unstructured":"H. Lang , V. Leis , M.-C. Albutiu , T. Neumann , and A. Kemper . Massively Parallel NUMA-Aware Hash Joins, pages 3--14 . LNCS , 2015 . H. Lang, V. Leis, M.-C. Albutiu, T. Neumann, and A. Kemper. Massively Parallel NUMA-Aware Hash Joins, pages 3--14. LNCS, 2015."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544812"},{"key":"e_1_2_1_18_1","first-page":"1","volume-title":"Some Open Questions Related to Cuckoo Hashing","author":"Mitzenmacher M.","year":"2009","unstructured":"M. Mitzenmacher . Some Open Questions Related to Cuckoo Hashing , volume 5757 , pages 1 -- 10 . LNCS , 2009 . M. Mitzenmacher. Some Open Questions Related to Cuckoo Hashing, volume 5757, pages 1--10. LNCS, 2009."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.002"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2220357.2220361"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1565694.1565705"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/1496770.1496842"},{"key":"e_1_2_1_23_1","volume-title":"University of Waterloo","author":"Viola A.","year":"1995","unstructured":"A. Viola . Analysis of hashing algorithms and a new mathematical transform . University of Waterloo , 1995 . A. Viola. Analysis of hashing algorithms and a new mathematical transform. University of Waterloo, 1995."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2850583.2850585","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:17:24Z","timestamp":1672222644000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2850583.2850585"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,11]]},"references-count":22,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,11]]}},"alternative-id":["10.14778\/2850583.2850585"],"URL":"https:\/\/doi.org\/10.14778\/2850583.2850585","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2015,11]]}}}