{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T06:26:29Z","timestamp":1778048789791,"version":"3.51.4"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>In relational query processing, there are generally two choices for access paths when performing a predicate lookup for which no clustered index is available. One option is to use an unclustered index. Another is to perform a complete sequential scan of the table. Many analytical workloads do not benefit from the availability of unclustered indexes; the cost of random disk I\/O becomes prohibitive for all but the most selective queries.<\/jats:p>\n          <jats:p>It has been observed that a secondary index on an unclustered attribute can perform well under certain conditions if the unclustered attribute is correlated with a clustered index attribute [4]. The clustered index will co-locate values and the correlation will localize access through the unclustered attribute to a subset of the pages. In this paper, we show that in a real application (SDSS) and widely used benchmark (TPC-H), there exist many cases of attribute correlation that can be exploited to accelerate queries. We also discuss a tool that can automatically suggest useful pairs of correlated attributes. It does so using an analytical cost model that we developed, which is novel in its awareness of the effects of clustering and correlation.<\/jats:p>\n          <jats:p>Furthermore, we propose a data structure called a Correlation Map (CM) that expresses the mapping between the correlated attributes, acting much like a secondary index. The paper also discusses how bucketing on the domains of both attributes in the correlated attribute pair can dramatically reduce the size of the CM to be potentially orders of magnitude smaller than that of a secondary B+Tree index. This reduction in size allows us to create a large number of CMs that improve performance for a wide range of queries. The small size also reduces maintenance costs as we demonstrate experimentally.<\/jats:p>","DOI":"10.14778\/1687627.1687765","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1222-1233","source":"Crossref","is-referenced-by-count":30,"title":["Correlation maps"],"prefix":"10.14778","volume":"2","author":[{"given":"Hideaki","family":"Kimura","sequence":"first","affiliation":[{"name":"Brown University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"George","family":"Huo","sequence":"additional","affiliation":[{"name":"Google, Inc."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Rasin","sequence":"additional","affiliation":[{"name":"Brown University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Samuel","family":"Madden","sequence":"additional","affiliation":[{"name":"MIT CSAIL"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stanley B.","family":"Zdonik","sequence":"additional","affiliation":[{"name":"Brown University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2009,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Optimizing queries that access correlated datetime columns. http:\/\/msdn.microsoft.com\/en-us\/library\/ms177416(SQL.90).aspx.  Optimizing queries that access correlated datetime columns. http:\/\/msdn.microsoft.com\/en-us\/library\/ms177416(SQL.90).aspx."},{"key":"e_1_2_1_2_1","unstructured":"Postgresql 8.1.17 documentation: pg_stats. http:\/\/www.postgresql.org\/docs\/8.1\/static\/view-pg-stats.html.  Postgresql 8.1.17 documentation: pg_stats. http:\/\/www.postgresql.org\/docs\/8.1\/static\/view-pg-stats.html."},{"key":"e_1_2_1_3_1","unstructured":"Using extended statistics to optimize multi-column relationships and function-based statistics. http:\/\/www.oracle.com\/technology\/obe\/11gr1_db\/perform\/multistats\/multicolstats.htm.  Using extended statistics to optimize multi-column relationships and function-based statistics. http:\/\/www.oracle.com\/technology\/obe\/11gr1_db\/perform\/multistats\/multicolstats.htm."},{"key":"e_1_2_1_4_1","unstructured":"P. Brown and P. Haas. BHUNT: Automatic Discovery of Fuzzy Algebraic Constraints in Relational Data. In VLDB'03.   P. Brown and P. Haas. BHUNT: Automatic Discovery of Fuzzy Algebraic Constraints in Relational Data. In VLDB'03 ."},{"key":"e_1_2_1_5_1","volume-title":"http:\/\/developer.ebay.com\/products\/trading","author":"API.","year":"2008","unstructured":"eBay Developer API. http:\/\/developer.ebay.com\/products\/trading , 2008 . eBay Developer API. http:\/\/developer.ebay.com\/products\/trading, 2008."},{"key":"e_1_2_1_6_1","unstructured":"P. Gibbons. Distinct sampling for highly-accurate answers to distinct values queries and event reports. In VLDB'01.   P. Gibbons. Distinct sampling for highly-accurate answers to distinct values queries and event reports. In VLDB'01 ."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/645484.656546"},{"key":"e_1_2_1_8_1","unstructured":"M. Hammer and S. Zdonik. Knowledge based query processing. In VLDB'80.   M. Hammer and S. Zdonik. Knowledge based query processing. In VLDB'80 ."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007641"},{"key":"e_1_2_1_10_1","volume-title":"JASA'93","author":"Bunge J.","unstructured":"J. Bunge Estimating the number of species: A review . JASA'93 . J. Bunge et al. Estimating the number of species: A review. JASA'93."},{"key":"e_1_2_1_11_1","volume-title":"Data mining the sdss skyserver database","author":"Gray J.","year":"2002","unstructured":"J. Gray Data mining the sdss skyserver database , 2002 . J. Gray et al. Data mining the sdss skyserver database, 2002."},{"key":"e_1_2_1_12_1","volume-title":"VLDB","author":"King J.","year":"1981","unstructured":"J. King . Quist : A system for semantic query optimization in relational databases . In VLDB , 1981 . J. King. Quist: A system for semantic query optimization in relational databases. In VLDB, 1981."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/335168.335230"},{"key":"e_1_2_1_14_1","volume-title":"Random sampling from databases - a survey","author":"Olken F.","year":"1995","unstructured":"F. Olken and D. Rotem . Random sampling from databases - a survey , 1995 . F. Olken and D. Rotem. Random sampling from databases - a survey, 1995."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375749"},{"key":"e_1_2_1_16_1","unstructured":"PostgreSQL home page. http:\/\/www.postgresql.org\/.  PostgreSQL home page. http:\/\/www.postgresql.org\/."},{"key":"e_1_2_1_17_1","unstructured":"Q. Cheng etal Implementation of two semantic query optimization techniques in the DB2 universal database. In VLDB'99.   Q. Cheng et al. Implementation of two semantic query optimization techniques in the DB2 universal database. In VLDB'99 ."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/320521.320530"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/38713.38736"},{"key":"e_1_2_1_20_1","unstructured":"T. Apaydin etal Approximate encoding for direct access and query processing over compressed bitmaps. In VLDB'06.   T. Apaydin et al. Approximate encoding for direct access and query processing over compressed bitmaps. In VLDB'06 ."},{"key":"e_1_2_1_21_1","volume-title":"TODS'90","author":"Chakravarthy U.","unstructured":"U. Chakravarthy Automatic generation of production rules for integrity maintenance . TODS'90 . U. Chakravarthy et al. Automatic generation of production rules for integrity maintenance. TODS'90."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497572"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1687627.1687765","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:36:27Z","timestamp":1672227387000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1687627.1687765"}},"subtitle":["a compressed access method for exploiting soft functional dependencies"],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":22,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.14778\/1687627.1687765"],"URL":"https:\/\/doi.org\/10.14778\/1687627.1687765","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2009,8]]}}}