{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T20:27:39Z","timestamp":1767904059929,"version":"3.49.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2005,12,1]],"date-time":"2005-12-01T00:00:00Z","timestamp":1133395200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2005,12]]},"abstract":"<jats:p>Database systems use precomputed synopses of data to estimate the cost of alternative plans during query optimization. A number of alternative synopsis structures have been proposed, but histograms are by far the most commonly used. While histograms have proved to be very effective in (cost estimation for) single-table selections, queries with joins have long been seen as a challenge; under a model where histograms are maintained for individual tables, a celebrated result of Ioannidis and Christodoulakis [1991] observes that errors propagate exponentially with the number of joins in a query.In this article, we make two main contributions. First, we study the space complexity of using synopses for query optimization from a novel information-theoretic perspective. In particular, we offer evidence in support of histograms for single-table selections, including an analysis over data distributions known to be common in practice, and illustrate their limitations for join queries. Second, for a broad class of common queries involving joins (specifically, all queries involving only key-foreign key joins) we show that the strategy of storing a small precomputed sample of the database yields probabilistic guarantees that are almost space-optimal, which is an important property if these samples are to be used as database statistics. This is the first such optimality result, to our knowledge, and suggests that precomputed samples might be an effective way to circumvent the error propagation problem for queries with key-foreign key joins. We support this result empirically through an experimental study that demonstrates the effectiveness of precomputed samples, and also shows the increasing difference in the effectiveness of samples versus multidimensional histograms as the number of joins in the query grows.<\/jats:p>","DOI":"10.1145\/1114244.1114251","type":"journal-article","created":{"date-parts":[[2006,5,8]],"date-time":"2006-05-08T16:09:20Z","timestamp":1147104560000},"page":"1102-1127","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Synopses for query optimization"],"prefix":"10.1145","volume":"30","author":[{"given":"Raghav","family":"Kaushik","sequence":"first","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]},{"given":"Jeffrey F.","family":"Naughton","sequence":"additional","affiliation":[{"name":"University of Wisconsin-Madison, Madison, WI"}]},{"given":"Raghu","family":"Ramakrishnan","sequence":"additional","affiliation":[{"name":"University of Wisconsin-Madison, Madison, WI"}]},{"given":"Venkatesan T.","family":"Chakravarthy","sequence":"additional","affiliation":[{"name":"IBM India Research Lab, New Delhi, India"}]}],"member":"320","published-online":{"date-parts":[[2005,12]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304207"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/303976.303978"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/237814.237823"},{"key":"e_1_2_1_5_1","unstructured":"Blake C. and Merz C. 1998. UCI repository of machine learning databases.  Blake C. and Merz C. 1998. UCI repository of machine learning databases."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564722"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375686"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Chakrabarti K.","unstructured":"Chakrabarti , K. , Garofalakis , M. , Rastogi , R. , and Shim , K . 2000. Approximate query answering using wavelets . In Proceedings of the International Conference on Very Large Databases. Chakrabarti, K., Garofalakis, M., Rastogi, R., and Shim, K. 2000. Approximate query answering using wavelets. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/543613.543650"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/275487.275492"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304206"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276343"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/329.318578"},{"key":"e_1_2_1_14_1","volume-title":"Elementary Numerical Analysis: An Algorithmic Approach","author":"Conte S. D.","unstructured":"Conte , S. D. and de Boor , C. 1972. Elementary Numerical Analysis: An Algorithmic Approach . McGraw Hill , New York . Conte, S. D. and de Boor, C. 1972. Elementary Numerical Analysis: An Algorithmic Approach. McGraw Hill, New York."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564699"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Faloutsos C.","unstructured":"Faloutsos , C. and Jagadish , H . 1992. On B-tree indices for skewed distributions . In Proceedings of the International Conference on Very Large Databases. Faloutsos, C. and Jagadish, H. 1992. On B-tree indices for skewed distributions. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_17_1","unstructured":"Gasser T. Engel J. and Seifert B. 1985. Non-parametric density estimation. Ann. Stat.  Gasser T. Engel J. and Seifert B. 1985. Non-parametric density estimation. Ann. Stat."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Gibbons P.","unstructured":"Gibbons , P. , Matias , Y. , and Poosala , V . 1997. Fast incremental maintenance of approximate histograms . In Proceedings of the International Conference on Very Large Databases. Gibbons, P., Matias, Y., and Poosala, V. 1997. Fast incremental maintenance of approximate histograms. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335448"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/130283.130335"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.995539"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1270.1498"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Ioannidis Y. E.","year":"1993","unstructured":"Ioannidis , Y. E. 1993 . Universality of serial histograms . In Proceedings of the International Conference on Very Large Databases. Ioannidis, Y. E. 1993. Universality of serial histograms. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Ioannidis Y. E.","year":"2003","unstructured":"Ioannidis , Y. E. 2003 . The history of histograms . In Proceedings of the International Conference on Very Large Databases. Ioannidis, Y. E. 2003. The history of histograms. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/115790.115835"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/169725.169708"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Jagadish H.","unstructured":"Jagadish , H. , Poosala , V. , Koudas , N. , Sevcik , K. , Muthukrishnan , S. , and Suel , T . 1998. Optimal histograms with quality guarantees . In Proceedings of the International Conference on Very Large Databases. Jagadish, H., Poosala, V., Koudas, N., Sevcik, K., Muthukrishnan, S., and Suel, T. 1998. Optimal histograms with quality guarantees. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/335168.335223"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Kushilevitz E. and Nisan N. 1997. Communication Complexity. Cambridge University Press.   Kushilevitz E. and Nisan N. 1997. Communication Complexity. Cambridge University Press.","DOI":"10.1017\/CBO9780511574948"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/93597.93611"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/50202.50205"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the International Conference on Very Large Databases.","author":"Olken F.","unstructured":"Olken , F. and Rotem , D . 1986. Simple random sampling from relational databases . In Proceedings of the International Conference on Very Large Databases. Olken, F. and Rotem, D. 1986. Simple random sampling from relational databases. In Proceedings of the International Conference on Very Large Databases."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/602259.602294"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223841"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233342"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/582095.582099"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304199"},{"key":"e_1_2_1_38_1","volume-title":"C: The Art of Scientific Computing","author":"William S.","year":"1993","unstructured":"William , S. , Press, H. , Flannery, B., and Vetterling, W. 1993 . Numerical Recipes in C: The Art of Scientific Computing . Cambridge University Press . William, S., Press, H., Flannery, B., and Vetterling, W. 1993. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/800135.804414"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1114244.1114251","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1114244.1114251","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:08:38Z","timestamp":1750262918000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1114244.1114251"}},"subtitle":["A space-complexity perspective"],"short-title":[],"issued":{"date-parts":[[2005,12]]},"references-count":38,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2005,12]]}},"alternative-id":["10.1145\/1114244.1114251"],"URL":"https:\/\/doi.org\/10.1145\/1114244.1114251","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"value":"0362-5915","type":"print"},{"value":"1557-4644","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,12]]},"assertion":[{"value":"2005-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}