{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:45:14Z","timestamp":1750308314824,"version":"3.41.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2007,6,1]],"date-time":"2007-06-01T00:00:00Z","timestamp":1180656000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2007,6]]},"abstract":"<jats:p>\n            The exact computation of aggregate queries, like the size of join of two relations, usually requires large amounts of memory (constrained in data-streaming) or communication (constrained in distributed computation) and large processing times. In this situation, approximation techniques with provable guarantees, like sketches, are one possible solution. The performance of sketches depends crucially on the ability to generate particular pseudo-random numbers. In this article we investigate both theoretically and empirically the problem of generating\n            <jats:italic>k<\/jats:italic>\n            -wise independent pseudo-random numbers and, in particular, that of generating 3- and 4-wise independent pseudo-random numbers that are fast range-summable (i.e., they can be summed in sublinear time). Our specific contributions are: (a) we provide a thorough comparison of the various pseudo-random number generating schemes; (b) we study both theoretically and empirically the fast range-summation property of 3- and 4-wise independent generating schemes; (c) we provide algorithms for the fast range-summation of two 3-wise independent schemes, BCH and extended Hamming; and (d) we show convincing theoretical and empirical evidence that the extended Hamming scheme performs as well as any 4-wise independent scheme for estimating the size of join of two relations using AMS sketches, even though it is only 3-wise independent. We use this scheme to generate estimators that significantly outperform state-of-the-art solutions for two problems, namely,\n            <jats:italic>size of spatial joins<\/jats:italic>\n            and\n            <jats:italic>selectivity estimation<\/jats:italic>\n            .\n          <\/jats:p>","DOI":"10.1145\/1242524.1242528","type":"journal-article","created":{"date-parts":[[2007,9,14]],"date-time":"2007-09-14T13:44:55Z","timestamp":1189777495000},"page":"11","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["Pseudo-random number generation for sketch-based estimations"],"prefix":"10.1145","volume":"32","author":[{"given":"Florin","family":"Rusu","sequence":"first","affiliation":[{"name":"University of Florida, Gainesville, FL"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alin","family":"Dobra","sequence":"additional","affiliation":[{"name":"University of Florida, Gainesville, FL"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2007,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2005.118"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/0196-6774(86)90019-2"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.2001.1813"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/237814.237823"},{"volume-title":"Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms","author":"Bar-Yosseff Z.","key":"e_1_2_1_5_1","unstructured":"Bar-Yosseff , Z. , Kumar , R. , and Sivakumar , D . 2002. Reductions in streaming algorithms, with an application to counting triangles in graphs . In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms ( Philadelphia, PA). 623--632. Bar-Yosseff, Z., Kumar, R., and Sivakumar, D. 2002. Reductions in streaming algorithms, with an application to counting triangles in graphs. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (Philadelphia, PA). 623--632."},{"volume-title":"Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms","author":"Calderbank A. R.","key":"e_1_2_1_6_1","unstructured":"Calderbank , A. R. , Gilbert , A. , Levchenko , K. , Muthukrishnan , S. , and Strauss , M . 2005. Improved range-summable random variable construction algorithms . In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms ( Philadelphia, PA). 840--849. Calderbank, A. R., Gilbert, A., Levchenko, K., Muthukrishnan, S., and Strauss, M. 2005. Improved range-summable random variable construction algorithms. In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (Philadelphia, PA). 840--849."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(79)90044-8"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(03)00400-6"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007646"},{"key":"e_1_2_1_10_1","unstructured":"Deskins W. E. 1996. Abstract Algebra. Dover New York.  Deskins W. E. 1996. Abstract Algebra. Dover New York."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564699"},{"key":"e_1_2_1_12_1","volume-title":"Tech. Rep. TR-90-033, ICSI","author":"Ehrenfeucht A.","year":"1990","unstructured":"Ehrenfeucht , A. and Karpinski , M . 1990 . The computational complexity of (xor-and) counting problems. Tech. Rep. TR-90-033, ICSI , Berkeley, California . Ehrenfeucht, A. and Karpinski, M. 1990. The computational complexity of (xor-and) counting problems. Tech. Rep. TR-90-033, ICSI, Berkeley, California."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0097539799361701"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872790"},{"volume-title":"Proceedings of the 9th International Conference on Extending Database Technology. Springer, 569--586","author":"Ganguly S.","key":"e_1_2_1_15_1","unstructured":"Ganguly , S. , Garofalakis , M. , and Rastogi , R . 2004. Processing data-stream join aggregates using skimmed sketches . In Proceedings of the 9th International Conference on Extending Database Technology. Springer, 569--586 . Ganguly, S., Garofalakis, M., and Rastogi, R. 2004. Processing data-stream join aggregates using skimmed sketches. In Proceedings of the 9th International Conference on Extending Database Technology. Springer, 569--586."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1198389"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.108"},{"key":"e_1_2_1_18_1","volume-title":"Electron. Colloq. Comput. Complex. 4, 20","author":"Goldreich O.","year":"1997","unstructured":"Goldreich , O. 1997 . A sample of samplers: A computational perspective on sampling . Electron. Colloq. Comput. Complex. 4, 20 . Goldreich, O. 1997. A sample of samplers: A computational perspective on sampling. Electron. Colloq. Comput. Complex. 4, 20."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-1478-6"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.1993.1014"},{"volume-title":"Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science. IEEE Computer Society","author":"Kempe D.","key":"e_1_2_1_21_1","unstructured":"Kempe , D. , Dobra , A. , and Gehrke , J . 2003. Gossip-Based computation of aggregate information . In Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science. IEEE Computer Society , Washington, DC. 482--491. Kempe, D., Dobra, A., and Gehrke, J. 2003. Gossip-Based computation of aggregate information. In Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science. IEEE Computer Society, Washington, DC. 482--491."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/948205.948236"},{"key":"e_1_2_1_23_1","unstructured":"Levchenko K. 2005. http:\/\/www.cse.ucsd.edu\/~klevchen\/II-2005.pdf.  Levchenko K. 2005. http:\/\/www.cse.ucsd.edu\/~klevchen\/II-2005.pdf."},{"key":"e_1_2_1_24_1","volume-title":"Tech. Rep. UCB\/CSD-95-880, EECS","author":"Luby M.","year":"1995","unstructured":"Luby , M. and Wigderson , A . 1995 . Pairwise independence and derandomization. Tech. Rep. UCB\/CSD-95-880, EECS , University of California at California , Berkeley, Berkeley . Luby, M. and Wigderson, A. 1995. Pairwise independence and derandomization. Tech. Rep. UCB\/CSD-95-880, EECS, University of California at California, Berkeley, Berkeley."},{"volume-title":"Mathematical Statistics","author":"Shao J.","key":"e_1_2_1_25_1","unstructured":"Shao , J. 1999. Mathematical Statistics . Springer , New York . Shao, J. 1999. Mathematical Statistics. Springer, New York."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564741"},{"volume-title":"Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms","author":"Thorup M.","key":"e_1_2_1_27_1","unstructured":"Thorup , M. and Zhang , Y . 2004. Tabulation based 4-universal hashing with applications to second moment estimation . In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms ( Philadelphia, PA). 615--624. Thorup, M. and Zhang, Y. 2004. Tabulation based 4-universal hashing with applications to second moment estimation. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (Philadelphia, PA). 615--624."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(81)90033-7"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1242524.1242528","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1242524.1242528","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:38:56Z","timestamp":1750268336000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1242524.1242528"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,6]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2007,6]]}},"alternative-id":["10.1145\/1242524.1242528"],"URL":"https:\/\/doi.org\/10.1145\/1242524.1242528","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"type":"print","value":"0362-5915"},{"type":"electronic","value":"1557-4644"}],"subject":[],"published":{"date-parts":[[2007,6]]},"assertion":[{"value":"2007-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}