{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T20:35:18Z","timestamp":1765485318074,"version":"3.41.0"},"reference-count":66,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2008,8,1]],"date-time":"2008-08-01T00:00:00Z","timestamp":1217548800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100002920","name":"Research Grants Council, University Grants Committee, Hong Kong","doi-asserted-by":"publisher","award":["HKU 7155\/06E"],"award-info":[{"award-number":["HKU 7155\/06E"]}],"id":[{"id":"10.13039\/501100002920","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2008,8]]},"abstract":"<jats:p>\n            Hierarchical synopsis structures offer a viable alternative in terms of efficiency and flexibility in relation to traditional summarization techniques such as histograms. Previous research on such structures has mostly focused on a single model, based on the Haar wavelet decomposition. In previous work, we have introduced a more refined, wavelet-inspired hierarchical index structure for synopsis construction: the Haar\n            <jats:sup>+<\/jats:sup>\n            tree. The chief advantages of this structure are twofold. First, it achieves higher synopsis quality at the task of summarizing data sets with sharp discontinuities than state-of-the-art histogram and Haar wavelet techniques. Second, thanks to its search space delimitation capacity, Haar\n            <jats:sup>+<\/jats:sup>\n            synopsis construction operates in time\n            <jats:italic>linear<\/jats:italic>\n            in the size of the data set for\n            <jats:italic>any<\/jats:italic>\n            monotonic distributive error metric. Contemporaneous research has introduced another hierarchical synopsis structure, the compact hierarchical histogram (CHH). In this article, we elaborate on both these structures. First, we formally prove that the CHH, in its default binary-hierarchy form, is a simplified variant of a Haar\n            <jats:sup>+<\/jats:sup>\n            tree. We then focus on the summarization problem, with both these hierarchical synopsis structures, in which an error guarantee expressed by a\n            <jats:italic>maximum-error<\/jats:italic>\n            metric is required. We show that this problem is most efficiently solved through its dual, space-minimization counterpart, which can also achieve\n            <jats:italic>optimal quality<\/jats:italic>\n            . In this case, there is a benefit to be gained by specializing the algorithm for each structure; hence, our algorithm for optimal-quality maximum-error CHH requires\n            <jats:italic>low polynomial<\/jats:italic>\n            time; on the other hand, optimal-quality Haar\n            <jats:sup>+<\/jats:sup>\n            synopses for maximum-error metrics are constructed in exponential time; hence, we also develop a low-polynomial-time approximation scheme for the maximum-error Haar\n            <jats:sup>+<\/jats:sup>\n            case. Furthermore, we extend our approach for both general-error and maximum-error Haar\n            <jats:sup>+<\/jats:sup>\n            synopses to arbitrary dimensionality. In our experimental study, (i) we confirm the theoretically expected superiority of Haar\n            <jats:sup>+<\/jats:sup>\n            synopses over Haar wavelet methods in both construction time and achieved quality for representative error metrics; (ii) we demonstrate that Haar\n            <jats:sup>+<\/jats:sup>\n            synopses are also constructed faster than optimal plain histograms, and, moreover, achieve higher synopsis quality with highly discontinuous data sets; such an advantage of a hierarchical synopsis structure over a histogram had been intuitively expressed, but never experimentally verified; and (iii) we show that Haar\n            <jats:sup>+<\/jats:sup>\n            synopsis quality supersedes that of a CHH.\n          <\/jats:p>","DOI":"10.1145\/1386118.1386124","type":"journal-article","created":{"date-parts":[[2008,9,4]],"date-time":"2008-09-04T12:51:35Z","timestamp":1220532695000},"page":"1-53","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":23,"title":["Hierarchical synopses with optimal error guarantees"],"prefix":"10.1145","volume":"33","author":[{"given":"Panagiotis","family":"Karras","sequence":"first","affiliation":[{"name":"National University of Singapore, Law Link, Singapore"}]},{"given":"Nikos","family":"Mamoulis","sequence":"additional","affiliation":[{"name":"University of Hong Kong, Pokfulam Road, Hong Kong"}]}],"member":"320","published-online":{"date-parts":[[2008,9,3]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304198"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304207"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281197"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/366573.366611"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375686"},{"volume-title":"Proceedings of the IEEE International Conference on Data Engineering (ICDE'07)","author":"Buragohain C.","key":"e_1_2_2_6_1","unstructured":"Buragohain , C. , Shrivastava , N. , and Suri , S . 2007. Space efficient streaming algorithms for the maximum error histogram . In Proceedings of the IEEE International Conference on Data Engineering (ICDE'07) . Buragohain, C., Shrivastava, N., and Suri, S. 2007. Space efficient streaming algorithms for the maximum error histogram. In Proceedings of the IEEE International Conference on Data Engineering (ICDE'07)."},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/767141.767147"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/568518.568520"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.4304\/jcp.2.8.64-76"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/11687238_4"},{"volume-title":"Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM'05)","author":"Deligiannakis A.","key":"e_1_2_2_11_1","unstructured":"Deligiannakis , A. , Garofalakis , M. , and Roussopoulos , N . 2005. A fast approximation scheme for probabilistic wavelet synopses . In Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM'05) . Deligiannakis, A., Garofalakis, M., and Roussopoulos, N. 2005. A fast approximation scheme for probabilistic wavelet synopses. In Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM'05)."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242524.1242527"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/375663.375685"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066677.1066817"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/974750.974753"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1055558.1055582"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1114244.1114246"},{"volume-title":"Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM","author":"Gibbons P. B.","key":"e_1_2_2_18_1","unstructured":"Gibbons , P. B. and Matias , Y . 1999. Synopsis data structures for massive data sets . In Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM , New York. Gibbons, P. B. and Matias, Y. 1999. Synopsis data structures for massive data sets. In Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM, New York."},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/581751.581753"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/509907.509966"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/375551.375598"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1198389"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/99.388960"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-007-0083-9"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2007.913569"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1132863.1132873"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/543613.543637"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-007-0052-3"},{"volume-title":"Proceedings of the 13th International Conference on Very Large Data Bases (VLDB'04)","author":"Guha S.","key":"e_1_2_2_29_1","unstructured":"Guha , S. , Shim , K. , and Woo , J . 2004. REHIST: Relative error histogram construction algorithms . In Proceedings of the 13th International Conference on Very Large Data Bases (VLDB'04) . Guha, S., Shim, K., and Woo, J. 2004. REHIST: Relative error histogram construction algorithms. In Proceedings of the 13th International Conference on Very Large Data Bases (VLDB'04)."},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-003-0090-4"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01456326"},{"key":"e_1_2_2_32_1","volume-title":"Proceedings of the 19th International Conference on Very Large Data Bases (VLDB'93)","author":"Ioannidis Y. E.","year":"1993","unstructured":"Ioannidis , Y. E. 1993 . Universality of serial histograms . In Proceedings of the 19th International Conference on Very Large Data Bases (VLDB'93) . Ioannidis, Y. E. 1993. Universality of serial histograms. In Proceedings of the 19th International Conference on Very Large Data Bases (VLDB'93)."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/645505.656447"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223841"},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'99)","author":"Ioannidis Y. E.","key":"e_1_2_2_35_1","unstructured":"Ioannidis , Y. E. and Poosala , V . 1999. Histogram-based approximation of set-valued query-answers . In Proceedings of the International Conference on Very Large Data Bases (VLDB'99) . Ioannidis, Y. E. and Poosala, V. 1999. Histogram-based approximation of set-valued query-answers. In Proceedings of the International Conference on Very Large Data Bases (VLDB'99)."},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'98)","author":"Jagadish H. V.","key":"e_1_2_2_36_1","unstructured":"Jagadish , H. V. , Koudas , N. , Muthukrishnan , S. , Poosala , V. , Sevcik , K. C. , and Suel , T . 1998. Optimal histograms with quality guarantees . In Proceedings of the International Conference on Very Large Data Bases (VLDB'98) . Jagadish, H. V., Koudas, N., Muthukrishnan, S., Poosala, V., Sevcik, K. C., and Suel, T. 1998. Optimal histograms with quality guarantees. In Proceedings of the International Conference on Very Large Data Bases (VLDB'98)."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066189"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1137\/1036095"},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'05)","author":"Karras P.","key":"e_1_2_2_39_1","unstructured":"Karras , P. and Mamoulis , N . 2005. One-pass wavelet synopses for maximum-error metrics . In Proceedings of the International Conference on Very Large Data Bases (VLDB'05) . Karras, P. and Mamoulis, N. 2005. One-pass wavelet synopses for maximum-error metrics. In Proceedings of the International Conference on Very Large Data Bases (VLDB'05)."},{"volume-title":"Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE'07)","author":"Karras P.","key":"e_1_2_2_40_1","unstructured":"Karras , P. and Mamoulis , N . 2007. The Haar+ tree: A refined synopsis data structure . In Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE'07) . Karras, P. and Mamoulis, N. 2007. The Haar+ tree: A refined synopsis data structure. In Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE'07)."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497433"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281235"},{"volume-title":"Proceedings of the 24th International Colloquium on Automata, Languages and Programming (ICALP'97)","author":"Khanna S.","key":"e_1_2_2_43_1","unstructured":"Khanna , S. , Muthukrishnan , S. , and Skiena , S . 1997. Efficient array partitioning . In Proceedings of the 24th International Colloquium on Automata, Languages and Programming (ICALP'97) . Khanna, S., Muthukrishnan, S., and Skiena, S. 1997. Efficient array partitioning. In Proceedings of the 24th International Colloquium on Automata, Languages and Programming (ICALP'97)."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/335168.335223"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183512.1183519"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/11841036_46"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2006.11.018"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276344"},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'00)","author":"Matias Y.","key":"e_1_2_2_49_1","unstructured":"Matias , Y. , Vitter , J. S. , and Wang , M . 2000. Dynamic maintenance of wavelet-based histograms . In Proceedings of the International Conference on Very Large Data Bases (VLDB'00) . Matias, Y., Vitter, J. S., and Wang, M. 2000. Dynamic maintenance of wavelet-based histograms. In Proceedings of the International Conference on Very Large Data Bases (VLDB'00)."},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9892.1994.tb00186.x"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/50202.50205"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1007\/11590156_23"},{"volume-title":"Proceedings of the 7th International Conference on Database Theory (ICDT'99)","author":"Muthukrishnan S.","key":"e_1_2_2_53_1","unstructured":"Muthukrishnan , S. , Poosala , V. , and Suel , T . 1999. On rectangular partitionings in two dimensions: Algorithms, complexity, and applications . In Proceedings of the 7th International Conference on Database Theory (ICDT'99) . Muthukrishnan, S., Poosala, V., and Suel, T. 1999. On rectangular partitionings in two dimensions: Algorithms, complexity, and applications. In Proceedings of the 7th International Conference on Database Theory (ICDT'99)."},{"volume-title":"Proceedings of Foundation of Software Technology and Theoretical Computer Science (FSTTCS'03)","author":"Muthukrishnan S.","key":"e_1_2_2_54_1","unstructured":"Muthukrishnan , S. and Strauss , M . 2003a. Maintenance of multidimensional histograms . In Proceedings of Foundation of Software Technology and Theoretical Computer Science (FSTTCS'03) . Springer, Berlin, Germany. Muthukrishnan, S. and Strauss, M. 2003a. Maintenance of multidimensional histograms. In Proceedings of Foundation of Software Technology and Theoretical Computer Science (FSTTCS'03). Springer, Berlin, Germany."},{"volume-title":"Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'03)","author":"Muthukrishnan S.","key":"e_1_2_2_55_1","unstructured":"Muthukrishnan , S. and Strauss , M . 2003b. Rangesum histograms . In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'03) . ACM, New York. Muthukrishnan, S. and Strauss, M. 2003b. Rangesum histograms. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'03). ACM, New York."},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1007\/11561071_65"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.11.006"},{"key":"e_1_2_2_58_1","first-page":"5","article-title":"Approximate query answering using histograms","volume":"22","author":"Poosala V.","year":"1999","unstructured":"Poosala , V. , Ganti , V. , and Ioannidis , Y. E. 1999 . Approximate query answering using histograms . IEEE Data Eng. Bull. 22 , 4, 5 -- 14 . Poosala, V., Ganti, V., and Ioannidis, Y. E. 1999. Approximate query answering using histograms. IEEE Data Eng. Bull. 22, 4, 5--14.","journal-title":"IEEE Data Eng. Bull."},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'97)","author":"Poosala V.","key":"e_1_2_2_59_1","unstructured":"Poosala , V. and Ioannidis , Y. E . 1997. Selectivity estimation without the attribute value independence assumption . In Proceedings of the International Conference on Very Large Data Bases (VLDB'97) . Poosala, V. and Ioannidis, Y. E. 1997. Selectivity estimation without the attribute value independence assumption. In Proceedings of the International Conference on Very Large Data Bases (VLDB'97)."},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233342"},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB'06)","author":"Reiss F.","key":"e_1_2_2_61_1","unstructured":"Reiss , F. , Garofalakis , M. , and Hellerstein , J. M . 2006. Compact histograms for hierarchical identifiers . In Proceedings of the International Conference on Very Large Data Bases (VLDB'06) . Reiss, F., Garofalakis, M., and Hellerstein, J. M. 2006. Compact histograms for hierarchical identifiers. In Proceedings of the International Conference on Very Large Data Bases (VLDB'06)."},{"volume-title":"Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE'06)","author":"Srivastava U.","key":"e_1_2_2_62_1","unstructured":"Srivastava , U. , Haas , P. J. , Markl , V. , Kutsch , M. , and Tran , T. M . 2006 . In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE'06) . Srivastava, U., Haas, P. J., Markl, V., Kutsch, M., and Tran, T. M. 2006. In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE'06)."},{"volume-title":"Proceedings of the 6th SIAM International Conference on Data Mining (SDM).","author":"Terzi E.","key":"e_1_2_2_63_1","unstructured":"Terzi , E. and Tsaparas , P . 2006. Efficient algorithms for sequence segmentation . In Proceedings of the 6th SIAM International Conference on Data Mining (SDM). Terzi, E. and Tsaparas, P. 2006. Efficient algorithms for sequence segmentation. In Proceedings of the 6th SIAM International Conference on Data Mining (SDM)."},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564741"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304199"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/288627.288645"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1386118.1386124","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1386118.1386124","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:57:47Z","timestamp":1750255067000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1386118.1386124"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8]]},"references-count":66,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2008,8]]}},"alternative-id":["10.1145\/1386118.1386124"],"URL":"https:\/\/doi.org\/10.1145\/1386118.1386124","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"type":"print","value":"0362-5915"},{"type":"electronic","value":"1557-4644"}],"subject":[],"published":{"date-parts":[[2008,8]]},"assertion":[{"value":"2007-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2008-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2008-09-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}