{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T11:48:34Z","timestamp":1763466514895},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>\n            In many decision making applications, users typically issue aggregate queries. To evaluate these computationally expensive queries,\n            <jats:italic>online aggregation<\/jats:italic>\n            has been developed to provide approximate answers (with their respective confidence intervals) quickly, and to continuously refine the answers. In this paper, we extend the online aggregation technique to a distributed context where sites are maintained in a DHT (Distributed Hash Table) network. Our Distributed Online Aggregation (DoA) scheme iteratively and progressively produces approximate aggregate answers as follows: in each iteration, a small set of random samples are retrieved from the data sites and distributed to the processing sites; at each processing site, a local aggregate is computed based on the allocated samples; at a coordinator site, these local aggregates are combined into a global aggregate. DoA adaptively grows the number of processing nodes as the sample size increases. To further reduce the sampling overhead, the samples are retained as a precomputed synopsis over the network to be used for processing future queries. We also study how these synopses can be maintained incrementally. We have conducted extensive experiments on PlanetLab. 
The results show that our DoA scheme reduces the initial waiting time significantly and provides high quality approximate answers with running confidence intervals progressively.\n          <\/jats:p>","DOI":"10.14778\/1687627.1687678","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"443-454","source":"Crossref","is-referenced-by-count":44,"title":["Distributed online aggregations"],"prefix":"10.14778","volume":"2","author":[{"given":"Sai","family":"Wu","sequence":"first","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"given":"Shouxu","family":"Jiang","sequence":"additional","affiliation":[{"name":"Harbin Institute of Technology, Harbin, China"}]},{"given":"Beng Chin","family":"Ooi","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"given":"Kian-Lee","family":"Tan","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2009,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"http:\/\/www.comp.nus.edu.sg\/s3p2p\/."},{"key":"e_1_2_1_2_1","unstructured":"http:\/\/www.planet-lab.org."},{"key":"e_1_2_1_3_1","unstructured":"http:\/\/www.tpc.org\/tpch."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304207"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4379(02)00051-0"},{"key":"e_1_2_1_6_1","volume-title":"IDEAS","author":"Albrecht J.","year":"1998","unstructured":"J. Albrecht and W. Lehner. On-line analytical processing in distributed data warehouses. 
In IDEAS, 1998."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.23"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2007.71"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/HICSS.2006.126"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1052199.1052206"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/1270387.1270884"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_1_13_1","volume-title":"University of California","author":"Random O. F.","year":"1993","unstructured":"O. F. Random sampling from databases. In PhD Thesis, University of California, 1993."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.peva.2005.01.002"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304208"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/253262.253291"},{"key":"e_1_2_1_17_1","volume-title":"WWW","author":"Henzinger M. R.","year":"2000","unstructured":"M. R. Henzinger, A. Heydon, M. Mitzenmacher, and M. Najork. On near-uniform url sampling. In WWW, 2000."},{"key":"e_1_2_1_18_1","unstructured":"R. Huebsch J. M. Hellerstein N. Lanham B. T. Loo S. Shenker and I. Stoica. Querying the internet with pier. In VLDB."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1272998.1273005"},{"key":"e_1_2_1_20_1","volume-title":"Middleware","author":"Jelasity M.","year":"2004","unstructured":"M. Jelasity, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. 
The peer sampling service: experimental evaluation of unstructured gossip-based implementations. In Middleware, 2004."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1082469.1082470"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066222"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/236711.236713"},{"key":"e_1_2_1_24_1","volume-title":"Random sampling from databases","author":"Olken F.","year":"1993","unstructured":"F. Olken. Random sampling from databases. 1993."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/645477.654663"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.2213"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2002.808407"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/1032653.1033778"},{"key":"e_1_2_1_29_1","volume-title":"VLDB","author":"Tan K.-L.","year":"1999","unstructured":"K.-L. Tan, C. H. Goh, and B. C. Ooi. Online feedback for nested aggregate queries with multi-threading. 
In VLDB, 1999."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376647"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1687627.1687678","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:28:24Z","timestamp":1672226904000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1687627.1687678"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.14778\/1687627.1687678"],"URL":"https:\/\/doi.org\/10.14778\/1687627.1687678","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2009,8]]}}}