{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,21]],"date-time":"2025-05-21T06:56:53Z","timestamp":1747810613696},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"1-2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2010,9]]},"abstract":"<jats:p>Source and object selection and retrieval from large multi-source data sets are fundamental operations in many applications. In this paper, we initiate research on efficient source (e.g., database) and object selection algorithms on large multi-source data sets. Specifically, in order to acquire a specified number of satisfying objects with minimum cost over multiple databases, the query engine needs to determine the access overhead for individual data sources, the overhead of retrieving objects from each source, and possibly other statistics such as estimating the frequency of finding a satisfying object in order to determine how many objects to retrieve from each data source. We adopt a probabilistic approach to source selection utilizing a cost structure and a dynamic programming model for computing the optimal number of objects to retrieve from each data source. Such a structure can be a valuable asset where there is a monetary or time related cost associated with accessing large distributed databases. We present a thorough experimental evaluation to validate our techniques using real-world data sets.<\/jats:p>","DOI":"10.14778\/1920841.1920982","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1125-1136","source":"Crossref","is-referenced-by-count":1,"title":["An access cost-aware approach for object retrieval over multiple sources"],"prefix":"10.14778","volume":"3","author":[{"given":"Benjamin","family":"Arai","sequence":"first","affiliation":[{"name":"University of California, Riverside"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gautam","family":"Das","sequence":"additional","affiliation":[{"name":"University of Texas, Arlington"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dimitrios","family":"Gunopulos","sequence":"additional","affiliation":[{"name":"University of Athens, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vagelis","family":"Hristidis","sequence":"additional","affiliation":[{"name":"Florida International University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nick","family":"Koudas","sequence":"additional","affiliation":[{"name":"University of Toronto"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2010,9]]},"reference":[{"key":"e_1_2_1_1_1","series-title":"Series B, page 39(1):138","volume-title":"Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society","author":"Arthur Dempster N. L.","year":"1977","unstructured":"N. L. Arthur Dempster and D. Rubin . Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society , Series B, page 39(1):138 , 1977 . N. L. Arthur Dempster and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, page 39(1):138, 1977."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872764"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2005.115"},{"key":"e_1_2_1_4_1","volume-title":"U.C.","author":"Bindel D.","unstructured":"D. Bindel , Y. Chen , P. Eaton , D. Geels , R. Gummadi , S. Rhea , H. Weatherspoon , W. Weimer , C. Wells , B. Zhao , and J. Kubiatowicz . Oceanstore: An extremely wide-area storage system . In U.C. Berkeley Technical Report UCB\/\/CSD-00-1102, 1999. D. Bindel, Y. Chen, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, B. Zhao, and J. Kubiatowicz. Oceanstore: An extremely wide-area storage system. In U.C. Berkeley Technical Report UCB\/\/CSD-00-1102, 1999."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1024694.1024706"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.10286"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/215206.215328"},{"key":"e_1_2_1_8_1","first-page":"205","volume-title":"OSDI","author":"Chang F.","year":"2006","unstructured":"F. Chang , J. Dean , S. Ghemawat , W. C. Hsieh , D. A. Wallach , M. Burrows , T. Chandra , A. Fikes , and R. Gruber . Bigtable: A distributed storage system for structured data . In OSDI , pages 205 -- 218 . USENIX Association , 2006 . F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber. Bigtable: A distributed storage system for structured data. In OSDI, pages 205--218. USENIX Association, 2006."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247550"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2009.112"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/256163.256164"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/375551.375567"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/314516.314517"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008683107812"},{"key":"e_1_2_1_15_1","first-page":"78","volume-title":"VLDB","author":"Gravano L.","year":"1995","unstructured":"L. Gravano and H. Garcia-Molina . Generalizing gloss to vector-space databases and broker hierarchies . In VLDB , pages 78 -- 89 , San Francisco, CA , USA, 1995 . Morgan Kaufmann Publishers Inc . L. Gravano and H. Garcia-Molina. Generalizing gloss to vector-space databases and broker hierarchies. In VLDB, pages 78--89, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/191843.191869"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1963.10500830"},{"key":"e_1_2_1_18_1","unstructured":"The Internet Movie Database (IMDB) http:\/\/www.imdb.org\/.  The Internet Movie Database (IMDB) http:\/\/www.imdb.org\/."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/606272.606297"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379239"},{"key":"e_1_2_1_21_1","unstructured":"LexisNexus. http:\/\/www.lexisnexis.com\/.  LexisNexus. http:\/\/www.lexisnexis.com\/."},{"key":"e_1_2_1_22_1","first-page":"547","volume-title":"ICDE","author":"Liu Z.","year":"2004","unstructured":"Z. Liu , C. Luo , J. Cho , and W. W. Chu . A probabilistic approach to metasearching with adaptive probing . In ICDE , pages 547 -- 559 , 2004 . Z. Liu, C. Luo, J. Cho, and W. W. Chu. A probabilistic approach to metasearching with adaptive probing. In ICDE, pages 547--559, 2004."},{"key":"e_1_2_1_23_1","first-page":"89","article-title":"Source selection and ranking in the websemantics architecture: Using quality of data metadata","volume":"55","author":"Mihaila G. A.","year":"2001","unstructured":"G. A. Mihaila , L. Raschid , and M.-E. Vidal . Source selection and ranking in the websemantics architecture: Using quality of data metadata . In Advances in Computers 55 : 89 -- 119 , 2001 . G. A. Mihaila, L. Raschid, and M.-E. Vidal. Source selection and ranking in the websemantics architecture: Using quality of data metadata. In Advances in Computers 55: 89--119, 2001.","journal-title":"Advances in Computers"},{"key":"e_1_2_1_24_1","first-page":"355","volume-title":"Learning in Graphical Models","author":"Neal R.","year":"2005","unstructured":"R. Neal and G. Hinton . A view of the EM algorithm that justifies incremental, sparse, and other variants . Learning in Graphical Models , pages 355 -- 368 , 2005 . R. Neal and G. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. Learning in Graphical Models, pages 355--368, 2005."},{"key":"e_1_2_1_25_1","unstructured":"PubMed. http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/.  PubMed. http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/."},{"key":"e_1_2_1_26_1","volume-title":"FAST","author":"Rhea S. C.","year":"2003","unstructured":"S. C. Rhea , P. R. Eaton , D. Geels , H. Weatherspoon , B. Y. Zhao , and J. Kubiatowicz . Pond: The oceanstore prototype . In FAST , 2003 . S. C. Rhea, P. R. Eaton, D. Geels, H. Weatherspoon, B. Y. Zhao, and J. Kubiatowicz. Pond: The oceanstore prototype. In FAST, 2003."},{"key":"e_1_2_1_27_1","first-page":"359","volume-title":"Introduction to Mathematical Statistics","author":"Robert Hogg J. M.","year":"2005","unstructured":"J. M. Robert Hogg and A. Craig . Introduction to Mathematical Statistics . Pearson Prentice Hall , pages 359 -- 364 , 2005 . J. M. Robert Hogg and A. Craig. Introduction to Mathematical Statistics. Pearson Prentice Hall, pages 359--364, 2005."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2009.109"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1316689.1316746"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1017074.1017091"},{"key":"e_1_2_1_31_1","unstructured":"Westlaw. http:\/\/www.westlaw.com\/.  Westlaw. http:\/\/www.westlaw.com\/."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.290974"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1920841.1920982","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:34:47Z","timestamp":1672227287000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1920841.1920982"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,9]]},"references-count":32,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2010,9]]}},"alternative-id":["10.14778\/1920841.1920982"],"URL":"https:\/\/doi.org\/10.14778\/1920841.1920982","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2010,9]]}}}