{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T13:38:30Z","timestamp":1774532310833,"version":"3.50.1"},"reference-count":23,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>The production environment for analytical data management applications is rapidly changing. Many enterprises are shifting away from deploying their analytical databases on high-end proprietary machines, and moving towards cheaper, lower-end, commodity hardware, typically arranged in a shared-nothing MPP architecture, often in a virtualized environment inside public or private \"clouds\". At the same time, the amount of data that needs to be analyzed is exploding, requiring hundreds to thousands of machines to work in parallel to perform the analysis.<\/jats:p>\n          <jats:p>There tend to be two schools of thought regarding what technology to use for data analysis in such an environment. Proponents of parallel databases argue that the strong emphasis on performance and efficiency of parallel databases makes them well-suited to perform such analysis. On the other hand, others argue that MapReduce-based systems are better suited due to their superior scalability, fault tolerance, and flexibility to handle unstructured data. 
In this paper, we explore the feasibility of building a hybrid system that takes the best features from both technologies; the prototype we built approaches parallel databases in performance and efficiency, yet still yields the scalability, fault tolerance, and flexibility of MapReduce-based systems.<\/jats:p>","DOI":"10.14778\/1687627.1687731","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"922-933","source":"Crossref","is-referenced-by-count":547,"title":["HadoopDB"],"prefix":"10.14778","volume":"2","author":[{"given":"Azza","family":"Abouzeid","sequence":"first","affiliation":[{"name":"Yale University"}]},{"given":"Kamil","family":"Bajda-Pawlikowski","sequence":"additional","affiliation":[{"name":"Yale University"}]},{"given":"Daniel","family":"Abadi","sequence":"additional","affiliation":[{"name":"Yale University"}]},{"given":"Avi","family":"Silberschatz","sequence":"additional","affiliation":[{"name":"Yale University"}]},{"given":"Alexander","family":"Rasin","sequence":"additional","affiliation":[{"name":"Brown University"}]}],"member":"320","published-online":{"date-parts":[[2009,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Hadoop. Web Page. hadoop.apache.org\/core\/."},{"key":"e_1_2_1_2_1","unstructured":"HadoopDB Project. Web page. db.cs.yale.edu\/hadoopdb\/hadoopdb.html."},{"key":"e_1_2_1_3_1","unstructured":"Vertica. www.vertica.com\/."},{"key":"e_1_2_1_4_1","unstructured":"D. Abadi. What is the right way to measure scale? DBMS Musings Blog. dbmsmusings.blogspot.com\/2009\/06\/what-is-right-way-to-measure-scale.html."},
{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945462"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454166"},{"key":"e_1_2_1_7_1","unstructured":"G. Czajkowski. Sorting 1pb with mapreduce. googleblog.blogspot.com\/2008\/11\/sorting-1pb-with-mapreduce.html."},{"key":"e_1_2_1_8_1","volume-title":"OSDI","author":"Dean J.","year":"2004"},{"key":"e_1_2_1_9_1","unstructured":"D. DeWitt and M. Stonebraker. MapReduce: A major step backwards. DatabaseColumn Blog. www.databasecolumn.com\/2008\/01\/mapreduce-a-major-step-back.html."},{"key":"e_1_2_1_10_1","volume-title":"GAMMA - A High Performance Dataflow Database Machine. In VLDB '86","author":"DeWitt D. J.","year":"1986"},{"key":"e_1_2_1_11_1","unstructured":"Facebook. Hive. Web page. issues.apache.org\/jira\/browse\/HADOOP-3601."},{"key":"e_1_2_1_12_1","volume-title":"An Overview of The System Software of A Parallel Relational Database Machine. In VLDB '86","author":"Fushimi S.","year":"1986"},{"key":"e_1_2_1_13_1","unstructured":"Hadoop Project. Hadoop Cluster Setup. Web Page. hadoop.apache.org\/core\/docs\/current\/cluster_setup.html."},{"key":"e_1_2_1_14_1","volume-title":"Proc. of CIDR","author":"Hamilton J.","year":"2009"},{"key":"e_1_2_1_15_1","unstructured":"Hive Project. Hive SVN Repository. Accessed May 19th 2009. svn.apache.org\/viewvc\/hadoop\/hive\/."},
{"key":"e_1_2_1_16_1","unstructured":"J. N. Hoover. Start-Ups Bring Google's Parallel Processing To Data Warehousing. InformationWeek August 29th 2008."},{"key":"e_1_2_1_17_1","unstructured":"S. Madden, D. DeWitt and M. Stonebraker. Database parallelism choices greatly impact scalability. DatabaseColumn Blog. www.databasecolumn.com\/2007\/10\/database-parallelism-choices.html."},{"key":"e_1_2_1_18_1","unstructured":"Mayank Bawa. A $5.1M Addendum to our Series B. www.asterdata.com\/blog\/index.php\/2009\/02\/25\/a-51m-addendum-to-our-series-b\/."},{"key":"e_1_2_1_19_1","unstructured":"C. Monash. The 1-petabyte barrier is crumbling. www.networkworld.com\/community\/node\/31439."},{"key":"e_1_2_1_20_1","unstructured":"C. Monash. Cloudera presents the MapReduce bull case. DBMS2 Blog. www.dbms2.com\/2009\/04\/15\/cloudera-presents-the-mapreduce-bull-case\/."},
{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376726"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559865"},{"key":"e_1_2_1_24_1","volume-title":"VLDB","author":"Stonebraker M.","year":"2005"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1687627.1687731","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:33:36Z","timestamp":1672227216000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1687627.1687731"}},"subtitle":["an architectural hybrid of MapReduce and DBMS technologies for analytical workloads"],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":23,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.14778\/1687627.1687731"],"URL":"https:\/\/doi.org\/10.14778\/1687627.1687731","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2009,8]]}}}