{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T20:40:25Z","timestamp":1654116025207},"reference-count":16,"publisher":"IGI Global","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,10,1]]},"abstract":"<p>To achieve high reliability and scalability, most large-scale data warehouse systems have adopted the cluster-based architecture. In this context, MapReduce has emerged as a promising architecture for large scale data warehousing and data analytics on commodity clusters. The MapReduce framework offers several lucrative features such as high fault-tolerance, scalability and use of a variety of hardware from low to high range. But these benefits have resulted in substantial performance compromise. In this paper, we propose the design of a novel cluster-based data warehouse system, Daenyrys for data processing on Hadoop \u2013 an open source implementation of the MapReduce framework under the umbrella of Apache. Daenyrys is a data management system which has the capability to take decision about the optimum partitioning scheme for the Hadoop's distributed file system (DFS). The optimum partitioning scheme improves the performance of the complete framework. The choice of the optimum partitioning is query-context dependent. In Daenyrys, the columns are formed into optimized groups to provide the basis for the partitioning of tables vertically. Daenyrys has an algorithm that monitors the context of current queries and based on the observations, it re-partitions the DFS for better performance and resource utilization. In the proposed system, Hive, a MapReduce-based SQL-like query engine is supported above the DFS.<\/p>","DOI":"10.4018\/ijcac.2013100104","type":"journal-article","created":{"date-parts":[[2014,4,15]],"date-time":"2014-04-15T12:06:52Z","timestamp":1397563612000},"page":"38-50","source":"Crossref","is-referenced-by-count":1,"title":["A Context-Based Performance Enhancement Algorithm for Columnar Storage in MapReduce with Hive"],"prefix":"10.4018","volume":"3","author":[{"given":"Yashvardhan","family":"Sharma","sequence":"first","affiliation":[{"name":"Department of Computer Science and Information Systems, Birla Institute of Technology and Science, Pilani, India"}]},{"given":"Saurabh","family":"Verma","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Systems, Birla Institute of Technology and Science, Pilani, India"}]},{"given":"Sumit","family":"Kumar","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Systems, Birla Institute of Technology and Science, Pilani, India"}]},{"given":"Shivam","family":"U.","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Systems, Birla Institute of Technology and Science, Pilani, India"}]}],"member":"2432","reference":[{"issue":"2","key":"ijcac.2013100104-0","doi-asserted-by":"crossref","DOI":"10.14778\/1687553.1687625","article-title":"Column-oriented database systems.","volume":"2","author":"D. J.Abadi","year":"2009","journal-title":"Proceedings of the VLDB Endowment"},{"key":"ijcac.2013100104-1","doi-asserted-by":"crossref","unstructured":"Abadi, D. J., Madden, S. R., & Hachem, N. (2008, June 9-12). Column-stores vs. row-stores: How different are they really? In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver, Canada.","DOI":"10.1145\/1376616.1376712"},{"issue":"1-2","key":"ijcac.2013100104-2","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.14778\/1920841.1921020","article-title":"Cheetah: A high performance, custom data warehouse on top of MapReduce.","volume":"3","author":"S.Chen","year":"2010","journal-title":"Proceedings of the VLDB Endowment"},{"issue":"12","key":"ijcac.2013100104-3","doi-asserted-by":"crossref","first-page":"1802","DOI":"10.14778\/2367502.2367519","article-title":"Interactive analytical processing in big data systems: A cross-industry study of MapReduce workloads.","volume":"5","author":"Y.Chen","year":"2012","journal-title":"Proceedings of the VLDB Endowment"},{"key":"ijcac.2013100104-4","unstructured":"Condie, T., et al. (2010). MapReduce online. NSDI, 10(4)."},{"key":"ijcac.2013100104-5","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"ijcac.2013100104-6","doi-asserted-by":"publisher","DOI":"10.1145\/1629175.1629198"},{"key":"ijcac.2013100104-7","doi-asserted-by":"publisher","DOI":"10.1145\/129888.129894"},{"issue":"12","key":"ijcac.2013100104-8","doi-asserted-by":"crossref","first-page":"2014","DOI":"10.14778\/2367502.2367562","article-title":"Efficient big data processing in Hadoop MapReduce.","volume":"5","author":"J.Dittrich","year":"2012","journal-title":"Proceedings of the VLDB Endowment"},{"key":"ijcac.2013100104-9","doi-asserted-by":"crossref","unstructured":"Kaldewey, T., Shekita, E. J., & Tata, S. (2012, March 27-30). Clydesdale: Structured data processing on MapReduce. In Proceedings of the 15th International Conference on Extending Database Technology, Berlin, Germany.","DOI":"10.1145\/2247596.2247600"},{"key":"ijcac.2013100104-10","unstructured":"L\u00e4hteenm\u00e4ki, P. (2012). MapReduce with columnar storage. In Proceedings of the Seminar on Columnar Databases, University of Helsinki."},{"key":"ijcac.2013100104-11","doi-asserted-by":"crossref","unstructured":"Pavlo, A., Paulson, E., Rasin, A., Abadi, D. J., DeWitt, D. J., Madden, S., & Stonebraker, M. (2009, June 29-July 2). A comparison of approaches to large-scale data analysis. In Proceedings of the 35th SIGMOD International Conference on Management of Data, Providence, RI.","DOI":"10.1145\/1559845.1559865"},{"key":"ijcac.2013100104-12","doi-asserted-by":"crossref","unstructured":"Shvachko, K., et al. (2010). The hadoop distributed file system. In Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST). IEEE.","DOI":"10.1109\/MSST.2010.5496972"},{"key":"ijcac.2013100104-13","unstructured":"Stonebraker, M., Abadi, D., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., et al. (2005). C-store: A column-oriented DBMS. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), pp. 553-564)."},{"issue":"2","key":"ijcac.2013100104-14","doi-asserted-by":"crossref","first-page":"1626","DOI":"10.14778\/1687553.1687609","article-title":"Hive: A warehousing solution over a map-reduce framework.","volume":"2","author":"A.Thusoo","year":"2009","journal-title":"Proceedings of the VLDB Endowment"},{"key":"ijcac.2013100104-15","doi-asserted-by":"crossref","unstructured":"Xu, C., Zhou, M., & Qian, W. (2010). Materialized view maintenance in columnar storage for massive data analysis. In Proceedings of the 2010 4th International on Universal Communication Symposium (IUCS). IEEE.","DOI":"10.1109\/IUCS.2010.5666768"}],"container-title":["International Journal of Cloud Applications and Computing"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=105509","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,1]],"date-time":"2022-06-01T20:07:46Z","timestamp":1654114066000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/ijcac.2013100104"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2013,10,1]]},"references-count":16,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2013,10]]}},"URL":"https:\/\/doi.org\/10.4018\/ijcac.2013100104","relation":{},"ISSN":["2156-1834","2156-1826"],"issn-type":[{"value":"2156-1834","type":"print"},{"value":"2156-1826","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,10,1]]}}}