{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T05:13:08Z","timestamp":1776316388937,"version":"3.50.1"},"reference-count":108,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T00:00:00Z","timestamp":1667779200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Data is the lifeblood of any organization. In today\u2019s world, organizations recognize the vital role of data in modern business intelligence systems for making meaningful decisions and staying competitive in the field. Efficient and optimal data analytics provides a competitive edge to its performance and services. Major organizations generate, collect and process vast amounts of data, falling under the category of big data. Managing and analyzing the sheer volume and variety of big data is a cumbersome process. At the same time, proper utilization of the vast collection of an organization\u2019s information can generate meaningful insights into business tactics. In this regard, two of the popular data management systems in the area of big data analytics (i.e., data warehouse and data lake) act as platforms to accumulate the big data generated and used by organizations. Although seemingly similar, both of them differ in terms of their characteristics and applications. This article presents a detailed overview of the roles of data warehouses and data lakes in modern enterprise data management. We detail the definitions, characteristics and related works for the respective data management frameworks. Furthermore, we explain the architecture and design considerations of the current state of the art. 
Finally, we provide a perspective on the challenges and promising research directions for the future.<\/jats:p>","DOI":"10.3390\/bdcc6040132","type":"journal-article","created":{"date-parts":[[2022,11,8]],"date-time":"2022-11-08T11:46:43Z","timestamp":1667908003000},"page":"132","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":129,"title":["An Overview of Data Warehouse and Data Lake in Modern Enterprise Data Management"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4957-5804","authenticated-orcid":false,"given":"Athira","family":"Nambiar","sequence":"first","affiliation":[{"name":"Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Chennai 603203, India"}]},{"given":"Divyansh","family":"Mundra","sequence":"additional","affiliation":[{"name":"Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Chennai 603203, India"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1186\/s40537-015-0030-3","article-title":"Big data analytics: A survey","volume":"2","author":"Tsai","year":"2015","journal-title":"J. Big Data"},{"key":"ref_2","unstructured":"(2022, October 27). Big Data\u2014Statistics & Facts. Available online: https:\/\/www.statista.com\/topics\/1464\/big-data\/."},{"key":"ref_3","unstructured":"Wise, J. (2022, October 27). Big Data Statistics 2022: Facts, Market Size & Industry Growth. Available online: https:\/\/earthweb.com\/big-data-statistics\/."},{"key":"ref_4","unstructured":"Jain, A. (2022, October 27). The 5 V\u2019s of Big Data. 
Available online: https:\/\/www.ibm.com\/blogs\/watson-health\/the-5-vs-of-big-data\/."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.ijinfomgt.2014.10.007","article-title":"Beyond the hype: Big data concepts, methods, and analytics","volume":"35","author":"Gandomi","year":"2015","journal-title":"Int. J. Inf. Manag."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1007\/978-3-319-25013-7_16","article-title":"Big Data Analytics as a Service for Business Intelligence","volume":"Volume 9373","author":"Sun","year":"2015","journal-title":"Open and Big Data Management and Innovation"},{"key":"ref_7","unstructured":"(2022, October 27). Big Data and Analytics Services Global Market Report. Available online: https:\/\/www.reportlinker.com\/p06246484\/Big-Data-and-Analytics-Services-Global-Market-Report.html."},{"key":"ref_8","unstructured":"(2022, October 27). BI & Analytics Software Market Value Worldwide 2019\u20132025. Available online: https:\/\/www.statista.com\/statistics\/590054\/worldwide-business-analytics-software-vendor-market\/."},{"key":"ref_9","unstructured":"Kumar, S. (2022, October 27). What Is a Data Repository and What Is it Used for?. Available online: https:\/\/stealthbits.com\/blog\/what-is-a-data-repository-and-what-is-it-used-for\/."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"03025","DOI":"10.1051\/itmconf\/20181703025","article-title":"Data lake: A new ideology in big data era","volume":"17","author":"Khine","year":"2018","journal-title":"ITM Web Conf."},{"key":"ref_11","first-page":"349","article-title":"A Survey: Data Warehouse Architecture","volume":"8","author":"Arif","year":"2015","journal-title":"Int. J. Hybrid Inf. 
Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/978-981-33-6893-4_19","article-title":"Data Lake Versus Data Warehouse Architecture: A Comparative Study","volume":"Volume 745","author":"Bennani","year":"2022","journal-title":"WITS 2020"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"34","DOI":"10.21015\/vtcs.v15i1.487","article-title":"A Comparative Analysis of Traditional and Cloud Data Warehouse","volume":"6","author":"Rehman","year":"2018","journal-title":"VAWKUM Trans. Comput. Sci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1147\/sj.271.0060","article-title":"An architecture for a business and information system","volume":"27","author":"Devlin","year":"1988","journal-title":"IBM Syst. J."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Garani, G., Chernov, A., Savvas, I., and Butakova, M. (2019, January 12\u201314). A Data Warehouse Approach for Business Intelligence. Proceedings of the 2019 IEEE 28th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), Napoli, Italy.","DOI":"10.1109\/WETICE.2019.00022"},{"key":"ref_16","first-page":"8263","article-title":"A Review of Data Warehousing and Business Intelligence in different perspective","volume":"5","author":"Gupta","year":"2014","journal-title":"Int. J. Comput. Sci. Inf. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sagiroglu, S., and Sinanc, D. (2013, January 20\u201324). Big data: A review. Proceedings of the 2013 International Conference on Collaboration Technologies and Systems (CTS), San Diego, CA, USA.","DOI":"10.1109\/CTS.2013.6567202"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Miloslavskaya, N., and Tolstoy, A. (2016, January 22\u201324). Application of Big Data, Fast Data, and Data Lake Concepts to Information Security Issues. 
Proceedings of the 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), Vienna, Austria.","DOI":"10.1109\/W-FiCloud.2016.41"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Giebler, C., Stach, C., Schwarz, H., and Mitschang, B. (2018, January 26\u201328). BRAID\u2014A Hybrid Processing Architecture for Big Data. Proceedings of the 7th International Conference on Data Science, Technology and Applications, Porto, Portugal.","DOI":"10.5220\/0006861802940301"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1109\/MIC.2017.3481351","article-title":"The Lambda and the Kappa","volume":"21","author":"Lin","year":"2017","journal-title":"IEEE Internet Comput."},{"key":"ref_21","unstructured":"Devlin, B. (2022, October 27). Thirty Years of Data Warehousing\u2014Part 1. Available online: https:\/\/www.irmconnects.com\/thirty-years-of-data-warehousing-part-1\/."},{"key":"ref_22","unstructured":"Inmon, W.H. (2005). Building the Data Warehouse, Wiley Publishing. [4th ed.]."},{"key":"ref_23","first-page":"217","article-title":"Comprehensive survey on data warehousing research","volume":"10","author":"Chandra","year":"2018","journal-title":"Int. J. Inf. Technol."},{"key":"ref_24","unstructured":"Sim\u00f5es, D.M. (2010, January 28\u201330). Enterprise Data Warehouses: A conceptual framework for a successful implementation. Proceedings of the Canadian Council for Small Business & Entrepreneurship Annual Conference, Calgary, AB, Canada."},{"key":"ref_25","first-page":"153","article-title":"Data Warehouse as a Backbone for Business Intelligence: Issues and Challenges","volume":"33","year":"2011","journal-title":"Eur. J. Econ. Financ. Adm. Sci."},{"key":"ref_26","unstructured":"(2022, October 27). Report by Market Research Future (MRFR). 
Available online: https:\/\/finance.yahoo.com\/news\/data-warehouse-dwaas-market-predicted-153000649.html."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1145\/248603.248616","article-title":"An overview of data warehousing and OLAP technology","volume":"26","author":"Chaudhuri","year":"1997","journal-title":"ACM Sigmod Rec."},{"key":"ref_28","unstructured":"Codd, E.F., Codd, S.B., and Salley, C.T. (1993). Providing OLAP to User-Analysts: An IT Mandate, Codd & Associates."},{"key":"ref_29","unstructured":"(2022, October 27). The Best Applications of Data Warehousing. Available online: https:\/\/datachannel.co\/blogs\/best-applications-of-data-warehousing\/."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Hai, R., Quix, C., and Jarke, M. (2021). Data lake concept and systems: A survey. arXiv.","DOI":"10.1007\/978-3-319-32010-6_309"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zagan, E., and Danubianu, M. (2020, January 21\u201323). Data Lake Approaches: A Survey. Proceedings of the 2020 International Conference on Development and Application Systems (DAS), Suceava, Romania.","DOI":"10.1109\/DAS49615.2020.9108912"},{"key":"ref_32","first-page":"823","article-title":"Data Lakes: A Survey Paper","volume":"Volume 5","author":"Boudhir","year":"2022","journal-title":"Innovations in Smart Cities Applications"},{"key":"ref_33","unstructured":"Dixon, J. (2022, October 27). Pentaho, Hadoop, and Data Lakes. Available online: https:\/\/jamesdixon.wordpress.com\/2010\/10\/14\/pentaho-hadoop-and-data-lakes\/."},{"key":"ref_34","unstructured":"King, T. (2022, October 27). The Emergence of Data Lake: Pros and Cons. Available online: https:\/\/solutionsreview.com\/data-integration\/the-emergence-of-data-lake-pros-and-cons\/."},{"key":"ref_35","unstructured":"Alrehamy, H., and Walker, C. (2015, January 26\u201328). Personal Data Lake with Data Gravity Pull. 
Proceedings of the IEEE Fifth International Conference on Big Data and Cloud Computing 2015, Beijing, China."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Yang, Q., Ge, M., and Helfert, M. (2019, January 3\u20135). Analysis of Data Warehouse Architectures: Modeling and Classification. Proceedings of the 21st International Conference on Enterprise Information Systems, Heraklion, Greece.","DOI":"10.5220\/0007728006040611"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Yessad, L., and Labiod, A. (2016, January 15\u201318). Comparative study of data warehouses modeling approaches: Inmon, Kimball and Data Vault. Proceedings of the 2016 International Conference on System Reliability and Science (ICSRS), Paris, France.","DOI":"10.1109\/ICSRS.2016.7815845"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"11","DOI":"10.5121\/ijdms.2010.2402","article-title":"A Survey on Data Warehouse Evolution","volume":"2","author":"Oueslati","year":"2010","journal-title":"Int. J. Database Manag. Syst."},{"key":"ref_39","first-page":"3","article-title":"A Survey of Real-Time Data Warehouse and ETL","volume":"5","author":"Ali","year":"2014","journal-title":"Int. J. Sci. Eng. Res."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Aftab, U., and Siddiqui, G.F. (2018, January 10\u201313). Big Data Augmentation with Data Warehouse: A Survey. Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA.","DOI":"10.1109\/BigData.2018.8622182"},{"key":"ref_41","unstructured":"Alsqour, M., Matouk, K., and Owoc, M. (2012, January 9\u201312). A survey of data warehouse architectures\u2014Preliminary results. Proceedings of the Federated Conference on Computer Science and Information Systems, Wroclaw, Poland."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Rizzi, S., Abell\u00f3, A., Lechtenb\u00f6rger, J., and Trujillo, J. (2006, January 10). 
Research in data warehouse modeling and design: Dead or alive?. Proceedings of the 9th ACM international workshop on Data warehousing and OLAP, DOLAP \u201906, Arlington, VA, USA.","DOI":"10.1145\/1183512.1183515"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Krogstie, J., and Reijers, H.A. (2018). KAYAK: A Framework for Just-in-Time Data Preparation in a Data Lake. Advanced Information Systems Engineering, Springer International Publishing. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-319-91563-0"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Gao, Y., Huang, S., and Parameswaran, A. (2018, January 10\u201315). Navigating the Data Lake with DATAMARAN: Automatically Extracting Structure from Log Datasets. Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA.","DOI":"10.1145\/3183713.3183746"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1016\/j.proenv.2016.03.117","article-title":"Extraction, Transformation, and Loading (ETL) Module for Hotspot Spatial Data Warehouse Using Geokettle","volume":"33","author":"Astriani","year":"2016","journal-title":"Procedia Environ. Sci."},{"key":"ref_46","first-page":"5","article-title":"Managing Google\u2019s data lake: An overview of the Goods system","volume":"39","author":"Halevy","year":"2016","journal-title":"IEEE Data Eng. Bull."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Dehne, F., Robillard, D., Rau-Chaplin, A., and Burke, N. (2016, January 13\u201315). VOLAP: A Scalable Distributed System for Real-Time OLAP with High Velocity Data. 
Proceedings of the 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan.","DOI":"10.1109\/CLUSTER.2016.29"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1145\/1093382.1093388","article-title":"Capturing summarizability with integrity constraints in OLAP","volume":"30","author":"Hurtado","year":"2005","journal-title":"ACM Trans. Database Syst."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Farid, M., Roatis, A., Ilyas, I.F., Hoffmann, H.F., and Chu, X. (July, January 26). CLAMS: Bringing Quality to Data Lakes. Proceedings of the 2016 International Conference on Management of Data, SIGMOD \u201916, San Francisco, CA, USA.","DOI":"10.1145\/2882903.2899391"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"1902","DOI":"10.14778\/3352063.3352095","article-title":"Juneau: Data lake management for Jupyter","volume":"12","author":"Zhang","year":"2019","journal-title":"Proc. VLDB Endow."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Zhu, E., Deng, D., Nargesian, F., and Miller, R.J. (July, January 30). JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes. Proceedings of the 2019 International Conference on Management of Data, SIGMOD \u201919, Amsterdam, The Netherlands.","DOI":"10.1145\/3299869.3300065"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Beheshti, A., Benatallah, B., Nouri, R., Chhieng, V.M., Xiong, H., and Zhao, X. (2017, January 6\u201310). CoreDB: A Data Lake Service. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM \u201917, Singapore.","DOI":"10.1145\/3132847.3133171"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Hai, R., Geisler, S., and Quix, C. (July, January 26). Constance: An Intelligent Data Lake System. 
Proceedings of the 2016 International Conference on Management of Data, SIGMOD \u201916, San Francisco, CA, USA.","DOI":"10.1145\/2882903.2899389"},{"key":"ref_54","unstructured":"Ahmed, A.S., Salem, A.M., and Alhabibi, Y.A. (2006, January 23\u201327). Combining the Data Warehouse and Operational Data Store. Proceedings of the Eighth International Conference on Enterprise Information Systems, Paphos, Cyprus."},{"key":"ref_55","unstructured":"(2022, October 27). Software Architecture: N Tier, 3 Tier, 1 Tier, 2 Tier Architecture. Available online: https:\/\/www.appsierra.com\/blog\/url."},{"key":"ref_56","unstructured":"Han, S.W. (1997). Three-Tier Architecture for Sentinel Applications and Tools: Separating Presentation from Functionality. [Ph.D. Thesis, University of Florida]."},{"key":"ref_57","unstructured":"(2022, October 27). What Is Three-Tier Architecture. Available online: https:\/\/www.ibm.com\/in-en\/cloud\/learn\/three-tier-architecture."},{"key":"ref_58","first-page":"3686","article-title":"Big Data\u2014Solutions for RDBMS Problems\u2014A Survey","volume":"2","author":"Phaneendra","year":"2013","journal-title":"Int. J. Adv. Res. Comput. Commun. Eng."},{"key":"ref_59","unstructured":"Simitsis, A., Vassiliadis, P., and Sellis, T. (2005, January 5\u20138). Optimizing ETL processes in data warehouses. Proceedings of the 21st International Conference on Data Engineering (ICDE\u201905), Tokyo, Japan."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.ijmedinf.2019.03.006","article-title":"Privacy-enhancing ETL-processes for biomedical data","volume":"126","author":"Prasser","year":"2019","journal-title":"Int. J. Med. Inform."},{"key":"ref_61","first-page":"279","article-title":"Metadata for Big Data: A preliminary investigation of metadata quality issues in research data repositories","volume":"34","author":"Rousidis","year":"2014","journal-title":"Inf. Serv. Use"},{"key":"ref_62","unstructured":"Mailvaganam, H. 
(2022, September 25). Introduction to OLAP\u2014Slice, Dice and Drill! 2007. Data Warehousing Review. Retrieved on 18 March 2008. Available online: https:\/\/web.archive.org\/web\/20180928201202\/http:\/\/dwreview.com\/OLAP\/Introduction_OLAP.html."},{"key":"ref_63","unstructured":"Pendse, N. (2022, October 27). What is OLAP?. Available online: https:\/\/dssresources.com\/papers\/features\/pendse04072002.htm."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"2551","DOI":"10.4028\/www.scientific.net\/AMM.321-324.2551","article-title":"Solution for Data Growth Problem of MOLAP","volume":"321\u2013324","author":"Xu","year":"2013","journal-title":"Appl. Mech. Mater."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Dehne, F., Eavis, T., and Rau-Chaplin, A. (2003, January 12\u201315). Parallel multi-dimensional ROLAP indexing. Proceedings of the CCGrid 2003. 3rd IEEE\/ACM International Symposium on Cluster Computing and the Grid, Tokyo, Japan.","DOI":"10.1109\/CCGRID.2003.1199356"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Shvachko, K., Kuang, H., Radia, S., and Chansler, R. (2010, January 3\u20137). The Hadoop Distributed File System. Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), Incline Village, NV, USA.","DOI":"10.1109\/MSST.2010.5496972"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Luo, Z., Niu, L., Korukanti, V., Sun, Y., Basmanova, M., He, Y., Wang, B., Agrawal, D., Luo, H., and Tang, C. (2022, January 9\u201312). From Batch Processing to Real Time Analytics: Running Presto\u00ae at Scale. Proceedings of the 2022 IEEE 38th International Conference on Data Engineering (ICDE), Kuala Lumpur, Malaysia.","DOI":"10.1109\/ICDE53745.2022.00165"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Sethi, R., Traverso, M., Sundstrom, D., Phillips, D., Xie, W., Sun, Y., Yegitbasi, N., Jin, H., Hwang, E., and Shingte, N. (2019, January 8\u20131). 
Presto: SQL on Everything. Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China.","DOI":"10.1109\/ICDE.2019.00196"},{"key":"ref_69","unstructured":"Kinley, J. (2022, October 27). The Lambda Architecture: Principles for Architecting Realtime Big Data Systems. Available online: http:\/\/jameskinley.tumblr.com\/post\/37398560534\/thelambda-architecture-principles-for."},{"key":"ref_70","unstructured":"Ferrera Bertran, P. (2022, September 25). Lambda Architecture: A state-of-the-Art. Datasalt. 17 January 2014. Available online: https:\/\/github.com\/pereferrera\/trident-lambda-splout."},{"key":"ref_71","first-page":"28","article-title":"Apache Flink\u2122: Stream and Batch Processing in a Single Engine","volume":"36","author":"Carbone","year":"2015","journal-title":"Bull. IEEE Comput. Soc. Tech. Comm. Data Eng."},{"key":"ref_72","unstructured":"Kreps, J. (2022, October 27). Questioning the Lambda Architecture. Available online: https:\/\/www.oreilly.com\/radar\/questioning-the-lambda-architecture\/."},{"key":"ref_73","unstructured":"(2022, October 27). Data Vault vs Star Schema vs Third Normal Form: Which Data Model to Use?. Available online: https:\/\/www.matillion.com\/resources\/blog\/data-vault-vs-star-schema-vs-third-normal-form-which-data-model-to-use."},{"key":"ref_74","unstructured":"Patranabish, D. (2022, October 27). Data Lakes: The New Enabler of Scalability in Cross Channel Analytics\u2014Tech-Talk by Durjoy Patranabish | ET CIO. Available online: http:\/\/cio.economictimes.indiatimes.com\/tech-talk\/data-lakes-the-new-enabler-of-scalability-in-cross-channel-analytics\/585."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"1986","DOI":"10.14778\/3352063.3352116","article-title":"Data lake management: Challenges and opportunities","volume":"12","author":"Nargesian","year":"2019","journal-title":"Proc. VLDB Endow."},{"key":"ref_76","unstructured":"(2022, October 27). 
A Brief Look at 4 Major Data Compliance Standards: GDPR, HIPAA, PCI DSS, CCPA. Available online: https:\/\/www.pentasecurity.com\/blog\/4-data-compliance-standards-gdpr-hipaa-pci-dss-ccpa\/."},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1007\/s10844-020-00608-7","article-title":"On data lake architectures and metadata management","volume":"56","author":"Sawadogo","year":"2021","journal-title":"J. Intell. Inf. Syst."},{"key":"ref_78","unstructured":"(2022, October 27). Overview of Amazon Web Services: AWS Whitepaper. Available online: https:\/\/d1.awsstatic.com\/whitepapers\/aws-overview.pdf."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"3162","DOI":"10.14778\/3476311.3476391","article-title":"The evolution of Amazon redshift","volume":"14","author":"Pandis","year":"2021","journal-title":"Proc. VLDB Endow."},{"key":"ref_80","unstructured":"(2022, October 27). Microsoft Azure Documentation. Available online: http:\/\/azure.microsoft.com\/en-us\/documentation\/."},{"key":"ref_81","unstructured":"(2022, October 27). Automate Your Data Warehouse. Available online: https:\/\/www.oracle.com\/autonomous-database\/autonomous-data-warehouse\/."},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Dageville, B., Cruanes, T., Zukowski, M., Antonov, V., Avanes, A., Bock, J., Claybaugh, J., Engovatov, D., Hentschel, M., and Huang, J. (July, January 26). The Snowflake Elastic Data Warehouse. Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA.","DOI":"10.1145\/2882903.2903741"},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1007\/s13222-017-0272-7","article-title":"Data Lakes","volume":"17","author":"Mathis","year":"2017","journal-title":"Datenbank-Spektrum"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Zagan, E., and Danubianu, M. (2021, January 11\u201313). Cloud DATA LAKE: The new trend of data storage. 
Proceedings of the 2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Online.","DOI":"10.1109\/HORA52670.2021.9461293"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Ramakrishnan, R., Sridharan, B., Douceur, J.R., Kasturi, P., Krishnamachari-Sampath, B., Krishnamoorthy, K., Li, P., Manu, M., Michaylov, S., and Ramos, R. (2017, January 14\u201319). Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics. Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD \u201917, Chicago, IL, USA.","DOI":"10.1145\/3035918.3056100"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Perner, P. (2014). Big Data Analytics: A Literature Review Paper. Advances in Data Mining. Applications and Theoretical Aspects, Springer International Publishing. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-319-08976-8"},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.bdr.2015.01.006","article-title":"Significance and Challenges of Big Data Research","volume":"2","author":"Jin","year":"2015","journal-title":"Big Data Res."},{"key":"ref_88","first-page":"1","article-title":"Challenges of big data storage and management","volume":"6","author":"Agrawal","year":"2016","journal-title":"Glob. J. Inf. Technol. Emerg. Technol."},{"key":"ref_89","first-page":"2218","article-title":"Big Data Storage and Challenges","volume":"5","author":"Padgavankar","year":"2014","journal-title":"Int. J. Comput. Sci. Inf. Technol."},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Kadadi, A., Agrawal, R., Nyamful, C., and Atiq, R. (2014, January 27\u201330). Challenges of data integration and interoperability in big data. Proceedings of the 2014 IEEE International Conference on Big Data (Big Data), Washington, DC, USA.","DOI":"10.1109\/BigData.2014.7004486"},{"key":"ref_91","unstructured":"(2022, October 27). 
Best Data Integration Tools. Available online: https:\/\/www.peerspot.com\/categories\/data-integration-tools."},{"key":"ref_92","first-page":"15","article-title":"Big Data Security Issues and Challenges","volume":"2","author":"Toshniwal","year":"2014","journal-title":"Int. J. Innov. Res. Adv. Eng."},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Jonker, W., and Petkovi\u0107, M. (2014). Big Security for Big Data: Addressing Security Challenges for the Big Data Infrastructure. Secure Data Management, Springer International Publishing. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-319-06811-4"},{"key":"ref_94","first-page":"3","article-title":"Implementation issues of enterprise data warehousing and business intelligence in the healthcare industry","volume":"12","author":"Chen","year":"2012","journal-title":"Commun. IIMA"},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Cuzzocrea, A., Bellatreche, L., and Song, I.Y. (2013, January 28). Data warehousing and OLAP over big data: Current challenges and future research directions. Proceedings of the Sixteenth International Workshop on Data Warehousing and OLAP, DOLAP \u201913, San Francisco, CA, USA.","DOI":"10.1145\/2513190.2517828"},{"key":"ref_96","first-page":"41","article-title":"A Descriptive Classification of Causes of Data Quality Problems in Data Warehousing","volume":"7","author":"Singh","year":"2010","journal-title":"Int. J. Comput. Sci. Issues"},{"key":"ref_97","unstructured":"Longbottom, C., and Bamforth, R. (2022, October 27). Optimising the Data Warehouse. Available online: https:\/\/www.it-daily.net\/downloads\/WP_Optimising-the-data-warehouse.pdf."},{"key":"ref_98","doi-asserted-by":"crossref","unstructured":"Santos, R.J., Bernardino, J., and Vieira, M. (2011, January 27\u201329). A survey on data security in data warehousing: Issues, challenges and opportunities. 
Proceedings of the 2011 IEEE EUROCON\u2014International Conference on Computer as a Tool, Lisbon, Portugal.","DOI":"10.1109\/EUROCON.2011.5929314"},{"key":"ref_99","unstructured":"(2022, October 28). Responsibilities of a Data Warehouse Governance Committee. Available online: https:\/\/docs.oracle.com\/cd\/E29633_01\/CDMOG\/GUID-7E43F311-4510-4F1E-A17E-693F94BD0EC7.htm."},{"key":"ref_100","doi-asserted-by":"crossref","unstructured":"Gupta, S., and Giri, V. (2018). Practical Enterprise Data Lake Insights: Handle Data-Driven Challenges in an Enterprise Big Data Lake, Apress. [1st ed.].","DOI":"10.1007\/978-1-4842-3522-5"},{"key":"ref_101","doi-asserted-by":"crossref","unstructured":"Ordonez, C., Song, I.Y., Anderst-Kotsis, G., Tjoa, A.M., and Khalil, I. (2019). Leveraging the Data Lake: Current State and Challenges. Big Data Analytics and Knowledge Discovery, Springer International Publishing. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-030-27520-4"},{"key":"ref_102","unstructured":"Lock, M. (2022, October 27). Maximizing Your Data Lake with a Cloud or Hybrid Approach. Available online: https:\/\/technology-signals.com\/wp-content\/uploads\/download-manager-files\/maximizingyourdatalake.pdf."},{"key":"ref_103","unstructured":"Kumar, N. (2022, October 27). Cloud Data Warehouse Is the Future of Data Storage. Available online: https:\/\/www.sigmoid.com\/blogs\/cloud-data-warehouse-is-the-future-of-data-storage\/."},{"key":"ref_104","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1093\/jamia\/ocab278","article-title":"Migrating a research data warehouse to a public cloud: Challenges and opportunities","volume":"29","author":"Kahn","year":"2022","journal-title":"J. Am. Med. Inform. 
Assoc."},{"key":"ref_105","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2015\/718390","article-title":"A Cognitive Adopted Framework for IoT Big-Data Management and Knowledge Discovery Prospective","volume":"2015","author":"Mishra","year":"2015","journal-title":"Int. J. Distrib. Sens. Netw."},{"key":"ref_106","doi-asserted-by":"crossref","unstructured":"Schewe, K.D., and Singh, N.K. (2019). Keeping the Data Lake in Form: DS-kNN Datasets Categorization Using Proximity Mining. Model and Data Engineering, Springer International Publishing. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-030-32065-2"},{"key":"ref_107","doi-asserted-by":"crossref","unstructured":"Bogatu, A., Fernandes, A.A.A., Paton, N.W., and Konstantinou, N. (2020, January 20\u201324). Dataset Discovery in Data Lakes. Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA.","DOI":"10.1109\/ICDE48307.2020.00067"},{"key":"ref_108","unstructured":"Armbrust, M., Ghodsi, A., Xin, R., and Zaharia, M. (2021, January 11\u201315). Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics. 
Proceedings of the Conference on Innovative Data Systems Research, Virtual Event."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/4\/132\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:11:57Z","timestamp":1760145117000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/4\/132"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,7]]},"references-count":108,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["bdcc6040132"],"URL":"https:\/\/doi.org\/10.3390\/bdcc6040132","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,7]]}}}