{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T10:25:44Z","timestamp":1776075944675,"version":"3.50.1"},"reference-count":67,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2020,7,27]],"date-time":"2020-07-27T00:00:00Z","timestamp":1595808000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"PHC CEDRE 42415YJ, French Ministry of European and Foreign Affairs (MEAE), French Ministry of Higher Education, Research and Innovation (MESRI) and Lebanese Ministry of Education and Higher Education (MEHE)","award":["42415YJ"],"award-info":[{"award-number":["42415YJ"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>The realm of big data has brought new venues for knowledge acquisition, but also major challenges including data interoperability and effective management. The great volume of miscellaneous data renders the generation of new knowledge a complex data analysis process. Presently, big data technologies provide multiple solutions and tools towards the semantic analysis of heterogeneous data, including their accessibility and reusability. However, in addition to learning from data, we are faced with the issue of data storage and management in a cost-effective and reliable manner. This is the core topic of this paper. A data lake, inspired by the natural lake, is a centralized data repository that stores all kinds of data in any format and structure. This allows any type of data to be ingested into the data lake without any restriction or normalization. This could lead to a critical problem known as data swamp, which can contain invalid or incoherent data that adds no values for further knowledge acquisition. 
To deal with the potential avalanche of data, some legislation is required to turn such heterogeneous datasets into manageable data. In this article, we address this problem and propose some solutions concerning innovative methods, derived from a multidisciplinary science perspective to manage data lake. The proposed methods imitate the supply chain management and natural lake principles with an emphasis on the importance of the data life cycle, to implement responsible data governance for the data lake.<\/jats:p>","DOI":"10.3390\/fi12080126","type":"journal-article","created":{"date-parts":[[2020,7,27]],"date-time":"2020-07-27T09:24:49Z","timestamp":1595841889000},"page":"126","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Data Lake Governance: Towards a Systemic and Natural Ecosystem Analogy"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6145-9227","authenticated-orcid":false,"given":"Marzieh","family":"Derakhshannia","sequence":"first","affiliation":[{"name":"LIRMM, Univ. Montpellier, CNRS, 34090 Montpellier, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8062-2808","authenticated-orcid":false,"given":"Carmen","family":"Gervet","sequence":"additional","affiliation":[{"name":"Espace Dev, Univ. Montpellier, IRD, Univ Guyane, Univ. R\u00e9union, 34293 Montpellier, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9917-4606","authenticated-orcid":false,"given":"Hicham","family":"Hajj-Hassan","sequence":"additional","affiliation":[{"name":"CNRS-L, Beirut P.O. Box 11-8281, Lebanon"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3708-6429","authenticated-orcid":false,"given":"Anne","family":"Laurent","sequence":"additional","affiliation":[{"name":"LIRMM, Univ. 
Montpellier, CNRS, 34090 Montpellier, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6323-0135","authenticated-orcid":false,"given":"Arnaud","family":"Martin","sequence":"additional","affiliation":[{"name":"CEFE, Univ. Montpellier, CNRS, 34293 Montpellier, France"}]}],"member":"1968","published-online":{"date-parts":[[2020,7,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Nat. Sci. Data"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Madera, C., and Laurent, A. (2016, January 1\u20134). The next Information Architecture Evolution: The Data Lake Wave. Proceedings of the 8th International Conference on Management of Digital EcoSystems, Biarritz, France.","DOI":"10.1145\/3012071.3012077"},{"key":"ref_3","unstructured":"Russom, P. (2017). Data lakes: Purposes, practices, patterns, and platforms. TDWI White Paper, Talend."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Fang, H.L. (2015, January 8\u201312). Managing data lakes in big data era: What\u2019s a data lake and why has it became popular in data management ecosystem. Proceedings of the 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), Shenyang, China.","DOI":"10.1109\/CYBER.2015.7288049"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Khine, P., and Wang, Z. (2020, July 02). Data Lake: A New Ideology in Big Data Era. Available online: https:\/\/www.itm-conferences.org\/articles\/itmconf\/pdf\/2018\/02\/itmconf_wcsn2018_03025.pdf.","DOI":"10.1051\/itmconf\/20181703025"},{"key":"ref_6","unstructured":"White, T. (2015). Hadoop: The Definitive Guide, O\u2019Reilly. [4th ed.]."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Sawadogo, P.N., and Darmont, J. (2020). 
On Data Lake Architectures and Metadata Management. J. Intell. Inf. Syst., to appear.","DOI":"10.1007\/s10844-020-00608-7"},{"key":"ref_8","unstructured":"Gorelik, A. (2019). The Enterprise Big Data Lake: Delivering the Promise of Big Data and Data Science, O\u2019Reilly Media."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ladley, J. (2012). Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program, Elsevier Science. ITPro Collection.","DOI":"10.1016\/B978-0-12-415829-0.00003-4"},{"key":"ref_10","unstructured":"Paschalidi, C. (2020, July 02). Data Governance: A Conceptual Framework in Order to Prevent Your Data Lake from Becoming a Data Swamp. Available online: https:\/\/ltu.diva-portal.org\/smash\/record.jsf?pid=diva2%3A1019917&dswid=2135."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Loshin, D. (2013). Chapter 5\u2014Data Governance for Big Data Analytics: Considerations for Data Policies and Processes. Big Data Analytics, Morgan Kaufmann.","DOI":"10.1016\/B978-0-12-417319-4.00005-3"},{"key":"ref_12","unstructured":"Wende, K. (2020, July 02). A Model for Data Governance-Organising Accountabilities for Data Quality Management. Available online: https:\/\/www.alexandria.unisg.ch\/publications\/67284."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Al-Ruithe, M., Benkhelifa, E., and Hameed, K. (2018). Data Governance Taxonomy: Cloud versus Non-Cloud. Sustainability, 10.","DOI":"10.3390\/su10010095"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"424","DOI":"10.1016\/j.ijinfomgt.2019.07.008","article-title":"Data governance: A conceptual framework, structured review, and research agenda","volume":"49","author":"Abraham","year":"2019","journal-title":"Int. J. Inf. Manag."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Aisyah, M., and Ruldeviyani, Y. (2018, January 27\u201328). 
Designing data governance structure based on data management body of knowledge (DMBOK) Framework: A case study on Indonesia deposit insurance corporation (IDIC). Proceedings of the 2018 International Conference on Advanced Computer Science and Information Systems (ICACSIS 2018), Yogyakarta, Indonesia.","DOI":"10.1109\/ICACSIS.2018.8618151"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1016\/j.procs.2019.04.082","article-title":"Towards a Data Governance Framework for Third Generation Platforms","volume":"151","author":"Yebenes","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_17","first-page":"1057","article-title":"Data Governance and Data Sharing Agreements for Community-Wide Health Information Exchange: Lessons from the Beacon Communities","volume":"2","author":"Allen","year":"2014","journal-title":"J. Electron. Health Data Methods"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1016\/j.techfore.2017.09.040","article-title":"Governance of big data collaborations: How to balance regulatory compliance and disruptive innovation","volume":"129","year":"2018","journal-title":"Technol. Forecast. Soc. Chang."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.jaccedu.2016.12.002","article-title":"Data governance case at KrauseMcMahon LLP in an era of self-service BI and Big Data","volume":"38","author":"Riggins","year":"2017","journal-title":"J. Account. Educ."},{"key":"ref_20","first-page":"150","article-title":"Some practical experiences in data governance","volume":"38","author":"Panian","year":"2010","journal-title":"World Acad. Sci. Eng. Technol."},{"key":"ref_21","unstructured":"Thomas, G. (2006). 
The DGI Data Governance Framework."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1007\/s10845-007-0022-z","article-title":"A systematic approach for supply chain improvement using design structure matrix","volume":"18","author":"Chen","year":"2007","journal-title":"J. Intell. Manuf."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2336","DOI":"10.1016\/j.jclepro.2017.11.176","article-title":"Integrating the environmental and social sustainability pillars into the lean and agile supply chain management paradigms: A literature review and future research directions","volume":"172","author":"Ciccullo","year":"2018","journal-title":"J. Clean. Prod."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1287\/opre.1080.0628","article-title":"OR FORUM\u2014The Evolution of Closed-Loop Supply Chain Research","volume":"57","author":"Guide","year":"2009","journal-title":"Oper. Res."},{"key":"ref_25","first-page":"302","article-title":"Life and Death of Data in Data Lakes: Preserving Data Usability and Responsible Governance","volume":"Volume 11938","author":"Yacoubi","year":"2019","journal-title":"Proceedings of the Internet Science\u20146th International Conference (INSCI 2019)"},{"key":"ref_26","unstructured":"Waters, D. (1999). Logistics strategies for North America in the Global logistics and distribution planning. the Global Logistics and Distribution Planning, Kogan Page Limited."},{"key":"ref_27","first-page":"194","article-title":"A New Introduction to Supply Chains and Supply Chain Management: Definitions and Theories Perspective","volume":"5","year":"2012","journal-title":"Int. Bus. Res."},{"key":"ref_28","unstructured":"Simchi-levi, D., Kaminsky, P., and Simchi-Levi, E. (2003). Designing and Managing the Supply Chain: Concepts, Strategies, and Case Studies, McGraw-Hill\/Irwin."},{"key":"ref_29","unstructured":"Delfmann, W., and Albers, S. (2000). 
Supply Chain Management in the Global Context, Cologne Publisher. Working Paper 102."},{"key":"ref_30","unstructured":"Harland, C. (1994). Supply Chain Management: Perceptions of Requirements and Performance in European Automotive Aftermarket Supply Chains. [Ph.D. Thesis, University of Warwick]."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.ijpe.2008.08.002","article-title":"A multi-objective stochastic programming approach for supply chain design considering risk","volume":"116","author":"Azaron","year":"2008","journal-title":"Int. J. Prod. Econ."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1016\/j.cie.2014.08.004","article-title":"Solving a new bi-objective location-routing-inventory problem in a distribution network by meta-heuristics","volume":"76","author":"Nekooghadirli","year":"2014","journal-title":"Comput. Ind. Eng."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.omega.2014.12.004","article-title":"On the fair optimization of cost and customer service level in a supply chain under disruption risks","volume":"53","author":"Sawik","year":"2015","journal-title":"Omega"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/S0925-5273(98)00079-6","article-title":"Supply chain design and analysis:: Models and methods","volume":"55","author":"Beamon","year":"1998","journal-title":"Int. J. Prod. Econ."},{"key":"ref_35","unstructured":"LaPlante, A., and Sharma, B. (2016). Architecting Data Lakes, O\u2019Reilly Media."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ravat, F., and Zhao, Y. (2019, January 26\u201329). Data Lakes: Trends and Perspectives. 
Proceedings of the International Conference on Database and Expert Systems Applications (DEXA 2019), Linz, Austria.","DOI":"10.1007\/978-3-030-27615-7_23"},{"key":"ref_37","first-page":"4","article-title":"Strategy, Strategic Management, Strategic Planning and Strategic Thinking","volume":"1","author":"Nickols","year":"2008","journal-title":"Manag. J."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Ambe, I., and Badenhorst-Weiss, J. (2011). Framework for choosing supply chain strategies. Afr. J. Bus. Manag., 5.","DOI":"10.4102\/jtscm.v5i1.18"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/j.resconrec.2018.10.009","article-title":"A literature review on green supply chain management: Trends and future challenges","volume":"141","author":"Tseng","year":"2019","journal-title":"Resour. Conserv. Recycl."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1111\/j.1468-2370.2007.00202.x","article-title":"Green supply-chain management: A state-of-the-art literature review","volume":"9","author":"Srivastava","year":"2007","journal-title":"Int. J. Manag. Rev."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1108\/01443570710725563","article-title":"Supply chain risk management and performance: A Guiding framework for future development","volume":"27","author":"Ritchie","year":"2007","journal-title":"Int. J. Oper. Prod. Manag."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1080\/13675569908901575","article-title":"Logistics and Supply Chain Management: Strategies for Reducing Cost and Improving Service (Second Edition)","volume":"2","author":"Christopher","year":"1999","journal-title":"Int. J. Logist. Res. Appl."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Miloslavskaya, N., and Tolstoy, A. (2016, January 22\u201324). Application of Big Data, Fast Data and Data Lake Concepts to Information Security Issues. 
Proceedings of the 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), Vienna, Austria.","DOI":"10.1109\/W-FiCloud.2016.41"},{"key":"ref_44","unstructured":"Sundaram, D., and Vidhya, M. (2016). Data Lakes-A New Data Repository For Big Data Analytics Workloads. Int. J. Adv. Comput. Res., 7."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Berntson, G.G., Cacioppo, J.T., and Bosch, J.A. (2016). From Homeostasis to Allodynamic Regulation. Handbook of Psychophysiology, Cambridge Handbooks in Psychology, Cambridge University Press. [4th ed.].","DOI":"10.1017\/9781107415782.018"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"21134","DOI":"10.1073\/pnas.1118276108","article-title":"Resilience and stability in bird guilds across tropical countryside","volume":"108","author":"Karp","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1007\/s10021-007-9076-1","article-title":"Testing for Compensation in a Multi-species Community","volume":"10","author":"Solow","year":"2007","journal-title":"Ecosystems"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"2118","DOI":"10.2307\/2680220","article-title":"Homeostasis and Compensation: The Role of Species and Resources in Ecosystem Stability","volume":"82","author":"Brown","year":"2001","journal-title":"Ecology"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1046\/j.1461-0248.2001.00189.x","article-title":"Biodiversity may regulate the temporal variability of ecological systems","volume":"4","author":"Cottingham","year":"2001","journal-title":"Ecol. Lett."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1016\/j.dss.2010.11.020","article-title":"A multi-objective optimization for green supply chain network design","volume":"51","author":"Wang","year":"2011","journal-title":"Decis. 
Support Syst."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1111\/j.1461-0248.2008.01250.x","article-title":"Global change and species interactions in terrestrial ecosystems","volume":"11","author":"Tylianakis","year":"2008","journal-title":"Ecol. Lett."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Oechel, W.C., Callaghan, T.V., Gilmanov, T.G., Holten, J.I., Maxwell, B., Molau, U., and Sveinbj\u00f6rnsson, B. (1997). The Impact of Hydrologic Perturbations on Arctic Ecosystems Induced by Climate Change. Global Change and Arctic Terrestrial Ecosystems, Springer New York.","DOI":"10.1007\/978-1-4612-2240-8"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1111\/brv.12004","article-title":"Response diversity determines the resilience of ecosystems to environmental change","volume":"88","author":"Mori","year":"2013","journal-title":"Biol. Rev."},{"key":"ref_54","first-page":"399","article-title":"Environmental Supply Chain Management: Using Life Cycle Assessment To Structure supply chains","volume":"4","author":"Hagelaar","year":"2001","journal-title":"Int. Food Agribus. Manag. Rev."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/S0969-7012(00)00007-1","article-title":"Environmental purchasing: A framework for theory development","volume":"7","author":"Zsidisin","year":"2001","journal-title":"Eur. J. Purch. Supply Manag."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1007\/BF02994195","article-title":"Integrated Life-Cycle and Risk Assessment for Industrial Processes","volume":"9","author":"Sonnemann","year":"2004","journal-title":"Int. J. 
Life Cycle Assess."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S1570-7946(10)28019-7","article-title":"Green Supply Chain Design and Operation by Integrating LCA and Dynamic Simulation","volume":"Volume 28","author":"Pierucci","year":"2010","journal-title":"20th European Symposium on Computer Aided Process Engineering"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/j.jclepro.2018.09.097","article-title":"Product sustainability assessment for product life cycle","volume":"206","author":"He","year":"2019","journal-title":"J. Clean. Prod."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Ren, J., and Toniolo, S. (2020). Chapter 3\u2014Life cycle thinking tools: Life cycle assessment, life cycle costing and social life cycle assessment. Life Cycle Sustainability Assessment for Decision-Making, Elsevier.","DOI":"10.1016\/B978-0-12-818355-7.00003-8"},{"key":"ref_60","unstructured":"Ren, J., and Toniolo, S. (2020). Chapter 4\u2014Life cycle sustainability assessment: An ongoing journey. Life Cycle Sustainability Assessment for Decision-Making, Elsevier."},{"key":"ref_61","first-page":"53","article-title":"Supply Chains In The Context of Life Cycle Assessment and Sustainability","volume":"16","author":"Mesaric","year":"2016","journal-title":"Bus. Logist. Mod. Manag."},{"key":"ref_62","unstructured":"International Organization for Standardization (ISO) (2003). 14040: 1997\u2014Environmental Management\u2014Life Cycle Assessment-Principles and Framework, International Organization for Standardization (ISO)."},{"key":"ref_63","unstructured":"Lee, K., Inaba, A., San\u014fppu, K.S.T., Asia-Pacific Economic Cooperation, and Committee on Trade and Investment (2004). 
Life Cycle Assessment: Best Practices of ISO 14040 Series, APEC Publication, Center for Ecodesign and LCA(CEL), Ajou University."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1016\/j.envint.2003.11.005","article-title":"Life cycle assessment part 1: Framework, goal and scope definition, inventory analysis, and applications","volume":"30","author":"Rebitzer","year":"2004","journal-title":"Environ. Int."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"106406","DOI":"10.1016\/j.eiar.2020.106406","article-title":"An LCA-based environmental impact assessment model for regulatory planning","volume":"83","author":"Zhang","year":"2020","journal-title":"Environ. Impact Assess. Rev."},{"key":"ref_66","unstructured":"Dawkins, R. (2006). The Selfish Gene, Oxford University Press."},{"key":"ref_67","unstructured":"Dessalles, J., Gaucherel, C., and Gouyon, P. (2016). Le fil de la vie\u2014La Face Immat\u00e9rielle du Vivant, Odile Jacob."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/12\/8\/126\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:52:01Z","timestamp":1760176321000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/12\/8\/126"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,27]]},"references-count":67,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2020,8]]}},"alternative-id":["fi12080126"],"URL":"https:\/\/doi.org\/10.3390\/fi12080126","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,27]]}}}