{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T07:24:53Z","timestamp":1775805893971,"version":"3.50.1"},"reference-count":42,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,6,5]],"date-time":"2025-06-05T00:00:00Z","timestamp":1749081600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Digit. Health"],"abstract":"<jats:sec><jats:title>Introduction<\/jats:title><jats:p>The emergence of data warehousing in clinical settings has greatly enhanced data analysis capabilities, facilitating the accurate and comprehensive extraction of valuable information. This scoping review explores the contributions of data warehouses in clinical settings by analysing the strengths, challenges and implications of each type of data warehouse, with a particular focus on general and specialised types.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>This scoping review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched four databases (PubMed, CINAHL, Scopus and IEEE-Xplore), identifying peer-reviewed, English-language studies from 1st January 2014 to 1st January 2024, that focus on data warehousing in healthcare, covering either general or specialised data warehouse applications. Python programming was used to extract the search results and transform the data into a tabular format for analysis.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>After removing 1,194 duplicates, 4,864 unique papers remained. Abstract screening excluded 4,590 as irrelevant, leaving 274 for full-text evaluation. 
In total, 27 papers met the inclusion criteria, of which 17 focused on general data warehouses and 10 on specialised data warehouses.<\/jats:p><jats:p>General data warehouses were found to be primarily used to address data integration issues, particularly for electronic health record (EHR)\/electronic medical record (EMR) and general clinical data. These warehouses typically use a star schema architecture with online analytical processing (OLAP) and query analysis capabilities. In contrast, specialised data warehouses were focused on improving the quality of decision support by handling a wide range of data specific to diseases, using specialised architectures and advanced artificial intelligence (AI) capabilities to address the unique and complex challenges associated with these tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>General-purpose data warehouses effectively integrate disparate data sources to provide a comprehensive view of disease management, patient care, and resource management. However, their flexibility and analytical capabilities need improvement. In contrast, specialised data warehouses are gaining popularity for their focus on specific diseases or research purposes, using advanced tools such as data mining and AI for superior analytical performance. Despite their innovative designs, these specialised warehouses face scalability challenges due to their customised nature. 
Addressing these challenges with advanced analytics and flexible architectures is critical.<\/jats:p><\/jats:sec>","DOI":"10.3389\/fdgth.2025.1599514","type":"journal-article","created":{"date-parts":[[2025,6,5]],"date-time":"2025-06-05T05:27:33Z","timestamp":1749101253000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["The development and use of data warehousing in clinical settings: a scoping review"],"prefix":"10.3389","volume":"7","author":[{"given":"Shiyang","family":"Lyu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Simon","family":"Craig","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gerard","family":"O'Reilly","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Taniar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2025,6,5]]},"reference":[{"key":"B1","first-page":"579","article-title":"\u201cIn conferences, everyone goes \u2018health data is the future\u2019\u201d: an interview study on challenges in re-using EHR data for research in clinical data warehouses","volume":"2023","author":"Priou","year":"2024","journal-title":"AMIA Annu Symp Proc"},{"key":"B2","doi-asserted-by":"publisher","first-page":"38","DOI":"10.5334\/egems.298","article-title":"Improving a secondary use health data warehouse: proposing a multi-level data quality framework","volume":"7","author":"Henley-Smith","year":"2019","journal-title":"eGEMs"},{"key":"B3","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1136\/amiajnl-2011-000339","article-title":"A dimensional bus model for integrating clinical and research data","volume":"18","author":"Wade","year":"2014","journal-title":"J Am Med Inform 
Assoc"},{"key":"B4","doi-asserted-by":"publisher","first-page":"4661","DOI":"10.19082\/4661","article-title":"Decision support system for health care resources allocation","volume":"9","author":"Sebaa","year":"2017","journal-title":"Electron Physician"},{"key":"B5","first-page":"419","article-title":"Agile natural language processing model for pathology knowledge extraction and integration with clinical enterprise data warehouse","author":"Baghal","year":"2019"},{"key":"B6","doi-asserted-by":"publisher","first-page":"303","DOI":"10.4258\/hir.2020.26.4.303","article-title":"Building a lung and ovarian cancer data warehouse","volume":"26","author":"Atay","year":"2020","journal-title":"Healthc Inform Res"},{"key":"B7","doi-asserted-by":"publisher","first-page":"e67","DOI":"10.1097\/mlr.0b013e31824def85","article-title":"Validation of electronic data on chemotherapy and hormone therapy use in HMOs","volume":"51","author":"Ritzwoller","year":"2014","journal-title":"Med Care"},{"key":"B8","doi-asserted-by":"publisher","first-page":"e15728","DOI":"10.1016\/j.heliyon.2023.e15728","article-title":"A non-intrusive and reactive architecture to support real-time ETL processes in data warehousing environments","volume":"9","author":"De Assis Vilela","year":"2023","journal-title":"Heliyon"},{"key":"B9","doi-asserted-by":"publisher","first-page":"467","DOI":"10.7326\/M18-0850","article-title":"PRISMA Extension for scoping reviews (PRISMA-ScR): checklist and explanation","volume":"169","author":"Tricco","year":"2018","journal-title":"Ann Intern Med"},{"key":"B10","volume-title":"PyMed: Python Package for Medical Records","year":""},{"key":"B11","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1177\/2150131913495243","article-title":"Using electronic medical record data to characterize the level of medication use by age-groups in a network of primary care clinics","volume":"4","author":"Freund","year":"2014","journal-title":"J Prim Care Community 
Health"},{"key":"B12","doi-asserted-by":"crossref","DOI":"10.1109\/BHI.2016.7455821","article-title":"Data security and privacy management in healthcare applications and clinical data warehouse environment","author":"Puppala","year":"2016"},{"key":"B13","first-page":"2612","article-title":"Evaluation of data quality of multisite electronic health record data for secondary analysis","author":"Nobles","year":"2015"},{"key":"B14","first-page":"1038","article-title":"Flexible data warehouse: towards building an integrated electronic health record architecture","author":"Neamah","year":"2020"},{"key":"B15","doi-asserted-by":"publisher","first-page":"e61708","DOI":"10.5210\/ojphi.v7i3.6047","article-title":"Data lakes and data visualization: an innovative approach to address the challenges of access to health care in Mississippi","volume":"7","author":"Krause","year":"2015","journal-title":"Online J Public Health Inform"},{"key":"B16","first-page":"1","article-title":"Towards development of health data warehouse: Bangladesh perspective","author":"Khan","year":"2015"},{"key":"B17","first-page":"94","article-title":"Improving patient care through analytics","author":"McGlothlin","year":"2016"},{"key":"B18","first-page":"531","article-title":"Integrating data from EHRs to enhance clinical decision making: the inflammatory bowel disease case","author":"Abouzahra","year":"2014"},{"key":"B19","doi-asserted-by":"publisher","first-page":"109","DOI":"10.4258\/hir.2014.20.2.109","article-title":"Characteristics desired in clinical data warehouse for biomedical research","volume":"20","author":"Shin","year":"2014","journal-title":"Healthc Inform Res"},{"key":"B20","first-page":"922","article-title":"Data behind the walls\u2014an advanced architecture for data privacy management","author":"Faridoon","year":"2022"},{"key":"B21","first-page":"144","article-title":"Dimensional modeling of medical data warehouse based on 
ontology","author":"Ren","year":"2018"},{"key":"B22","doi-asserted-by":"publisher","first-page":"380","DOI":"10.1097\/HCM.0000000000000113","article-title":"Clinical data warehouse: an effective tool to create intelligence in disease management","volume":"36","author":"Karami","year":"2017","journal-title":"Health Care Manag (Frederick)"},{"key":"B23","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1186\/1472-6947-12-45","article-title":"An electronic health record-enabled obesity database","volume":"12","author":"Wood","year":"2016","journal-title":"BMC Med Inform Decis Mak"},{"key":"B24","first-page":"413","article-title":"Leveraging graph models to design acute kidney injury disease research data warehouse","author":"Baghal","year":"2019"},{"key":"B25","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1016\/j.compbiomed.2015.08.019","article-title":"A similarity-based data warehousing environment for medical images","volume":"66","author":"Teixeira","year":"2015","journal-title":"Comput Biol Med"},{"key":"B26","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.3233\/SHTI220260","article-title":"COVID-19 geographical maps and clinical data warehouse PREDIMED","volume-title":"MEDINFO 2021: One World, One Health\u2014Global Partnership for Digital Innovation","author":"Artemova","year":"2022"},{"key":"B27","doi-asserted-by":"publisher","first-page":"5596","DOI":"10.3390\/ijerph17155596","article-title":"COVID-warehouse: a data warehouse of Italian COVID-19, pollution, and climate data","volume":"17","author":"Agapito","year":"2020","journal-title":"Int J Environ Res Public Health"},{"key":"B28","first-page":"24","article-title":"A visual approach for analyzing readmissions in intensive care medicine","volume-title":"Proceedings of the 2020 Workshop on Visual Analytics in Healthcare (VAHC); 2020 Oct 25; Chicago, IL, United 
States","author":"Scheer","year":"2020"},{"key":"B29","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1177\/0009922819834278","article-title":"Antibiotic prescribing in outpatient children: a cohort from a clinical data warehouse","volume":"58","author":"Grammatico-Guillon","year":"2019","journal-title":"Clin Pediatr (Phila)"},{"key":"B30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.47912\/jscdm.320","article-title":"Clinical data warehousing: a scoping review","volume":"4","author":"Wang","year":"2024","journal-title":"J Soc Clin Data Manag"},{"key":"B31","doi-asserted-by":"publisher","first-page":"104834","DOI":"10.1016\/j.ijmedinf.2022.104834","article-title":"A scoping review of semantic integration of health data and information","volume":"165","author":"Zhang","year":"2022","journal-title":"Int J Med Inform"},{"key":"B32","doi-asserted-by":"publisher","first-page":"e56686","DOI":"10.2196\/56686","article-title":"Integrated real-world data warehouses across 7 evolving Asian health care systems: scoping review","volume":"26","author":"Shau","year":"2024","journal-title":"J Med Internet Res"},{"key":"B33","volume-title":"Digital Health","year":"2025"},{"key":"B34","doi-asserted-by":"publisher","first-page":"e50216","DOI":"10.2196\/50216","article-title":"A framework to guide implementation of AI in health care: protocol for a cocreation research project","volume":"12","author":"Nilsen","year":"2023","journal-title":"JMIR Res Protoc"},{"key":"B35","volume-title":"EU Artificial Intelligence Act","year":"2024"},{"key":"B36","volume-title":"Interoperability","year":"2025"},{"key":"B37","doi-asserted-by":"publisher","first-page":"1302","DOI":"10.1016\/j.procs.2023.10.118","article-title":"Data lakes in healthcare: applications and benefits from the perspective of data sources and players","volume":"225","author":"Gentner","year":"2023","journal-title":"Procedia Comput Sci"},{"key":"B38","first-page":"33","article-title":"Data lake-an optimum 
solution for storage and analytics of big data in cardiovascular disease prediction system","volume":"21","author":"Maini","year":"2018","journal-title":"Int J Comput Eng Manag (IJCEM)"},{"key":"B39","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1053\/j.akdh.2022.11.007","article-title":"Federated learning in health care using structured medical data","volume":"30","author":"Oh","year":"2023","journal-title":"Adv Kidney Dis Health"},{"key":"B40","doi-asserted-by":"publisher","first-page":"730","DOI":"10.1136\/amiajnl-2013-002370","article-title":"Clinical research data warehouse governance for distributed research networks in the USA: a systematic review of the literature","volume":"21","author":"Holmes","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"B41","doi-asserted-by":"publisher","first-page":"671","DOI":"10.1093\/jamia\/ocab256","article-title":"Understanding enterprise data warehouses to support clinical and translational research: enterprise information technology relationships, data governance, workforce, and cloud computing","volume":"29","author":"Knosp","year":"2022","journal-title":"J Am Med Inform Assoc"},{"key":"B42","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1080\/13600834.2019.1644068","article-title":"Privacy policies, cross-border health data and the GDPR","volume":"28","author":"Mulder","year":"2019","journal-title":"Inf Commun Technol L"}],"container-title":["Frontiers in Digital 
Health"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1599514\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,5]],"date-time":"2025-06-05T05:27:35Z","timestamp":1749101255000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1599514\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,5]]},"references-count":42,"alternative-id":["10.3389\/fdgth.2025.1599514"],"URL":"https:\/\/doi.org\/10.3389\/fdgth.2025.1599514","relation":{},"ISSN":["2673-253X"],"issn-type":[{"value":"2673-253X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,5]]},"article-number":"1599514"}}