{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T09:26:58Z","timestamp":1769851618634,"version":"3.49.0"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,7,27]],"date-time":"2024-07-27T00:00:00Z","timestamp":1722038400000},"content-version":"vor","delay-in-days":26,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Background<\/jats:title>\n                  <jats:p>The increasing prevalence of electronic health records (EHRs) in healthcare systems globally has underscored the importance of data quality for clinical decision-making and research, particularly in obstetrics. High-quality data is vital for an accurate representation of patient populations and to avoid erroneous healthcare decisions. However, existing studies have highlighted significant challenges in EHR data quality, necessitating innovative tools and methodologies for effective data quality assessment and improvement.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>This article addresses the critical need for data quality evaluation in obstetrics by developing a novel tool. 
The tool utilizes Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standards in conjunction with Bayesian networks and expert rules, offering a novel approach to assessing data quality in real-world obstetrics data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Methods<\/jats:title>\n                  <jats:p>A harmonized framework focusing on completeness, plausibility, and conformance underpins our methodology. We employed Bayesian networks for advanced probabilistic modeling, integrated outlier detection methods, and a rule-based system grounded in domain-specific knowledge. The development and validation of the tool were based on obstetrics data from 9 Portuguese hospitals, spanning the years 2019-2020.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The developed tool demonstrated strong potential for identifying data quality issues in obstetrics EHRs. Bayesian networks used in the tool showed high performance for various features with area under the receiver operating characteristic curve (AUROC) between 75% and 97%. The tool\u2019s infrastructure and interoperable format as a FHIR Application Programming Interface (API) enable the possible deployment of real-time data quality assessment in obstetrics settings. 
Our initial assessments show promise: even when compared with physicians\u2019 assessment of real records, the tool can reach an AUROC of 88%, depending on the threshold defined.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>Our results also show that obstetrics clinical records are difficult to assess in terms of quality, and assessments like ours could benefit from more categorical approaches of ranking between bad and good quality.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>This study contributes significantly to the field of EHR data quality assessment, with a specific focus on obstetrics. The combination of HL7-FHIR interoperability, machine learning techniques, and expert knowledge presents a robust, adaptable solution to the challenges of healthcare data quality. Future research should explore tailored data quality evaluations for different healthcare contexts, as well as further validation of the tool\u2019s capabilities, enhancing the tool\u2019s utility across diverse medical domains.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamiaopen\/ooae062","type":"journal-article","created":{"date-parts":[[2024,7,27]],"date-time":"2024-07-27T16:25:39Z","timestamp":1722097539000},"source":"Crossref","is-referenced-by-count":4,"title":["Development and initial validation of a data quality evaluation tool in obstetrics real-world data through HL7-FHIR interoperable Bayesian networks and expert rules"],"prefix":"10.1093","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0882-6547","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Coutinho-Almeida","sequence":"first","affiliation":[{"name":"CINTESIS@RISE\u2014Centre for Health Technologies and Services Research, University of Porto , 4200-319 Porto, Portugal"},{"name":"MEDCIDS\u2014Faculty 
of Medicine of University of Porto , 4200-319 Porto, Portugal"},{"name":"Health Data Science PhD Program, Faculty of Medicine of the University of Porto , 4200-319 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2678-8249","authenticated-orcid":false,"given":"Carlos","family":"Saez","sequence":"additional","affiliation":[{"name":"Instituto Universitario de Aplicaciones de las Tecnolog\u00edas de la Informaci\u00f3n y de las Comunicaciones Avanzadas, Universitat Polit\u00e8cnica de Val\u00e8ncia , 46022 Valencia, Spain"}]},{"given":"Ricardo","family":"Correia","sequence":"additional","affiliation":[{"name":"CINTESIS@RISE\u2014Centre for Health Technologies and Services Research, University of Porto , 4200-319 Porto, Portugal"},{"name":"MEDCIDS\u2014Faculty of Medicine of University of Porto , 4200-319 Porto, Portugal"},{"name":"Health Data Science PhD Program, Faculty of Medicine of the University of Porto , 4200-319 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7867-6682","authenticated-orcid":false,"given":"Pedro Pereira","family":"Rodrigues","sequence":"additional","affiliation":[{"name":"CINTESIS@RISE\u2014Centre for Health Technologies and Services Research, University of Porto , 4200-319 Porto, Portugal"},{"name":"MEDCIDS\u2014Faculty of Medicine of University of Porto , 4200-319 Porto, Portugal"},{"name":"Health Data Science PhD Program, Faculty of Medicine of the University of Porto , 4200-319 Porto, Portugal"}]}],"member":"286","published-online":{"date-parts":[[2024,7,27]]},"reference":[{"issue":"01","key":"2024072716252416300_ooae062-B1","doi-asserted-by":"crossref","first-page":"14","DOI":"10.15265\/IY-2014-0020","article-title":"Big data in medicine is driving big changes","volume":"23","author":"Martin-Sanchez","year":"2014","journal-title":"Yearb Med Inform"},{"issue":"3","key":"2024072716252416300_ooae062-B2","doi-asserted-by":"crossref","first-page":"263","DOI":"10.21815\/JDE.019.034","article-title":"Electronic health 
records and data quality","volume":"83","author":"Walji","year":"2019","journal-title":"J Dent Educ"},{"issue":"5","key":"2024072716252416300_ooae062-B3","doi-asserted-by":"crossref","first-page":"e185","DOI":"10.2196\/jmir.9134","article-title":"Possible sources of bias in primary care electronic health record data use and reuse","volume":"20","author":"Verheij","year":"2018","journal-title":"J Med Internet Res"},{"issue":"3","key":"2024072716252416300_ooae062-B4","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/j.jamcollsurg.2019.12.005","article-title":"Assessing quality of surgical real-world data from an automated electronic health record pipeline","volume":"230","author":"Corey","year":"2020","journal-title":"J Am Coll Surg"},{"issue":"1","key":"2024072716252416300_ooae062-B5","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1080\/24709360.2019.1572344","article-title":"Clinical data quality: a data life cycle perspective","volume":"4","author":"Weng","year":"2020","journal-title":"Biostat Epidemiol"},{"key":"2024072716252416300_ooae062-B6","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1016\/j.ijmedinf.2016.03.006","article-title":"Data quality assessment framework to assess electronic medical record data for use in research","volume":"90","author":"Reimer","year":"2016","journal-title":"Int J Med Inf"},{"issue":"2","key":"2024072716252416300_ooae062-B7","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1055\/s-0039-1681054","article-title":"Impact of electronic versus paper-based recording before EHR implementation on health care professionals\u2019 perceptions of EHR use, data quality, and data reuse","volume":"10","author":"Joukes","year":"2019","journal-title":"Appl Clin Inform"},{"issue":"1","key":"2024072716252416300_ooae062-B8","doi-asserted-by":"crossref","first-page":"24","DOI":"10.13063\/2327-9214.1239","article-title":"Multisite evaluation of a data quality tool for patient-level clinical data 
sets","volume":"4","author":"Huser","year":"2016","journal-title":"eGEMs"},{"issue":"3","key":"2024072716252416300_ooae062-B9","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1093\/jamia\/ocz201","article-title":"Understanding and detecting defects in healthcare administration data: toward higher data quality to better support healthcare operations and decisions","volume":"27","author":"Zhang","year":"2020","journal-title":"J Am Med Inform Assoc"},{"key":"2024072716252416300_ooae062-B10","doi-asserted-by":"crossref","first-page":"106359","DOI":"10.1016\/j.cmpb.2021.106359","article-title":"The impact of data quality defects on clinical decision-making in the intensive care unit","volume":"209","author":"Kramer","year":"2021","journal-title":"Comput Methods Programs Biomed"},{"issue":"1","key":"2024072716252416300_ooae062-B11","doi-asserted-by":"crossref","first-page":"1748","DOI":"10.1186\/s12889-019-8105-2","article-title":"The impact of data quality and source data verification on epidemiologic inference: a practical application using HIV observational data","volume":"19","author":"Giganti","year":"2019","journal-title":"BMC Public Health"},{"issue":"12","key":"2024072716252416300_ooae062-B12","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1093\/jamia\/ocaa245","article-title":"Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data","volume":"27","author":"Bian","year":"2020","journal-title":"J Am Med Inf Assoc"},{"issue":"1","key":"2024072716252416300_ooae062-B13","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1136\/amiajnl-2011-000681","article-title":"Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research","volume":"20","author":"Weiskopf","year":"2013","journal-title":"J Am Med Inf 
Assoc"},{"key":"2024072716252416300_ooae062-B14","first-page":"721","article-title":"Organizing data quality assessment of shifting biomedical data","volume":"180","author":"S\u00e1ez","year":"2012","journal-title":"Stud Health Technol Inform"},{"issue":"1","key":"2024072716252416300_ooae062-B15","doi-asserted-by":"crossref","first-page":"18","DOI":"10.13063\/2327-9214.1244","article-title":"A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data","volume":"4","author":"Kahn","year":"2016","journal-title":"eGEMs"},{"issue":"1","key":"2024072716252416300_ooae062-B16","doi-asserted-by":"crossref","first-page":"10164","DOI":"10.1038\/s41598-020-66925-7","article-title":"Automated data cleaning of paediatric anthropometric data from longitudinal electronic health records: protocol and application to a large patient cohort","volume":"10","author":"Phan","year":"2020","journal-title":"Sci Rep"},{"issue":"7","key":"2024072716252416300_ooae062-B17","doi-asserted-by":"crossref","first-page":"1591","DOI":"10.1093\/jamia\/ocaa340","article-title":"Quality assessment of real-world data repositories across the data life cycle: a literature review","volume":"28","author":"Liaw","year":"2021","journal-title":"J Am Med Inform Assoc"},{"key":"2024072716252416300_ooae062-B18","first-page":"574","article-title":"Observational health data sciences and informatics (OHDSI): opportunities for observational researchers","volume":"216","author":"Hripcsak","year":"2015","journal-title":"Stud Health Technol Inform"},{"key":"2024072716252416300_ooae062-B19","doi-asserted-by":"crossref","first-page":"104824","DOI":"10.1016\/j.cmpb.2018.12.029","article-title":"TAQIH, a tool for tabular data quality assessment and improvement in the context of health data","volume":"181","author":"\u00c1lvarez","year":"2019","journal-title":"Comput Methods Programs 
Biomed"},{"issue":"1","key":"2024072716252416300_ooae062-B20","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1186\/s12874-021-01252-7","article-title":"Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R","volume":"21","author":"Schmidt","year":"2021","journal-title":"BMC Med Res Methodol"},{"issue":"1","key":"2024072716252416300_ooae062-B21","doi-asserted-by":"crossref","first-page":"e10264","DOI":"10.1002\/lrh2.10264","article-title":"Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use","volume":"6","author":"Razzaghi","year":"2022","journal-title":"Learn Health Syst"},{"key":"2024072716252416300_ooae062-B22","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/j.cmpb.2019.05.017","article-title":"Towards a content agnostic computable knowledge repository for data quality assessment","volume":"177","author":"Rajan","year":"2019","journal-title":"Comput Methods Programs Biomed"},{"issue":"4","key":"2024072716252416300_ooae062-B23","doi-asserted-by":"crossref","first-page":"826","DOI":"10.1055\/s-0041-1733847","article-title":"Linking a consortium-wide data quality assessment tool with the MIRACUM metadata repository","volume":"12","author":"Kapsner","year":"2021","journal-title":"Appl Clin Inform"},{"key":"2024072716252416300_ooae062-B24","doi-asserted-by":"crossref","first-page":"104830","DOI":"10.1016\/j.cmpb.2019.01.002","article-title":"Semi-supervised encoding for outlier detection in clinical observation data","volume":"181","author":"Estiri","year":"2019","journal-title":"Comput Methods Programs Biomed"},{"issue":"8","key":"2024072716252416300_ooae062-B25","doi-asserted-by":"crossref","first-page":"giaa079","DOI":"10.1093\/gigascience\/giaa079","article-title":"EHR temporal variability: delineating temporal data-set shifts in electronic health 
records","volume":"9","author":"S\u00e1ez","year":"2020","journal-title":"Gigascience"},{"key":"2024072716252416300_ooae062-B26","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1016\/j.compbiomed.2015.09.024","article-title":"Construction of quality-assured infant feeding process of care data repositories: definition and design (part 1)","volume":"67","author":"Garc\u00eda-de-Le\u00f3n-Chocano","year":"2015","journal-title":"Comput Biol Med"},{"key":"2024072716252416300_ooae062-B27","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1016\/j.compbiomed.2016.01.007","article-title":"Construction of quality-assured infant feeding process of care data repositories: construction of the perinatal repository (part 2)","volume":"71","author":"Garc\u00eda-de-Le\u00f3n-Chocano","year":"2016","journal-title":"Comput Biol Med"},{"key":"2024072716252416300_ooae062-B28","first-page":"539","author":"S\u00e1ez","year":"2017"},{"issue":"2","key":"2024072716252416300_ooae062-B29","doi-asserted-by":"crossref","first-page":"e0171784","DOI":"10.1371\/journal.pone.0171784","article-title":"rEHR: an R package for manipulating and analysing electronic health record data","volume":"12","author":"Springate","year":"2017","journal-title":"PLoS One"},{"key":"2024072716252416300_ooae062-B30","volume-title":"Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference","author":"Pearl","year":"1988"},{"key":"2024072716252416300_ooae062-B31","author":"Ankan","year":"2015"},{"key":"2024072716252416300_ooae062-B32","author":"Cortes","year":"2020"},{"key":"2024072716252416300_ooae062-B33"},{"key":"2024072716252416300_ooae062-B34","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2024072716252416300_ooae062-B35","author":"Almeida"}],"container-title":["JAMIA 
Open"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamiaopen\/article-pdf\/7\/3\/ooae062\/58667240\/ooae062.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamiaopen\/article-pdf\/7\/3\/ooae062\/58667240\/ooae062.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,27]],"date-time":"2024-07-27T16:25:56Z","timestamp":1722097556000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamiaopen\/article\/doi\/10.1093\/jamiaopen\/ooae062\/7721902"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,1]]},"references-count":35,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamiaopen\/ooae062","relation":{},"ISSN":["2574-2531"],"issn-type":[{"value":"2574-2531","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,10]]},"published":{"date-parts":[[2024,7,1]]},"article-number":"ooae062"}}