{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"institution":[{"name":"medRxiv"}],"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T16:43:29Z","timestamp":1772297009702,"version":"3.50.1"},"posted":{"date-parts":[[2020,11,5]]},"group-title":"Epidemiology","reference-count":20,"publisher":"openRxiv","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"accepted":{"date-parts":[[2020,11,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                <jats:sec>\n                  <jats:title>Background<\/jats:title>\n                  <jats:p>High-quality data is crucial for guiding decision making and practicing evidence-based healthcare, especially if previous knowledge is lacking. Nevertheless, data quality frailties have been exposed worldwide during the current COVID-19 pandemic. Focusing on a major Portuguese surveillance dataset, our study aims to assess data quality issues and suggest possible solutions.<\/jats:p>\n                <\/jats:sec>\n                <jats:sec>\n                  <jats:title>Methods<\/jats:title>\n                  <jats:p>\n                    On April 27\n                    <jats:sup>th<\/jats:sup>\n                    2020, the Portuguese Directorate-General of Health (DGS) made available a dataset (DGSApril) for researchers, upon request. On August 4\n                    <jats:sup>th<\/jats:sup>\n                    , an updated dataset (DGSAugust) was also obtained. The quality of data was assessed through analysis of data completeness and consistency between both datasets.\n                  <\/jats:p>\n                <\/jats:sec>\n                <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>DGSAugust has not followed the data format and variables as DGSApril and a significant number of missing data and inconsistencies were found (e.g. 
4,075 cases from the DGSApril were apparently not included in DGSAugust). Several variables also showed a low degree of completeness and\/or changed their values from one dataset to another (e.g. the variable \u2018underlying conditions\u2019 had more than half of cases showing different information between datasets). There were also significant inconsistencies between the number of cases and deaths due to COVID-19 shown in DGSAugust and by the DGS reports publicly provided daily.<\/jats:p>\n                <\/jats:sec>\n                <jats:sec>\n                  <jats:title>Conclusions<\/jats:title>\n                  <jats:p>The low quality of COVID-19 surveillance datasets limits its usability to inform good decisions and perform useful research. Major improvements in surveillance datasets are therefore urgently needed - e.g. simplification of data entry processes, constant monitoring of data, and increased training and awareness of health care providers - as low data quality may lead to a deficient pandemic control.<\/jats:p>\n                <\/jats:sec>","DOI":"10.1101\/2020.11.03.20225565","type":"posted-content","created":{"date-parts":[[2020,11,5]],"date-time":"2020-11-05T11:20:42Z","timestamp":1604575242000},"source":"Crossref","is-referenced-by-count":4,"title":["COVID-19 surveillance - a descriptive study on data quality issues"],"prefix":"10.64898","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7109-1101","authenticated-orcid":false,"given":"Cristina","family":"Costa-Santos","sequence":"first","affiliation":[]},{"given":"Ana","family":"Lu\u00edsa 
Neves","sequence":"additional","affiliation":[]},{"given":"Ricardo","family":"Correia","sequence":"additional","affiliation":[]},{"given":"Paulo","family":"Santos","sequence":"additional","affiliation":[]},{"given":"Matilde","family":"Monteiro-Soares","sequence":"additional","affiliation":[]},{"given":"Alberto","family":"Freitas","sequence":"additional","affiliation":[]},{"given":"In\u00eas","family":"Ribeiro-Vaz","sequence":"additional","affiliation":[]},{"given":"Teresa","family":"Henriques","sequence":"additional","affiliation":[]},{"given":"Pedro Pereira","family":"Rodrigues","sequence":"additional","affiliation":[]},{"given":"Altamiro","family":"Costa-Pereira","sequence":"additional","affiliation":[]},{"given":"Ana Margarida","family":"Pereira","sequence":"additional","affiliation":[]},{"given":"Jo\u00e3o","family":"Fonseca","sequence":"additional","affiliation":[]}],"member":"54368","reference":[{"issue":"1776","key":"2020110812301447000_2020.11.03.20225565v1.1","doi-asserted-by":"crossref","first-page":"20180365","DOI":"10.1098\/rstb.2018.0365","article-title":"How decision makers can use quantitative approaches to guide outbreak responses","volume":"374","year":"2019","journal-title":"Philosophical Transactions of the Royal Society B"},{"key":"2020110812301447000_2020.11.03.20225565v1.2","doi-asserted-by":"publisher","DOI":"10.1016\/S1473-3099(20)30119-5"},{"key":"2020110812301447000_2020.11.03.20225565v1.3","doi-asserted-by":"publisher","DOI":"10.1038\/518477a"},{"key":"2020110812301447000_2020.11.03.20225565v1.4","unstructured":"German, R. R. , Horan, J. M. , Lee, L. M. , Milstein, B. , & Pertowski, C. A. 
(2001).Updated guidelines for evaluating public health surveillance systems; recommendations from the Guidelines Working Group."},{"issue":"1","key":"2020110812301447000_2020.11.03.20225565v1.5","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1177\/1833358319826351","article-title":"Health records as the basis of clinical coding: Is the quality adequate? A qualitative study of medical coders\u2019 perceptions","volume":"49","year":"2020","journal-title":"Health Information Management Journal"},{"issue":"5","key":"2020110812301447000_2020.11.03.20225565v1.6","doi-asserted-by":"crossref","first-page":"5170","DOI":"10.3390\/ijerph110505170","article-title":"A review of data quality assessment methods for public health information systems","volume":"11","year":"2014","journal-title":"International journal of environmental research and public health"},{"key":"2020110812301447000_2020.11.03.20225565v1.7","unstructured":"Ashofteh, A. , & Bravo, J. M. A study on the quality of novel coronavirus (COVID-19) official datasets. Statistical Journal of the IAOS, (Preprint), 1\u201311."},{"key":"2020110812301447000_2020.11.03.20225565v1.8","first-page":"81","article-title":"Methodological challenges of analysing COVID-19 data during the pandemic","volume":"20","year":"2020","journal-title":"BMC Medical Research Methodology. 2020"},{"key":"2020110812301447000_2020.11.03.20225565v1.9","unstructured":"Dire\u00e7\u00e3o Geral da Sa\u00fade. Comunicado: Casos de infe\u00e7\u00e3o por novo Coronav\u00edrus (COVID-19) 2020. Available online: https:\/\/covid19.min-saude.pt\/wp-content\/uploads\/2020\/03\/Atualiza%C3%A7%C3%A3o-de-02032020-1728.pdf (accessed on 17 August 2020)"},{"key":"2020110812301447000_2020.11.03.20225565v1.10","unstructured":"Carta aberta ao Conselho Nacional de Sa\u00fade P\u00fablica: um contributo pessoal acerca da epidemia de Covid-19, em Portugal. March 2020. 
Available online: https:\/\/sigarra.up.pt\/fmup\/pt\/noticias_geral.noticias_cont?p_id=F307210300\/CartaAberta_COVID19_11.03.2020_.pdf (accessed on 17 August 2020)."},{"key":"2020110812301447000_2020.11.03.20225565v1.11","unstructured":"Dire\u00e7\u00e3o Geral da Sa\u00fade. COVID-19: Disponibiliza\u00e7\u00e3o de Dados. 2020. Available online: https:\/\/covid19.min-saude.pt\/disponibilizacao-de-dados\/ (accessed on 11 August 2020)."},{"key":"2020110812301447000_2020.11.03.20225565v1.12","unstructured":"Dire\u00e7\u00e3o Geral da Sa\u00fade. 2020. COVID metadata. Available online: https:\/\/covid19.min-saude.pt\/wp-content\/uploads\/2020\/04\/PT_COVID19_metadata-1.pdf (accessed on 11 August 2020)."},{"key":"2020110812301447000_2020.11.03.20225565v1.13","unstructured":"Dire\u00e7\u00e3o Geral da Sa\u00fade. 2020. Relat\u00f3rio de Situa\u00e7\u00e3o - Informa\u00e7\u00e3o publicada diariamente. Available online: https:\/\/covid19.min-saude.pt\/relatorio-de-situacao\/ (accessed on 11 August 2020)."},{"issue":"8","key":"2020110812301447000_2020.11.03.20225565v1.14","doi-asserted-by":"crossref","first-page":"2368","DOI":"10.3390\/jcm9082368","article-title":"The Role of Health Preconditions on COVID-19 Deaths in Portugal: Evidence from Surveillance Data of the First 20293 Infection Cases","volume":"9","year":"2020","journal-title":"Journal of Clinical Medicine"},{"key":"2020110812301447000_2020.11.03.20225565v1.15","unstructured":"Ricotta Peixoto, V ; Viera, A ; Aguar, P ; Sousa, P ; Carvalho, C ; Rhys, D ; Abrantes, A ; Nunes, C. (2020).COVID-19: Determinants of Hospitalization, ICU and Death among 20,293 reported cases in Portugal. medRxiv.2020.05.29.20115824"},{"key":"2020110812301447000_2020.11.03.20225565v1.16","doi-asserted-by":"crossref","unstructured":"Froes, M. T. , Neves, B. D. , Martins, B. , & Silva, M. J. (2020).Comparison of Multimorbidity in COVID-19 infected and general population in Portugal. medRxiv. 
2020.07.02.20144378","DOI":"10.1101\/2020.07.02.20144378"},{"key":"2020110812301447000_2020.11.03.20225565v1.17","first-page":"3442","article-title":"The Hidden Factor\u2014Low Quality of Data is a Major Peril in the Identification of Risk Factors for COVID-19 Deaths: A Comment on Nogueira, P.J., et al. \u201cThe Role of Health Preconditions on COVID-19 Deaths in Portugal: Evidence from Surveillance Data of the First 20293 Infection Cases\u201d","volume":"9","year":"2020","journal-title":"J. Clin. Med. 2020, 9, 2368. J. Clin. Med"},{"key":"2020110812301447000_2020.11.03.20225565v1.18","unstructured":"D\u2019Amore, J. , Bouhaddou, O. , Mitchell, S. , et al (2018). Interoperability Progress and Remaining Data Quality Barriers of Certified Health Information Technologies. AMIA Annu Symp Proc. 2018:358\u2013367."},{"issue":"5","key":"2020110812301447000_2020.11.03.20225565v1.19","doi-asserted-by":"crossref","first-page":"5170","DOI":"10.3390\/ijerph110505170","article-title":"A review of data quality assessment methods for public health information systems","volume":"11","year":"2014","journal-title":"International journal of environmental research and public health"},{"key":"2020110812301447000_2020.11.03.20225565v1.20","unstructured":"IOM Roundtable on Value & Science-Driven Care; Institute of Medicine. Integrating Research and Practice: Health System Leaders Working Toward High-Value Care: Workshop Summary. Washington (DC): National Academies Press (US); 2015 Mar 4. 3, Continuously Learning Health Care: The Value Proposition. 
Available from: https:\/\/www.ncbi.nlm.nih.gov\/books\/NBK284656\/"}],"container-title":[],"original-title":[],"link":[{"URL":"https:\/\/syndication.highwire.org\/content\/doi\/10.1101\/2020.11.03.20225565","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T13:58:05Z","timestamp":1768485485000},"score":1,"resource":{"primary":{"URL":"http:\/\/medrxiv.org\/lookup\/doi\/10.1101\/2020.11.03.20225565"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,5]]},"references-count":20,"URL":"https:\/\/doi.org\/10.1101\/2020.11.03.20225565","relation":{},"subject":[],"published":{"date-parts":[[2020,11,5]]},"subtype":"preprint"}}