{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T03:04:50Z","timestamp":1765422290079,"version":"3.38.0"},"reference-count":46,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2023,2,17]],"date-time":"2023-02-17T00:00:00Z","timestamp":1676592000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["HIM J"],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:sec><jats:title>Background<\/jats:title><jats:p> Quantifying and dealing with lack of consistency in administrative databases (namely, under-coding) requires tracking patients longitudinally without compromising anonymity, which is often a challenging task. <\/jats:p><\/jats:sec><jats:sec><jats:title>Objective<\/jats:title><jats:p> This study aimed to (i) assess and compare different hierarchical clustering methods on the identification of individual patients in an administrative database that does not easily allow tracking of episodes from the same patient; (ii) quantify the frequency of potential under-coding; and (iii) identify factors associated with such phenomena. <\/jats:p><\/jats:sec><jats:sec><jats:title>Method<\/jats:title><jats:p> We analysed the Portuguese National Hospital Morbidity Dataset, an administrative database registering all hospitalisations occurring in Mainland Portugal between 2011\u20132015. We applied different approaches of hierarchical clustering methods (either isolated or combined with partitional clustering methods), to identify potential individual patients based on demographic variables and comorbidities. Diagnosis codes were grouped into the Charlson and Elixhauser comorbidity defined groups. The algorithm displaying the best performance was used to quantify potential under-coding.
A generalised mixed model (GML) of binomial regression was applied to assess factors associated with such potential under-coding. <\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p> We observed that the hierarchical cluster analysis (HCA) + k-means clustering method with comorbidities grouped according to the Charlson defined groups was the algorithm displaying the best performance (with a Rand Index of 0.99997). We identified potential under-coding in all Charlson comorbidity groups, ranging from 3.5% (overall diabetes) to 27.7% (asthma). Overall, being male, having a medical admission, dying during hospitalisation or being admitted at more specific and complex hospitals were associated with increased odds of potential under-coding. <\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p> We assessed several approaches to identify individual patients in an administrative database and, subsequently, by applying the HCA + k-means algorithm, we tracked coding inconsistency and potentially improved data quality. We reported consistent potential under-coding in all defined groups of comorbidities and potential factors associated with such lack of completeness. <\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p> Our proposed methodological framework could both enhance data quality and act as a reference for other studies relying on databases with similar problems.
<\/jats:p><\/jats:sec>","DOI":"10.1177\/18333583221144663","type":"journal-article","created":{"date-parts":[[2023,2,18]],"date-time":"2023-02-18T06:16:16Z","timestamp":1676700976000},"page":"174-182","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["Unsupervised algorithms to identify potential under-coding of secondary diagnoses in hospitalisations databases in Portugal"],"prefix":"10.1177","volume":"53","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7913-7461","authenticated-orcid":false,"given":"Diana","family":"Portela","sequence":"first","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"ACES Entre o Douro e Vouga I - Feira\/Arouca, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0233-830X","authenticated-orcid":false,"given":"Rita","family":"Amaral","sequence":"additional","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, Portugal"},{"name":"ESS, IPP - Porto Health School, Polytechnic Institute of Porto, Portugal"}]},{"given":"Pedro P","family":"Rodrigues","sequence":"additional","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, 
Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2113-9653","authenticated-orcid":false,"given":"Alberto","family":"Freitas","sequence":"additional","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, Portugal"}]},{"given":"El\u00edsio","family":"Costa","sequence":"additional","affiliation":[{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Research Unit on Applied Molecular Biosciences (UCIBIO\u2014REQUIMTE), Faculty of Pharmacy, University of Porto, Portugal"}]},{"given":"Jo\u00e3o A","family":"Fonseca","sequence":"additional","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, Portugal"}]},{"given":"Bernardo","family":"Sousa-Pinto","sequence":"additional","affiliation":[{"name":"Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, Portugal"},{"name":"Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine, University of Porto, 
Portugal"}]}],"member":"179","published-online":{"date-parts":[[2023,2,17]]},"reference":[{"key":"bibr1-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/s10916-020-1532-x"},{"key":"bibr2-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/1833358319826351"},{"key":"bibr3-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1145\/565117.565130"},{"key":"bibr4-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/070674370705200203"},{"key":"bibr5-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/1833358319897928"},{"volume-title":"ICD-9-CM Official Guidelines for Coding and Reporting","year":"1991","author":"CDC","key":"bibr6-18333583221144663"},{"key":"bibr7-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9681(87)90171-8"},{"key":"bibr8-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1016\/0895-4356(92)90133-8"},{"key":"bibr9-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1002\/art.21440"},{"key":"bibr10-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1097\/00005650-199801000-00004"},{"key":"bibr11-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28696-4_8"},{"key":"bibr12-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-31307-8_63"},{"key":"bibr13-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1093\/qjmed\/hcr070"},{"volume-title":"The Epidemiology and Outcomes of Critical Illness in Manitoba","year":"2012","author":"Garland 
A","key":"bibr14-18333583221144663"},{"key":"bibr15-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2004.06.017"},{"key":"bibr16-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1186\/s13690-021-00687-0"},{"key":"bibr17-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/2053951717745678"},{"key":"bibr18-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1136\/bmjhci-2019-000016"},{"key":"bibr19-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1145\/331499.331504"},{"key":"bibr20-18333583221144663","first-page":"619","volume":"3","author":"Johnston M","year":"2014","journal-title":"Qualitative and Quantitative Methods in Libraries"},{"key":"bibr21-18333583221144663","doi-asserted-by":"publisher","DOI":"10.23889\/ijpds.v3i1.446"},{"key":"bibr22-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/s003579900043"},{"key":"bibr23-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-med-022613-090415"},{"key":"bibr24-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocz173"},{"key":"bibr25-18333583221144663","doi-asserted-by":"publisher","DOI":"10.3390\/e22121391"},{"key":"bibr26-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1186\/s12913-016-1489-0"},{"key":"bibr27-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2017-020824"},{"key":"bibr28-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1093\/ije\/dyy134"},{"key":"bibr29-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1136\/bmjopen-2011-000723"},{"key":"bibr30-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/1460458216647089"},{"key":"bibr31-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1111\/j.1475-6773.2007.00822.x"},{"key":"bibr32-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1097\/00005650-200208000-00007"},{"key":"bibr33-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1186\/2047-2501-2-3
"},{"key":"bibr34-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1186\/s12890-018-0684-7"},{"key":"bibr35-18333583221144663","doi-asserted-by":"publisher","DOI":"10.14236\/jhi.v17i2.723"},{"volume-title":"Modern Epidemiology","year":"2015","author":"Rothman KJ","key":"bibr36-18333583221144663"},{"key":"bibr37-18333583221144663","doi-asserted-by":"publisher","DOI":"10.3390\/bs9120122"},{"key":"bibr38-18333583221144663","unstructured":"Saude MD (2014) Portaria n.\u00b0 82\/2014 de 10 de abril. Di\u00e1rio da Rep\u00fablica."},{"key":"bibr39-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1016\/j.anai.2017.11.022"},{"key":"bibr40-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/s11219-020-09504-3"},{"key":"bibr41-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1177\/1833358319840575"},{"key":"bibr42-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1016\/j.burns.2018.09.013"},{"key":"bibr43-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1186\/s12963-016-0115-z"},{"key":"bibr44-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1963.10500845"},{"key":"bibr45-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1161\/CIRCINTERVENTIONS.120.009447"},{"key":"bibr46-18333583221144663","doi-asserted-by":"publisher","DOI":"10.1007\/s11606-018-4760-8"}],"container-title":["Health Information Management 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/18333583221144663","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/18333583221144663","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/18333583221144663","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T04:35:59Z","timestamp":1740976559000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/18333583221144663"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,17]]},"references-count":46,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["10.1177\/18333583221144663"],"URL":"https:\/\/doi.org\/10.1177\/18333583221144663","relation":{},"ISSN":["1833-3583","1833-3575"],"issn-type":[{"type":"print","value":"1833-3583"},{"type":"electronic","value":"1833-3575"}],"subject":[],"published":{"date-parts":[[2023,2,17]]}}}