{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T09:23:27Z","timestamp":1772616207216,"version":"3.50.1"},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,7,25]],"date-time":"2020-07-25T00:00:00Z","timestamp":1595635200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,25]],"date-time":"2020-07-25T00:00:00Z","timestamp":1595635200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010269","name":"Wellcome Trust","doi-asserted-by":"crossref","award":["202912 \/ Z \/ 16 \/ Z"],"award-info":[{"award-number":["202912 \/ Z \/ 16 \/ Z"]}],"id":[{"id":"10.13039\/100010269","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100006181","name":"FAPESB","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006181","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100004440","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["213589\/Z\/18\/Z"],"award-info":[{"award-number":["213589\/Z\/18\/Z"]}],"id":[{"id":"10.13039\/100004440","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Research using linked routine population-based data collected for non-research purposes has increased in recent years because they are a rich and detailed source of information. 
The objective of this study is to present an approach to prepare and link data from administrative sources in a middle-income country, to estimate its quality and to identify potential sources of bias by comparing linked and non-linked\u00a0individuals.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>We linked two Brazilian administrative datasets covering the period 2001 to 2015: the live birth information system and the 100 Million Brazilian Cohort (created using administrative records from over 114 million individuals whose families applied for social assistance via the Unified Register for Social Programmes). Linkage used maternal attributes (name, age, date of birth, and municipality of residence) and was performed with CIDACS-RL, a linkage tool developed in-house. We then estimated the proportion of highly probable links and examined the characteristics of missed matches to identify any potential source of bias.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>A total of 27,699,891 live births were submitted to linkage with maternal information recorded in the baseline of the 100 Million Brazilian Cohort dataset; of those, 16,447,414 (59.4%) children were found registered in the 100 Million Brazilian Cohort dataset. The proportion of highly probable links ranged from 39.3% in 2001 to 82.1% in 2014. A substantial improvement in the linkage was observed after the introduction of the maternal date of birth attribute in 2011. 
Our analyses indicated a slightly higher proportion of missing data among missed matches and a higher proportion of people living in an urban area and self-declared as Caucasian among linked pairs when compared with non-linked sets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>We demonstrated that CIDACS-RL is capable of performing high-quality linkage even with a limited number of common attributes, using indexation as a blocking strategy in large routine databases from a middle-income country. However, residual records occurred more among people under worse living conditions. The results presented in this study reinforce the need to evaluate linkage quality and, when necessary, to take linkage error into account in the analyses of any generated dataset.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12911-020-01192-0","type":"journal-article","created":{"date-parts":[[2020,7,25]],"date-time":"2020-07-25T08:02:34Z","timestamp":1595664154000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Examining the quality of record linkage process using nationwide Brazilian administrative databases to build a large birth cohort"],"prefix":"10.1186","volume":"20","author":[{"given":"Daniela","family":"Almeida","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Gorender","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maria 
Yury","family":"Ichihara","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Samila","family":"Sena","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luan","family":"Menezes","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"George C. G.","family":"Barbosa","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rosimeire L.","family":"Fiaccone","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4797-908X","authenticated-orcid":false,"given":"Enny S.","family":"Paix\u00e3o","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robespierre","family":"Pita","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mauricio L.","family":"Barreto","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,7,25]]},"reference":[{"key":"1192_CR1","doi-asserted-by":"publisher","unstructured":"Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016. https:\/\/doi.org\/10.1146\/annurev-publhealth-032315-021353.","DOI":"10.1146\/annurev-publhealth-032315-021353"},{"key":"1192_CR2","doi-asserted-by":"publisher","unstructured":"Sayers A, Ben-Shlomo Y, Blom AW, Steele F. Probabilistic record linkage. Int J Epidemiol. 2016. https:\/\/doi.org\/10.1093\/ije\/dyv322.","DOI":"10.1093\/ije\/dyv322"},{"key":"1192_CR3","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0164667","volume":"11","author":"K Harron","year":"2016","unstructured":"Harron K, Gilbert R, Cromwell D, van der Meulen J. 
Linking Data for Mothers and Babies in De-Identified Electronic Health Data. PLoS One. 2016;11:e0164667. https:\/\/doi.org\/10.1371\/journal.pone.0164667.","journal-title":"PLoS One"},{"key":"1192_CR4","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1186\/1471-2288-14-71","volume":"14.1","author":"CW Kabudula","year":"2014","unstructured":"Kabudula CW, et al. The promise of record linkage for assessing the uptake of health services in resource constrained settings: a pilot study from South Africa. BMC Med Res Methodol. 2014;14.1:71. https:\/\/doi.org\/10.1186\/1471-2288-14-71.","journal-title":"BMC Med Res Methodol"},{"key":"1192_CR5","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1002\/bdra.23142","volume":"97.7","author":"CM O'Leary","year":"2013","unstructured":"O'Leary CM, et al. Exploring the potential to use data linkage for investigating the relationship between birth defects and prenatal alcohol exposure. Birth Defects Res A Clin Mol Teratol. 2013;97.7:497\u2013504. https:\/\/doi.org\/10.1002\/bdra.23142.","journal-title":"Birth Defects Res A Clin Mol Teratol"},{"key":"1192_CR6","doi-asserted-by":"publisher","unstructured":"Newcombe HB, Kennedy JM, Axford SJ, James AP. Automatic linkage of vital records. Science. 1959. https:\/\/doi.org\/10.1126\/science.130.3381.954.","DOI":"10.1126\/science.130.3381.954"},{"key":"1192_CR7","doi-asserted-by":"publisher","unstructured":"Clark DE. Practical introduction to record linkage for injury research. Injury Prev. 2004. https:\/\/doi.org\/10.1136\/ip.2003.004580.","DOI":"10.1136\/ip.2003.004580"},{"key":"1192_CR8","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1016\/j.jbi.2015.05.012","volume":"56","author":"Y Zhu","year":"2015","unstructured":"Zhu Y, et al. When to conduct probabilistic linkage vs. deterministic linkage? A simulation study. J Biomed Inform. 2015;56:80\u20136. 
https:\/\/doi.org\/10.1016\/j.jbi.2015.05.012.","journal-title":"J Biomed Inform"},{"key":"1192_CR9","doi-asserted-by":"publisher","unstructured":"Harron K. A guide to evaluating linkage quality for the analysis of linked data. Int J Epidemiol. 2017. https:\/\/doi.org\/10.1093\/ije\/dyx177.","DOI":"10.1093\/ije\/dyx177"},{"key":"1192_CR10","doi-asserted-by":"publisher","unstructured":"Rentsch CT, et al. Impact of linkage quality on inferences drawn from analyses using data with high rates of linkage errors in rural Tanzania. BMC Med Res Methodol. 2018. https:\/\/doi.org\/10.1186\/s12874-018-0632-5.","DOI":"10.1186\/s12874-018-0632-5"},{"key":"1192_CR11","doi-asserted-by":"publisher","unstructured":"Harron K, et al. Challenges in administrative data linkage for research. Big Data Soc. 2017. https:\/\/doi.org\/10.1177\/2053951717745678.","DOI":"10.1177\/2053951717745678"},{"key":"1192_CR12","doi-asserted-by":"publisher","unstructured":"Walker JR, Hilder L, Levy MH, Sullivan EA. Pregnancy, prison and perinatal outcomes in New South Wales, Australia: A retrospective cohort study using linked health data. BMC Pregnancy Childbirth. 2014. https:\/\/doi.org\/10.1186\/1471-2393-14-214.","DOI":"10.1186\/1471-2393-14-214"},{"key":"1192_CR13","doi-asserted-by":"publisher","unstructured":"Hockley C, et al. Linking Millennium Cohort data to birth registration and hospital episode records. Paediatr Perinat Epidemiol. 2008. https:\/\/doi.org\/10.1111\/j.1365-3016.2007.00902.x.","DOI":"10.1111\/j.1365-3016.2007.00902.x"},{"key":"1192_CR14","unstructured":"S\u00e3o Paulo (cidade). Secretaria Municipal da Sa\u00fade. Coordena\u00e7\u00e3o de Epidemiologia e Informa\u00e7\u00e3o \u2013 CEInfo. Declara\u00e7\u00e3o de Nascido Vivo. Manual de preenchimento da Declara\u00e7\u00e3o de Nascido Vivo. S\u00e3o Paulo: Secretaria Municipal da Sa\u00fade; 2011. p. 24."},{"key":"1192_CR15","doi-asserted-by":"crossref","unstructured":"Oliveira MM, Andrade SSCA, Dimech GS, et al. 
Avalia\u00e7\u00e3o do Sistema de Informa\u00e7\u00f5es sobre nascidos vivos. Brasil, 2006 a 2010. Epidemiol. E Servi\u00e7os Sa\u00fade. 2015;24:629\u201340.","DOI":"10.5123\/S1679-49742015000400005"},{"key":"1192_CR16","volume-title":"Sobre as utilidades do Cadastro \u00danico. Texto para discuss\u00e3o no 1414","author":"RP de Barros","year":"2009","unstructured":"de Barros RP, de Carvalho M, Mendon\u00e7a R. Sobre as utilidades do Cadastro \u00danico. Texto para discuss\u00e3o no 1414; 2009."},{"key":"1192_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3389\/fphar.2019.00984","volume":"10","author":"MS Ali","year":"2019","unstructured":"Ali MS, et al. Administrative Data Linkage in Brazil: Potentials for Health Technology Assessment. Front Pharmacol. 2019;10:1\u201320. https:\/\/doi.org\/10.3389\/fphar.2019.00984.","journal-title":"Front Pharmacol"},{"key":"1192_CR18","first-page":"118","volume-title":"Pharmaco Epidemiology and Drug Safety","author":"GCG Barbosa","year":"2019","unstructured":"Barbosa GCG, et al. CIDACS-RL: A novel search engine-based record linkage system for huge datasets with high accuracy and scalability. In: Pharmaco Epidemiology and Drug Safety. Hoboken: Wiley; 2019. p. 118."},{"key":"1192_CR19","unstructured":"Yancey WE. Evaluating string comparator performance for record linkage. Stat Res Div. 2005;1:3905\u201312."},{"key":"1192_CR20","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11257-2_20","volume-title":"Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)","author":"RC Steorts","year":"2014","unstructured":"Steorts RC, Ventura SL, Sadinle M, Fienberg SE. A comparison of blocking methods for record linkage. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 2014. 
https:\/\/doi.org\/10.1007\/978-3-319-11257-2_20."},{"issue":"328","key":"1192_CR21","doi-asserted-by":"publisher","first-page":"1183","DOI":"10.1080\/01621459.1969.10501049","volume":"64","author":"IP Fellegi","year":"1969","unstructured":"Fellegi IP, Sunter AB. A theory for record linkage. J Am Stat Assoc. 1969;64(328):1183\u2013210. https:\/\/doi.org\/10.1080\/01621459.1969.10501049.","journal-title":"J Am Stat Assoc"},{"key":"1192_CR22","doi-asserted-by":"publisher","unstructured":"Paix\u00e3o ES, et al. Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil. BMC Med Inform Decis Mak. 2017. https:\/\/doi.org\/10.1186\/s12911-017-0506-5.","DOI":"10.1186\/s12911-017-0506-5"},{"key":"1192_CR23","doi-asserted-by":"publisher","unstructured":"Reichman NE, Hade EM. Validation of birth certificate data: A study of women in New Jersey\u2019s healthstart program. Ann Epidemiol. 2001. https:\/\/doi.org\/10.1016\/S1047-2797(00)00209-X.","DOI":"10.1016\/S1047-2797(00)00209-X"},{"key":"1192_CR24","doi-asserted-by":"publisher","unstructured":"St Sauver JL, et al. Linking medical and dental health record data: A partnership with the Rochester Epidemiology Project. BMJ Open. 2017. 
https:\/\/doi.org\/10.1136\/bmjopen-2016-012528.","DOI":"10.1136\/bmjopen-2016-012528"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-020-01192-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-020-01192-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-020-01192-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,24]],"date-time":"2021-07-24T19:29:00Z","timestamp":1627154940000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-020-01192-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,25]]},"references-count":24,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["1192"],"URL":"https:\/\/doi.org\/10.1186\/s12911-020-01192-0","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-15927\/v2","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-15927\/v3","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-15927\/v1","asserted-by":"object"}]},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,25]]},"assertion":[{"value":"22 February 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2020","order":3,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The CIDACS maintains a linkage system for social and health-related data following all ethical, legal, privacy, and confidentiality requirements. The study protocol was reviewed and approved by the Instituto of Public Health Ethics Committee at the Federal University of Bahia (CAAE registration number: 18022319.4.0000.5030).","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"173"}}