Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil

BACKGROUND: Due to the increasing availability of individual-level information across different electronic datasets, record linkage has become an efficient and important research tool. High quality linkage is essential for producing robust results. The objective of this study was to describe the pro...

Bibliographic Details
Main Authors: Paixão, Enny S; Harron, Katie; Andrade, Kleydson; Teixeira, Maria Glória; Fiaccone, Rosemeire L.; Costa, Maria da Conceição N.; Rodrigues, Laura C.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5513351/
https://www.ncbi.nlm.nih.gov/pubmed/28716074
http://dx.doi.org/10.1186/s12911-017-0506-5
author Paixão, Enny S
Harron, Katie
Andrade, Kleydson
Teixeira, Maria Glória
Fiaccone, Rosemeire L.
Costa, Maria da Conceição N.
Rodrigues, Laura C.
author_sort Paixão, Enny S
collection PubMed
description BACKGROUND: Due to the increasing availability of individual-level information across different electronic datasets, record linkage has become an efficient and important research tool. High quality linkage is essential for producing robust results. The objective of this study was to describe the process of preparing and linking national Brazilian datasets, and to compare the accuracy of different linkage methods for assessing the risk of stillbirth due to dengue in pregnancy. METHODS: We linked mothers and stillbirths in two routinely collected datasets from Brazil for 2009–2010: for dengue in pregnancy, notifications of infectious diseases (SINAN); for stillbirths, mortality (SIM). Since there was no unique identifier, we used probabilistic linkage based on maternal name, age and municipality. We compared two probabilistic approaches, each with two thresholds: 1) a bespoke linkage algorithm; 2) a standard linkage software widely used in Brazil (ReclinkIII), and used manual review to identify further links. Sensitivity and positive predictive value (PPV) were estimated using a subset of gold-standard data created through manual review. We examined the characteristics of false-matches and missed-matches to identify any sources of bias. RESULTS: From records of 678,999 dengue cases and 62,373 stillbirths, the gold-standard linkage identified 191 cases. The bespoke linkage algorithm with a conservative threshold produced 131 links, with sensitivity = 64.4% (68 missed-matches) and PPV = 92.5% (8 false-matches). Manual review of uncertain links identified an additional 37 links, increasing sensitivity to 83.7%. The bespoke algorithm with a relaxed threshold identified 132 true matches (sensitivity = 69.1%), but introduced 61 false-matches (PPV = 68.4%). ReclinkIII produced lower sensitivity and PPV than the bespoke linkage algorithm. Linkage error was not associated with any recorded study variables. 
CONCLUSION: Despite a lack of unique identifiers for linking mothers and stillbirths, we demonstrate a high standard of linkage of large routine databases from a middle income country. Probabilistic linkage and manual review were essential for accurately identifying cases for a case-control study, but this approach may not be feasible for larger databases or for linkage of more common outcomes.
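The sensitivity and PPV figures in the RESULTS can be checked from the reported match counts. The following is a minimal Python sketch, assuming the usual definitions (sensitivity = true matches / gold-standard matches; PPV = true matches / all links made). The function name `linkage_accuracy` and its parameters are illustrative and do not come from the article. It uses the relaxed-threshold counts quoted in the description (132 true matches, 61 false-matches, 191 gold-standard cases).

```python
def linkage_accuracy(true_matches: int, false_matches: int, gold_total: int) -> tuple[float, float]:
    """Return (sensitivity, ppv) for a linkage run evaluated against a gold standard.

    true_matches  -- links made that are also in the gold standard
    false_matches -- links made that are not in the gold standard
    gold_total    -- total number of matches in the gold standard
    """
    if gold_total <= 0 or true_matches + false_matches <= 0:
        raise ValueError("gold_total and the number of links must be positive")
    if true_matches > gold_total:
        raise ValueError("true_matches cannot exceed gold_total")
    sensitivity = true_matches / gold_total               # share of real matches found
    ppv = true_matches / (true_matches + false_matches)   # share of links that are correct
    return sensitivity, ppv


# Relaxed-threshold counts as quoted in the abstract.
sens, ppv = linkage_accuracy(true_matches=132, false_matches=61, gold_total=191)
missed = 191 - 132  # missed-matches implied by these counts
print(f"sensitivity = {sens:.1%}, PPV = {ppv:.1%}, missed-matches = {missed}")
# prints: sensitivity = 69.1%, PPV = 68.4%, missed-matches = 59
```

Both values agree with the percentages reported for the relaxed threshold; the trade-off against the conservative threshold is that more true matches are captured at the cost of many more false-matches.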
format Online
Article
Text
id pubmed-5513351
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5513351 2017-07-19 Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil Paixão, Enny S Harron, Katie Andrade, Kleydson Teixeira, Maria Glória Fiaccone, Rosemeire L. Costa, Maria da Conceição N. Rodrigues, Laura C. BMC Med Inform Decis Mak Research Article BioMed Central 2017-07-17 /pmc/articles/PMC5513351/ /pubmed/28716074 http://dx.doi.org/10.1186/s12911-017-0506-5 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5513351/
https://www.ncbi.nlm.nih.gov/pubmed/28716074
http://dx.doi.org/10.1186/s12911-017-0506-5