Health administrative data enrichment using cohort information: Comparative evaluation of methods by simulation and application to real data

BACKGROUND: Studies using health administrative databases (HAD) may lead to biased results since information on potential confounders is often missing. Methods that integrate confounder data from cohort studies, such as multivariate imputation by chained equations (MICE) and two-stage calibration (T...

Bibliographic Details
Main authors: Silenou, Bernard C., Avalos, Marta, Helmer, Catherine, Berr, Claudine, Pariente, Antoine, Jacqmin-Gadda, Helene
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6354983/
https://www.ncbi.nlm.nih.gov/pubmed/30703112
http://dx.doi.org/10.1371/journal.pone.0211118
collection PubMed
description BACKGROUND: Studies using health administrative databases (HAD) may lead to biased results since information on potential confounders is often missing. Methods that integrate confounder data from cohort studies, such as multivariate imputation by chained equations (MICE) and two-stage calibration (TSC), aim to reduce confounding bias. We provide new insights into their behavior under different deviations from representativeness of the cohort. METHODS: We conducted an extensive simulation study to assess the performance of these two methods under different deviations from representativeness of the cohort. We illustrate these approaches by studying the association between benzodiazepine use and fractures in the elderly using the general sample of French health insurance beneficiaries (EGB) as main database and two French cohorts (Paquid and 3C) as validation samples. RESULTS: When the cohort was representative from the same population as the HAD, the two methods are unbiased. TSC was more efficient and faster but its variance could be slightly underestimated when confounders were non-Gaussian. If the cohort was a subsample of the HAD (internal validation) with the probability of the subject being included in the cohort depending on both exposure and outcome, MICE was unbiased while TSC was biased. The two methods appeared biased when the inclusion probability in the cohort depended on unobserved confounders. CONCLUSION: When choosing the most appropriate method, epidemiologists should consider the origin of the cohort (internal or external validation) as well as the (anticipated or observed) selection biases of the validation sample.
format Online
Article
Text
id pubmed-6354983
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6354983 2019-02-15 PLoS One Research Article
Public Library of Science 2019-01-31 /pmc/articles/PMC6354983/ /pubmed/30703112 http://dx.doi.org/10.1371/journal.pone.0211118 Text en © 2019 Silenou et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
topic Research Article