Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case re...
Main authors: | Roberts, Christine L; Algert, Charles S; Ford, Jane B |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1797010/ https://www.ncbi.nlm.nih.gov/pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12 |
_version_ | 1782132280747098112 |
---|---|
author | Roberts, Christine L Algert, Charles S Ford, Jane B |
author_facet | Roberts, Christine L Algert, Charles S Ford, Jane B |
author_sort | Roberts, Christine L |
collection | PubMed |
description | BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case reports from different datasets. METHODS: We examined the effect of four methods of classifying discrepant reports from different population health datasets on the estimated prevalence of hypertensive disorders of pregnancy and on the adjusted odds ratios (aOR) for known risk factors. Data were obtained from linked, validated, birth and hospital data for women who gave birth in a New South Wales hospital (Australia) 2000–2002. RESULTS: Among 250173 women with linked data, 238412 (95.3%) women had perfect agreement on the occurrence of hypertension, 1577 (0.6%) had imperfect agreement; 9369 (3.7%) had hypertension reported in only one dataset (under-reporting) and 815 (0.3%) had conflicting types of hypertension. Using only perfect agreement between birth and discharge data resulted in the lowest prevalence rates (0.3% chronic, 5.1% pregnancy hypertension), while including all reports resulted in the highest prevalence rates (1.1% chronic, 8.7% pregnancy hypertension). The higher prevalence rates were generally consistent with international reports. In contrast, perfect agreement gave the highest aOR (95% confidence interval) for known risk factors: risk of chronic hypertension for maternal age ≥40 years was 4.0 (2.9, 5.3) and the risk of pregnancy hypertension for multiple birth was 2.8 (2.5, 3.2). CONCLUSION: The method chosen for classifying discrepant case reports should vary depending on the study question; all reports should be used as part of calculating the range of prevalence estimates, but perfect matches may be best suited to risk factor analyses. These findings are likely to be applicable to the linkage of any specialised health services datasets to population data that include information on diagnoses or procedures. |
format | Text |
id | pubmed-1797010 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1797010 2007-02-13 Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study Roberts, Christine L Algert, Charles S Ford, Jane B BMC Health Serv Res Research Article BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case reports from different datasets. METHODS: We examined the effect of four methods of classifying discrepant reports from different population health datasets on the estimated prevalence of hypertensive disorders of pregnancy and on the adjusted odds ratios (aOR) for known risk factors. Data were obtained from linked, validated, birth and hospital data for women who gave birth in a New South Wales hospital (Australia) 2000–2002. RESULTS: Among 250173 women with linked data, 238412 (95.3%) women had perfect agreement on the occurrence of hypertension, 1577 (0.6%) had imperfect agreement; 9369 (3.7%) had hypertension reported in only one dataset (under-reporting) and 815 (0.3%) had conflicting types of hypertension. Using only perfect agreement between birth and discharge data resulted in the lowest prevalence rates (0.3% chronic, 5.1% pregnancy hypertension), while including all reports resulted in the highest prevalence rates (1.1% chronic, 8.7% pregnancy hypertension). The higher prevalence rates were generally consistent with international reports. In contrast, perfect agreement gave the highest aOR (95% confidence interval) for known risk factors: risk of chronic hypertension for maternal age ≥40 years was 4.0 (2.9, 5.3) and the risk of pregnancy hypertension for multiple birth was 2.8 (2.5, 3.2). CONCLUSION: The method chosen for classifying discrepant case reports should vary depending on the study question; all reports should be used as part of calculating the range of prevalence estimates, but perfect matches may be best suited to risk factor analyses. These findings are likely to be applicable to the linkage of any specialised health services datasets to population data that include information on diagnoses or procedures. BioMed Central 2007-01-30 /pmc/articles/PMC1797010/ /pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12 Text en Copyright © 2007 Roberts et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Roberts, Christine L Algert, Charles S Ford, Jane B Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study |
title | Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study |
title_sort | methods for dealing with discrepant records in linked population health datasets: a cross-sectional study |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1797010/ https://www.ncbi.nlm.nih.gov/pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12 |
work_keys_str_mv | AT robertschristinel methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy AT algertcharless methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy AT fordjaneb methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy |
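
The agreement percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' analysis code: the counts are taken from the abstract, while the category labels, the variable names, and the `category_shares` helper are hypothetical names introduced here.

```python
# Counts reported in the abstract for women with linked birth and hospital data.
# The dictionary keys are illustrative labels, not identifiers from the study.
TOTAL_WOMEN = 250173
agreement_counts = {
    "perfect agreement": 238412,
    "imperfect agreement": 1577,
    "reported in only one dataset": 9369,
    "conflicting types of hypertension": 815,
}


def category_shares(counts, total):
    """Return each category's share of the total as a percentage (1 decimal place)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The four categories partition the linked cohort, so the counts must sum to the total.
assert sum(agreement_counts.values()) == TOTAL_WOMEN

shares = category_shares(agreement_counts, TOTAL_WOMEN)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Because the four categories account for every linked woman, the same counts could serve as denominators when comparing a strict definition (perfect agreement only) with an inclusive one (any report), which is the contrast the abstract draws between the lowest and highest prevalence estimates.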