Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study

BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case re...

Bibliographic Details
Main Authors:	Roberts, Christine L, Algert, Charles S, Ford, Jane B
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1797010/ https://www.ncbi.nlm.nih.gov/pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12

_version_	1782132280747098112
author	Roberts, Christine L Algert, Charles S Ford, Jane B
author_facet	Roberts, Christine L Algert, Charles S Ford, Jane B
author_sort	Roberts, Christine L
collection	PubMed
description	BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case reports from different datasets. METHODS: We examined the effect of four methods of classifying discrepant reports from different population health datasets on the estimated prevalence of hypertensive disorders of pregnancy and on the adjusted odds ratios (aOR) for known risk factors. Data were obtained from linked, validated, birth and hospital data for women who gave birth in a New South Wales hospital (Australia) 2000–2002. RESULTS: Among 250173 women with linked data, 238412 (95.3%) women had perfect agreement on the occurrence of hypertension, 1577 (0.6%) had imperfect agreement; 9369 (3.7%) had hypertension reported in only one dataset (under-reporting) and 815 (0.3%) had conflicting types of hypertension. Using only perfect agreement between birth and discharge data resulted in the lowest prevalence rates (0.3% chronic, 5.1% pregnancy hypertension), while including all reports resulted in the highest prevalence rates (1.1% chronic, 8.7% pregnancy hypertension). The higher prevalence rates were generally consistent with international reports. In contrast, perfect agreement gave the highest aOR (95% confidence interval) for known risk factors: risk of chronic hypertension for maternal age ≥40 years was 4.0 (2.9, 5.3) and the risk of pregnancy hypertension for multiple birth was 2.8 (2.5, 3.2). CONCLUSION: The method chosen for classifying discrepant case reports should vary depending on the study question; all reports should be used as part of calculating the range of prevalence estimates, but perfect matches may be best suited to risk factor analyses. These findings are likely to be applicable to the linkage of any specialised health services datasets to population data that include information on diagnoses or procedures.
format	Text
id	pubmed-1797010
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1797010 2007-02-13 Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study Roberts, Christine L Algert, Charles S Ford, Jane B BMC Health Serv Res Research Article BACKGROUND: Linked population health data are increasingly used in epidemiological studies. If data items are reported on more than one dataset, data linkage can reduce the under-ascertainment associated with many population health datasets. However, this raises the possibility of discrepant case reports from different datasets. METHODS: We examined the effect of four methods of classifying discrepant reports from different population health datasets on the estimated prevalence of hypertensive disorders of pregnancy and on the adjusted odds ratios (aOR) for known risk factors. Data were obtained from linked, validated, birth and hospital data for women who gave birth in a New South Wales hospital (Australia) 2000–2002. RESULTS: Among 250173 women with linked data, 238412 (95.3%) women had perfect agreement on the occurrence of hypertension, 1577 (0.6%) had imperfect agreement; 9369 (3.7%) had hypertension reported in only one dataset (under-reporting) and 815 (0.3%) had conflicting types of hypertension. Using only perfect agreement between birth and discharge data resulted in the lowest prevalence rates (0.3% chronic, 5.1% pregnancy hypertension), while including all reports resulted in the highest prevalence rates (1.1% chronic, 8.7% pregnancy hypertension). The higher prevalence rates were generally consistent with international reports. In contrast, perfect agreement gave the highest aOR (95% confidence interval) for known risk factors: risk of chronic hypertension for maternal age ≥40 years was 4.0 (2.9, 5.3) and the risk of pregnancy hypertension for multiple birth was 2.8 (2.5, 3.2). CONCLUSION: The method chosen for classifying discrepant case reports should vary depending on the study question; all reports should be used as part of calculating the range of prevalence estimates, but perfect matches may be best suited to risk factor analyses. These findings are likely to be applicable to the linkage of any specialised health services datasets to population data that include information on diagnoses or procedures. BioMed Central 2007-01-30 /pmc/articles/PMC1797010/ /pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12 Text en Copyright © 2007 Roberts et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Roberts, Christine L Algert, Charles S Ford, Jane B Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title	Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title_full	Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title_fullStr	Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title_full_unstemmed	Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title_short	Methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
title_sort	methods for dealing with discrepant records in linked population health datasets: a cross-sectional study
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1797010/ https://www.ncbi.nlm.nih.gov/pubmed/17261198 http://dx.doi.org/10.1186/1472-6963-7-12
work_keys_str_mv	AT robertschristinel methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy AT algertcharless methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy AT fordjaneb methodsfordealingwithdiscrepantrecordsinlinkedpopulationhealthdatasetsacrosssectionalstudy
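
The agreement percentages quoted in the RESULTS part of the abstract can be checked directly from the reported counts. The following is a minimal sketch of that arithmetic only, not the authors' analysis code; the names `TOTAL_WOMEN`, `agreement_counts`, and `percentage_shares` are hypothetical and chosen for illustration.

```python
# Counts copied from the RESULTS section of the abstract.
# All identifiers below are illustrative assumptions, not from the study.
TOTAL_WOMEN = 250173

agreement_counts = {
    "perfect agreement": 238412,
    "imperfect agreement": 1577,
    "reported in only one dataset": 9369,
    "conflicting types of hypertension": 815,
}


def percentage_shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = percentage_shares(agreement_counts, TOTAL_WOMEN)

for label, pct in shares.items():
    print(f"{label}: {agreement_counts[label]} women ({pct}%)")

# The four categories partition the linked cohort.
print("categories sum to total:", sum(agreement_counts.values()) == TOTAL_WOMEN)
```

The four categories add up exactly to the 250173 linked records, so each woman falls into one agreement category, and the rounded shares match the percentages printed in the abstract.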