Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates
BACKGROUND: As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what qualit...
Main authors: | Fottrell, Edward; Byass, Peter; Berhane, Yemane |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2288611/ https://www.ncbi.nlm.nih.gov/pubmed/18366742 http://dx.doi.org/10.1186/1471-2288-8-13 |
_version_ | 1782152094804869120 |
---|---|
author | Fottrell, Edward; Byass, Peter; Berhane, Yemane |
author_facet | Fottrell, Edward; Byass, Peter; Berhane, Yemane |
author_sort | Fottrell, Edward |
collection | PubMed |
description | BACKGROUND: As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what quality control procedures are in place. The extent to which random errors in large, longitudinal datasets affect overall health and demographic profiles has important implications for the role of DSSs as platforms for public health research and clinical trials. Such knowledge is also of particular importance if the outputs of DSSs are to be extrapolated and aggregated with realistic margins of error and validity. METHODS: This study uses the first 10-year dataset from the Butajira Rural Health Project (BRHP) DSS, Ethiopia, covering approximately 336,000 person-years of data. Simple programmes were written to introduce random errors and omissions into new versions of the definitive 10-year Butajira dataset. Key parameters of sex, age, death, literacy and roof material (an indicator of poverty) were selected for the introduction of errors based on their obvious importance in demographic and health surveillance and their established significant associations with mortality. Defining the original 10-year dataset as the 'gold standard' for the purposes of this investigation, population, age and sex compositions and Poisson regression models of mortality rate ratios were compared between each of the intentionally erroneous datasets and the original 'gold standard' 10-year data. RESULTS: The composition of the Butajira population was well represented despite introducing random errors, and differences between population pyramids based on the derived datasets were subtle. Regression analyses of well-established mortality risk factors were largely unaffected even by relatively high levels of random errors in the data. CONCLUSION: The low sensitivity of parameter estimates and regression analyses to significant amounts of randomly introduced errors indicates a high level of robustness of the dataset. This apparent inertia of population parameter estimates to simulated errors is largely due to the size of the dataset. Tolerable margins of random error in DSS data may exceed 20%. While this is not an argument in favour of poor quality data, reducing the time and valuable resources spent on detecting and correcting random errors in routine DSS operations may be justifiable as the returns from such procedures diminish with increasing overall accuracy. The money and effort currently spent on endlessly correcting DSS datasets would perhaps be better spent on increasing the surveillance population size and geographic spread of DSSs and analysing and disseminating research findings. |
format | Text |
id | pubmed-2288611 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-22886112008-04-05 Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates Fottrell, Edward Byass, Peter Berhane, Yemane BMC Med Res Methodol Research Article BACKGROUND: As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what quality control procedures are in place. The extent to which random errors in large, longitudinal datasets affect overall health and demographic profiles has important implications for the role of DSSs as platforms for public health research and clinical trials. Such knowledge is also of particular importance if the outputs of DSSs are to be extrapolated and aggregated with realistic margins of error and validity. METHODS: This study uses the first 10-year dataset from the Butajira Rural Health Project (BRHP) DSS, Ethiopia, covering approximately 336,000 person-years of data. Simple programmes were written to introduce random errors and omissions into new versions of the definitive 10-year Butajira dataset. Key parameters of sex, age, death, literacy and roof material (an indicator of poverty) were selected for the introduction of errors based on their obvious importance in demographic and health surveillance and their established significant associations with mortality. Defining the original 10-year dataset as the 'gold standard' for the purposes of this investigation, population, age and sex compositions and Poisson regression models of mortality rate ratios were compared between each of the intentionally erroneous datasets and the original 'gold standard' 10-year data. RESULTS: The composition of the Butajira population was well represented despite introducing random errors, and differences between population pyramids based on the derived datasets were subtle. Regression analyses of well-established mortality risk factors were largely unaffected even by relatively high levels of random errors in the data. CONCLUSION: The low sensitivity of parameter estimates and regression analyses to significant amounts of randomly introduced errors indicates a high level of robustness of the dataset. This apparent inertia of population parameter estimates to simulated errors is largely due to the size of the dataset. Tolerable margins of random error in DSS data may exceed 20%. While this is not an argument in favour of poor quality data, reducing the time and valuable resources spent on detecting and correcting random errors in routine DSS operations may be justifiable as the returns from such procedures diminish with increasing overall accuracy. The money and effort currently spent on endlessly correcting DSS datasets would perhaps be better spent on increasing the surveillance population size and geographic spread of DSSs and analysing and disseminating research findings. BioMed Central 2008-03-25 /pmc/articles/PMC2288611/ /pubmed/18366742 http://dx.doi.org/10.1186/1471-2288-8-13 Text en Copyright © 2008 Fottrell et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Fottrell, Edward Byass, Peter Berhane, Yemane Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates |
title | Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates |
title_sort | demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2288611/ https://www.ncbi.nlm.nih.gov/pubmed/18366742 http://dx.doi.org/10.1186/1471-2288-8-13 |
work_keys_str_mv | AT fottrelledward demonstratingtherobustnessofpopulationsurveillancedataimplicationsoferrorratesondemographicandmortalityestimates AT byasspeter demonstratingtherobustnessofpopulationsurveillancedataimplicationsoferrorratesondemographicandmortalityestimates AT berhaneyemane demonstratingtherobustnessofpopulationsurveillancedataimplicationsoferrorratesondemographicandmortalityestimates |
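
The error-injection approach summarised in the description field can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' programmes: the synthetic population, the function names (`make_population`, `inject_errors`, `rate_ratio`) and all parameter values are hypothetical, one person-year per record is assumed, and a crude mortality rate ratio stands in for the Poisson regression models used in the study.

```python
import random


def make_population(n, rng, base_rate=0.01, true_ratio=2.0, p_exposed=0.5):
    """Simulate n person-years as (exposed, died) pairs (synthetic data)."""
    people = []
    for _ in range(n):
        exposed = rng.random() < p_exposed
        # Exposed records carry a higher (hypothetical) mortality risk.
        risk = base_rate * (true_ratio if exposed else 1.0)
        died = rng.random() < risk
        people.append((exposed, died))
    return people


def inject_errors(people, error_rate, rng):
    """Return a copy in which each exposure flag is flipped with
    probability error_rate; the death flag is left unchanged."""
    return [((not e) if rng.random() < error_rate else e, d)
            for e, d in people]


def rate_ratio(people):
    """Crude mortality rate ratio: exposed versus unexposed."""
    deaths = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for exposed, died in people:
        total[exposed] += 1
        deaths[exposed] += died  # True counts as 1, False as 0
    return (deaths[True] / total[True]) / (deaths[False] / total[False])


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so runs are repeatable
    gold = make_population(336_000, rng)  # stands in for the 'gold standard'
    print("gold standard ratio:", round(rate_ratio(gold), 3))
    for error_rate in (0.05, 0.10, 0.20):
        noisy = inject_errors(gold, error_rate, rng)
        print(f"{error_rate:.0%} errors:", round(rate_ratio(noisy), 3))
```

Running the script prints the rate ratio for the untouched data and for each erroneous copy, so the estimates can be compared side by side in the same way the study compares derived datasets against the original.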