A novel framework for validating and applying standardized small area measurement strategies
BACKGROUND: Local measurements of health behaviors, diseases, and use of health services are critical inputs into local, state, and national decision-making. Small area measurement methods can deliver more precise and accurate local-level information than direct estimates from surveys or administrat...
Main authors: | Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2010 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2958154/ https://www.ncbi.nlm.nih.gov/pubmed/20920214 http://dx.doi.org/10.1186/1478-7954-8-26 |
_version_ | 1782188306292801536 |
---|---|
author | Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL |
author_facet | Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL |
author_sort | Srebotnjak, Tanja |
collection | PubMed |
description | BACKGROUND: Local measurements of health behaviors, diseases, and use of health services are critical inputs into local, state, and national decision-making. Small area measurement methods can deliver more precise and accurate local-level information than direct estimates from surveys or administrative records, where sample sizes are often too small to yield acceptable standard errors. However, small area measurement requires careful validation using approaches other than conventional statistical methods such as in-sample or cross-validation methods because they do not solve the problem of validating estimates in data-sparse domains. METHODS: A new general framework for small area estimation and validation is developed and applied to estimate Type 2 diabetes prevalence in US counties using data from the Behavioral Risk Factor Surveillance System (BRFSS). The framework combines the three conventional approaches to small area measurement: (1) pooling data across time by combining multiple survey years; (2) exploiting spatial correlation by including a spatial component; and (3) utilizing structured relationships between the outcome variable and domain-specific covariates to define four increasingly complex model types - coined the Naive, Geospatial, Covariate, and Full models. The validation framework uses direct estimates of prevalence in large domains as the gold standard and compares model estimates against it using (i) all available observations for the large domains and (ii) systematically reduced sample sizes obtained through random sampling with replacement. At each sampling level, the model is rerun repeatedly, and the validity of the model estimates from the four model types is then determined by calculating the (average) concordance correlation coefficient (CCC) and (average) root mean squared error (RMSE) against the gold standard. 
The CCC is closely related to the intraclass correlation coefficient and can be used when the units are organized in groups and when it is of interest to measure the agreement between units in the same group (e.g., counties). The RMSE is often used to measure the differences between values predicted by a model or an estimator and the actually observed values. It is a useful measure to capture the precision of the model or estimator. RESULTS: All model types have substantially higher CCC and lower RMSE than the direct, single-year BRFSS estimates. In addition, the inclusion of relevant domain-specific covariates generally improves predictive validity, especially at small sample sizes, and their leverage can be equivalent to a five- to tenfold increase in sample size. CONCLUSIONS: Small area estimation of important health outcomes and risk factors can be improved using a systematic modeling and validation framework, which consistently outperformed single-year direct survey estimates and demonstrated the potential leverage of including relevant domain-specific covariates compared to pure measurement models. The proposed validation strategy can be applied to other disease outcomes and risk factors in the US as well as to resource-scarce situations, including low-income countries. These estimates are needed by public health officials to identify at-risk groups, to design targeted prevention and intervention programs, and to monitor and evaluate results over time. |
format | Text |
id | pubmed-2958154 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2958154 2010-10-21 A novel framework for validating and applying standardized small area measurement strategies Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL Popul Health Metr Research BioMed Central 2010-09-29 /pmc/articles/PMC2958154/ /pubmed/20920214 http://dx.doi.org/10.1186/1478-7954-8-26 Text en Copyright ©2010 Srebotnjak et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL A novel framework for validating and applying standardized small area measurement strategies |
title | A novel framework for validating and applying standardized small area measurement strategies |
title_full | A novel framework for validating and applying standardized small area measurement strategies |
title_fullStr | A novel framework for validating and applying standardized small area measurement strategies |
title_full_unstemmed | A novel framework for validating and applying standardized small area measurement strategies |
title_short | A novel framework for validating and applying standardized small area measurement strategies |
title_sort | novel framework for validating and applying standardized small area measurement strategies |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2958154/ https://www.ncbi.nlm.nih.gov/pubmed/20920214 http://dx.doi.org/10.1186/1478-7954-8-26 |
work_keys_str_mv | AT srebotnjaktanja anovelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies AT mokdadalih anovelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies AT murraychristopherjl anovelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies AT srebotnjaktanja novelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies AT mokdadalih novelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies AT murraychristopherjl novelframeworkforvalidatingandapplyingstandardizedsmallareameasurementstrategies |
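
The validation procedure summarized in the description field (direct estimates from large domains as the gold standard, repeated resampling with replacement at reduced sample sizes, then average CCC and RMSE) can be illustrated with a minimal sketch. This is not the authors' code. It assumes Lin's definition of the concordance correlation coefficient, uses synthetic binary responses in place of BRFSS data, and substitutes a plain sample mean (the hypothetical `direct_estimator`) for the Naive, Geospatial, Covariate, and Full models, none of which is specified in this record. The names `ccc`, `rmse`, and `validate` are illustrative only.

```python
import numpy as np


def ccc(estimates, gold):
    """Concordance correlation coefficient (assumed: Lin's definition)."""
    x = np.asarray(estimates, dtype=float)
    y = np.asarray(gold, dtype=float)
    mx, my = x.mean(), y.mean()
    # Population (1/n) variances and covariance.
    cov = ((x - mx) * (y - my)).mean()
    return float(2.0 * cov / (x.var() + y.var() + (mx - my) ** 2))


def rmse(estimates, gold):
    """Root mean squared error of the estimates against the gold standard."""
    x = np.asarray(estimates, dtype=float)
    y = np.asarray(gold, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))


def direct_estimator(samples):
    """Hypothetical stand-in for a model: the per-domain sample mean."""
    return np.array([s.mean() for s in samples])


def validate(responses_by_domain, sample_size, n_reps, estimator, rng):
    """Average CCC and RMSE over repeated resamples of each large domain.

    responses_by_domain: list of 1-D 0/1 arrays, one per large domain.
    The gold standard is the direct estimate from all observations.
    """
    gold = np.array([np.mean(r) for r in responses_by_domain])
    cccs, rmses = [], []
    for _ in range(n_reps):
        # Random sampling with replacement at the reduced sample size.
        samples = [
            rng.choice(r, size=sample_size, replace=True)
            for r in responses_by_domain
        ]
        est = estimator(samples)
        cccs.append(ccc(est, gold))
        rmses.append(rmse(est, gold))
    return float(np.mean(cccs)), float(np.mean(rmses))


# Synthetic example: 40 "large domains" with made-up prevalences.
rng = np.random.default_rng(0)
prevalences = np.linspace(0.04, 0.14, 40)
domains = [rng.binomial(1, p, size=2000) for p in prevalences]

for n in (50, 100, 500, 1000):
    mean_ccc, mean_rmse = validate(domains, n, 25, direct_estimator, rng)
    print(f"n={n:5d}  mean CCC={mean_ccc:.3f}  mean RMSE={mean_rmse:.4f}")
```

Passing a different `estimator` callable (for example, one that pools several survey years or borrows strength from covariates) would allow the same loop to compare model types at each sampling level, which is the comparison the abstract describes.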