
A novel framework for validating and applying standardized small area measurement strategies

BACKGROUND: Local measurements of health behaviors, diseases, and use of health services are critical inputs into local, state, and national decision-making. Small area measurement methods can deliver more precise and accurate local-level information than direct estimates from surveys or administrat...


Bibliographic Details
Main authors: Srebotnjak, Tanja; Mokdad, Ali H; Murray, Christopher JL
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2958154/
https://www.ncbi.nlm.nih.gov/pubmed/20920214
http://dx.doi.org/10.1186/1478-7954-8-26
author Srebotnjak, Tanja
Mokdad, Ali H
Murray, Christopher JL
collection PubMed
description BACKGROUND: Local measurements of health behaviors, diseases, and use of health services are critical inputs into local, state, and national decision-making. Small area measurement methods can deliver more precise and accurate local-level information than direct estimates from surveys or administrative records, where sample sizes are often too small to yield acceptable standard errors. However, small area measurement requires careful validation using approaches other than conventional statistical methods such as in-sample or cross-validation methods because they do not solve the problem of validating estimates in data-sparse domains.
METHODS: A new general framework for small area estimation and validation is developed and applied to estimate Type 2 diabetes prevalence in US counties using data from the Behavioral Risk Factor Surveillance System (BRFSS). The framework combines the three conventional approaches to small area measurement: (1) pooling data across time by combining multiple survey years; (2) exploiting spatial correlation by including a spatial component; and (3) utilizing structured relationships between the outcome variable and domain-specific covariates to define four increasingly complex model types - coined the Naive, Geospatial, Covariate, and Full models. The validation framework uses direct estimates of prevalence in large domains as the gold standard and compares model estimates against it using (i) all available observations for the large domains and (ii) systematically reduced sample sizes obtained through random sampling with replacement. At each sampling level, the model is rerun repeatedly, and the validity of the model estimates from the four model types is then determined by calculating the (average) concordance correlation coefficient (CCC) and (average) root mean squared error (RMSE) against the gold standard. The CCC is closely related to the intraclass correlation coefficient and can be used when the units are organized in groups and when it is of interest to measure the agreement between units in the same group (e.g., counties). The RMSE is often used to measure the differences between values predicted by a model or an estimator and the actually observed values. It is a useful measure to capture the precision of the model or estimator.
RESULTS: All model types have substantially higher CCC and lower RMSE than the direct, single-year BRFSS estimates. In addition, the inclusion of relevant domain-specific covariates generally improves predictive validity, especially at small sample sizes, and their leverage can be equivalent to a five- to tenfold increase in sample size.
CONCLUSIONS: Small area estimation of important health outcomes and risk factors can be improved using a systematic modeling and validation framework, which consistently outperformed single-year direct survey estimates and demonstrated the potential leverage of including relevant domain-specific covariates compared to pure measurement models. The proposed validation strategy can be applied to other disease outcomes and risk factors in the US as well as to resource-scarce situations, including low-income countries. These estimates are needed by public health officials to identify at-risk groups, to design targeted prevention and intervention programs, and to monitor and evaluate results over time.
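The validation loop described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the record: the CCC is computed with Lin's standard formula (the abstract does not say which estimator the authors used), the "model" is simply the mean of each resampled domain rather than any of the four model types, and the names `ccc`, `rmse`, `validate_by_resampling`, and `responses_by_domain` are hypothetical.

```python
import numpy as np


def ccc(estimates, gold):
    """Lin's concordance correlation coefficient (assumed estimator)."""
    x = np.asarray(estimates, dtype=float)
    y = np.asarray(gold, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    # Population variances (ddof=0) to match the covariance above.
    return float(2.0 * cov_xy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2))


def rmse(estimates, gold):
    """Root mean squared error of the estimates against the gold standard."""
    x = np.asarray(estimates, dtype=float)
    y = np.asarray(gold, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))


def validate_by_resampling(responses_by_domain, sample_size, n_reps=100, seed=0):
    """Average CCC and RMSE at one reduced sample size.

    responses_by_domain: hypothetical dict mapping a domain id to a 0/1 array
    of survey responses from a large domain.
    The gold standard is the direct estimate from all observations; the
    estimate being validated is the subsample mean, a stand-in for a model.
    """
    rng = np.random.default_rng(seed)
    domains = sorted(responses_by_domain)
    gold = np.array([np.mean(responses_by_domain[d]) for d in domains])
    cccs, rmses = [], []
    for _ in range(n_reps):
        # Random sampling with replacement within each large domain.
        est = np.array([
            rng.choice(responses_by_domain[d], size=sample_size, replace=True).mean()
            for d in domains
        ])
        cccs.append(ccc(est, gold))
        rmses.append(rmse(est, gold))
    return float(np.mean(cccs)), float(np.mean(rmses))
```

Calling `validate_by_resampling` at several values of `sample_size` would trace how agreement with the gold standard degrades as data become sparse; swapping the subsample mean for a fitted model is where an actual small area estimator would plug in.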
format Text
id pubmed-2958154
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2958154 2010-10-21 A novel framework for validating and applying standardized small area measurement strategies Srebotnjak, Tanja Mokdad, Ali H Murray, Christopher JL Popul Health Metr Research BioMed Central 2010-09-29 /pmc/articles/PMC2958154/ /pubmed/20920214 http://dx.doi.org/10.1186/1478-7954-8-26 Text en Copyright ©2010 Srebotnjak et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A novel framework for validating and applying standardized small area measurement strategies
topic Research