Validation of Diagnostic Groups Based on Health Care Utilization Data Should Adjust for Sampling Strategy
Main authors: | , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Lippincott Williams & Wilkins
2017
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510703/ https://www.ncbi.nlm.nih.gov/pubmed/25821898 http://dx.doi.org/10.1097/MLR.0000000000000324 |
Summary: | OBJECTIVE: Valid measurement of outcomes such as disease prevalence using health care utilization data is fundamental to the implementation of a “learning health system.” Definitions of such outcomes can be complex, based on multiple diagnostic codes. The literature on validating such data demonstrates a lack of awareness of the need for a stratified sampling design and corresponding statistical methods. We propose a method for validating the measurement of diagnostic groups that have: (1) different prevalences of diagnostic codes within the group; and (2) low prevalence. METHODS: We describe an estimation method whereby: (1) low-prevalence diagnostic codes are oversampled, and the positive predictive value (PPV) of the diagnostic group is estimated as a weighted average of the PPV of each diagnostic code; and (2) claims that fall within a low-prevalence diagnostic group are oversampled relative to claims that do not, and bias-adjusted estimators of sensitivity and specificity are generated. APPLICATION: We illustrate our proposed method using an example from population health surveillance in which diagnostic groups are applied to physician claims to identify cases of acute respiratory illness. CONCLUSIONS: Failure to account for the prevalence of each diagnostic code within a diagnostic group leads to underestimation of the PPV, because low-prevalence diagnostic codes are more likely to be false positives. Failure to adjust for oversampling of claims that fall within the low-prevalence diagnostic group relative to those that do not leads to overestimation of sensitivity and underestimation of specificity. |
---|
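
The two estimation steps described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the weights for the group PPV are each code's share of claims in the full claims population, and that the bias adjustment for sensitivity and specificity is done by inverse-probability weighting with known sampling fractions. The function names, the sampling fractions `f_pos` and `f_neg`, and all counts are hypothetical.

```python
def weighted_ppv(codes):
    """Group PPV as a prevalence-weighted average of per-code PPVs.

    codes: list of (population_count, n_sampled, n_confirmed) per diagnostic code.
    Assumption: the weight of a code is its share of population claims in the group.
    """
    total = sum(pop for pop, _, _ in codes)
    return sum((pop / total) * (confirmed / sampled)
               for pop, sampled, confirmed in codes)


def naive_ppv(codes):
    """Pooled PPV that ignores how heavily each code was sampled."""
    confirmed = sum(c for _, _, c in codes)
    sampled = sum(s for _, s, _ in codes)
    return confirmed / sampled


def adjusted_sens_spec(tp, fp, fn, tn, f_pos, f_neg):
    """Sensitivity and specificity with inverse-probability weights.

    tp, fp: validated counts among sampled claims inside the diagnostic group.
    fn, tn: validated counts among sampled claims outside the group.
    f_pos, f_neg: assumed sampling fractions for claims inside / outside the group.
    """
    tp_w, fp_w = tp / f_pos, fp / f_pos   # scale in-group cells back to population
    fn_w, tn_w = fn / f_neg, tn / f_neg   # scale out-of-group cells back to population
    sensitivity = tp_w / (tp_w + fn_w)
    specificity = tn_w / (tn_w + fp_w)
    return sensitivity, specificity


# Hypothetical data: a common code and a rare, oversampled code.
codes = [(9000, 100, 90), (1000, 100, 50)]
print(round(weighted_ppv(codes), 2), round(naive_ppv(codes), 2))

# Hypothetical validation sample: in-group claims sampled at 50%, others at 1%.
sens, spec = adjusted_sens_spec(tp=80, fp=20, fn=10, tn=90, f_pos=0.5, f_neg=0.01)
print(round(sens, 3), round(spec, 3))
```

With these made-up counts the pooled PPV is lower than the weighted PPV, and the unadjusted sensitivity (80/90) and specificity (90/110) are respectively higher and lower than the weighted values, which is the direction of bias the summary describes.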