Validation of Diagnostic Groups Based on Health Care Utilization Data Should Adjust for Sampling Strategy

Bibliographic Details
Main authors: Cadieux, Geneviève; Tamblyn, Robyn; Buckeridge, David L.; Dendukuri, Nandini
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510703/
https://www.ncbi.nlm.nih.gov/pubmed/25821898
http://dx.doi.org/10.1097/MLR.0000000000000324
Description
Summary:

OBJECTIVE: Valid measurement of outcomes such as disease prevalence using health care utilization data is fundamental to the implementation of a “learning health system.” Definitions of such outcomes can be complex, based on multiple diagnostic codes. The literature on validating such data demonstrates a lack of awareness of the need for a stratified sampling design and corresponding statistical methods. We propose a method for validating the measurement of diagnostic groups that have: (1) different prevalences of diagnostic codes within the group; and (2) low prevalence.

METHODS: We describe an estimation method whereby: (1) low-prevalence diagnostic codes are oversampled, and the positive predictive value (PPV) of the diagnostic group is estimated as a weighted average of the PPV of each diagnostic code; and (2) claims that fall within a low-prevalence diagnostic group are oversampled relative to claims that are not, and bias-adjusted estimators of sensitivity and specificity are generated.

APPLICATION: We illustrate our proposed method using an example from population health surveillance in which diagnostic groups are applied to physician claims to identify cases of acute respiratory illness.

CONCLUSIONS: Failure to account for the prevalence of each diagnostic code within a diagnostic group leads to the underestimation of the PPV, because low-prevalence diagnostic codes are more likely to be false positives. Failure to adjust for oversampling of claims that fall within the low-prevalence diagnostic group relative to those that do not leads to the overestimation of sensitivity and underestimation of specificity.
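
The two adjustments named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' published estimator: it assumes the group PPV is weighted by each code's share of claims in the full population, and it recovers sensitivity and specificity from stratum-level PPV and NPV with the standard Bayes inversion for samples stratified by classification result. All function names, code labels, and counts are hypothetical.

```python
def weighted_ppv(population_counts, validation):
    """Group PPV as a prevalence-weighted average of per-code PPVs.

    population_counts: {code: number of claims with that code in the population}
    validation: {code: (confirmed_true, number_sampled)} from chart review
    """
    total = sum(population_counts.values())
    ppv = 0.0
    for code, n_claims in population_counts.items():
        confirmed, sampled = validation[code]
        ppv += (n_claims / total) * (confirmed / sampled)
    return ppv


def naive_ppv(validation):
    """Pooled PPV that ignores how heavily each code was sampled."""
    confirmed = sum(c for c, _ in validation.values())
    sampled = sum(s for _, s in validation.values())
    return confirmed / sampled


def adjusted_sens_spec(ppv, npv, share_flagged):
    """Sensitivity and specificity re-weighted to population proportions.

    share_flagged: fraction of ALL claims that fall in the diagnostic group
    (taken from the full claims database, not from the validation sample).
    """
    tp = ppv * share_flagged
    fp = (1 - ppv) * share_flagged
    tn = npv * (1 - share_flagged)
    fn = (1 - npv) * (1 - share_flagged)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: code "A" is common, code "B" is rare but oversampled.
population_counts = {"A": 9000, "B": 1000}
validation = {"A": (90, 100), "B": (50, 100)}

group_ppv = weighted_ppv(population_counts, validation)
pooled_ppv = naive_ppv(validation)

# Hypothetical NPV of 0.99, with 5% of all claims in the diagnostic group.
sens, spec = adjusted_sens_spec(group_ppv, 0.99, 0.05)
print(group_ppv, pooled_ppv, sens, spec)
```

With these made-up counts the weighted PPV is 0.86, while the pooled estimate is 0.70, which matches the direction of bias described in the conclusions. The adjusted sensitivity and specificity come out near 0.82 and 0.99. Computing them directly from an equal-sized sample of 200 flagged and 200 unflagged claims would give roughly 0.99 and 0.88, overstating sensitivity and understating specificity.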