
Performance of statistical models to predict mental health and substance abuse cost

BACKGROUND: Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goa...


Bibliographic Details
Main Authors: Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636059/
https://www.ncbi.nlm.nih.gov/pubmed/17067394
http://dx.doi.org/10.1186/1471-2288-6-53
_version_ 1782130727338377216
author Montez-Rath, Maria
Christiansen, Cindy L
Ettner, Susan L
Loveland, Susan
Rosen, Amy K
author_sort Montez-Rath, Maria
collection PubMed
description BACKGROUND: Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice.
METHODS: Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample.
RESULTS: The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples.
CONCLUSION: Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.
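The three evaluation criteria named in the abstract (RMSE, MAPE, and predictive ratios within deciles of predicted cost) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the synthetic Gamma-distributed costs, and the choice to compute each predictive ratio as total predicted over total observed cost within a group are all assumptions made for illustration. The bias-corrected bootstrap confidence intervals used in the study are not reproduced here.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute prediction error, in cost units (not a percentage)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def predictive_ratios(observed, predicted, n_groups=10):
    """Predicted-to-observed cost ratio within groups of predicted cost.

    Patients are sorted by predicted cost and split into n_groups
    roughly equal groups (deciles by default). Assumes each group has
    positive total observed cost.
    """
    pairs = sorted(zip(predicted, observed))  # sort on predicted cost
    n = len(pairs)
    ratios = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        pred_sum = sum(p for p, _ in chunk)
        obs_sum = sum(o for _, o in chunk)
        ratios.append(pred_sum / obs_sum)
    return ratios


# Synthetic, hypothetical data: right-skewed costs whose mean equals
# the "model prediction" (Gamma with shape 2 and scale mean/2).
random.seed(0)
predicted = [random.uniform(500.0, 5000.0) for _ in range(1000)]
observed = [random.gammavariate(2.0, m / 2.0) for m in predicted]

print("RMSE:", round(rmse(observed, predicted), 1))
print("MAPE:", round(mape(observed, predicted), 1))
for decile, ratio in enumerate(predictive_ratios(observed, predicted), start=1):
    print("decile", decile, "PR:", round(ratio, 3))
```

A predictive ratio near 1 in every decile indicates that the model neither over- nor under-predicts for low-cost or high-cost groups, which is why a model can rank best on PRs while another ranks best on RMSE or MAPE.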
format Text
id pubmed-1636059
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1636059 2006-11-15 Performance of statistical models to predict mental health and substance abuse cost Montez-Rath, Maria Christiansen, Cindy L Ettner, Susan L Loveland, Susan Rosen, Amy K BMC Med Res Methodol Research Article BioMed Central 2006-10-26 /pmc/articles/PMC1636059/ /pubmed/17067394 http://dx.doi.org/10.1186/1471-2288-6-53 Text en Copyright © 2006 Montez-Rath et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Montez-Rath, Maria
Christiansen, Cindy L
Ettner, Susan L
Loveland, Susan
Rosen, Amy K
Performance of statistical models to predict mental health and substance abuse cost
title Performance of statistical models to predict mental health and substance abuse cost
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636059/
https://www.ncbi.nlm.nih.gov/pubmed/17067394
http://dx.doi.org/10.1186/1471-2288-6-53