Polychotomization of continuous variables in regression models based on the overall C index

BACKGROUND: When developing multivariable regression models for diagnosis or prognosis, continuous independent variables can be categorized to make a prediction table instead of a prediction formula. Although many methods have been proposed to dichotomize prognostic variables, to date there has been...

Bibliographic Details
Main Authors: Tsuruta, Harukazu; Bax, Leon
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1770908/
https://www.ncbi.nlm.nih.gov/pubmed/17169154
http://dx.doi.org/10.1186/1472-6947-6-41
author Tsuruta, Harukazu
Bax, Leon
collection PubMed
description BACKGROUND: When developing multivariable regression models for diagnosis or prognosis, continuous independent variables can be categorized to make a prediction table instead of a prediction formula. Although many methods have been proposed to dichotomize prognostic variables, to date there has been no integrated method for polychotomization. The latter is necessary when dichotomization results in too much loss of information or when central values refer to normal states and more dispersed values refer to less preferable states, a situation that is not unusual in medical settings (e.g. body temperature, blood pressure). The goal of our study was to develop a theoretical and practical method for polychotomization.
METHODS: We used the overall discrimination index C, introduced by Harrell, as a measure of the predictive ability of an independent regressor variable and derived a method for polychotomization mathematically. Since the naïve application of our method, like some existing methods, gives rise to positive bias, we developed a parametric method that minimizes this bias and assessed its performance using Monte Carlo simulation.
RESULTS: The overall C is closely related to the area under the ROC curve, and the predictive performance of the resulting di(poly)chotomized variable is comparable to that of the original continuous variable. The simulation shows that the parametric method is essentially unbiased for both the estimates of performance and the cutoff points. Application of our method to the predictor variables of a previous study on rhabdomyolysis shows that it can be used to make probability profile tables that are applicable to the diagnosis or prognosis of individual patient status.
CONCLUSION: We propose a polychotomization (including dichotomization) method for independent continuous variables in regression models based on the overall discrimination index C and clarify its meaning mathematically. To avoid positive bias in application, we have proposed and evaluated a parametric method. The proposed method for polychotomizing continuous regressor variables performed well and can be used to create probability profile tables.
format Text
id pubmed-1770908
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1770908 2007-01-22 Polychotomization of continuous variables in regression models based on the overall C index Tsuruta, Harukazu Bax, Leon BMC Med Inform Decis Mak Research Article BioMed Central 2006-12-14 /pmc/articles/PMC1770908/ /pubmed/17169154 http://dx.doi.org/10.1186/1472-6947-6-41 Text en Copyright © 2006 Tsuruta and Bax; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Polychotomization of continuous variables in regression models based on the overall C index
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1770908/
https://www.ncbi.nlm.nih.gov/pubmed/17169154
http://dx.doi.org/10.1186/1472-6947-6-41
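
The abstract describes choosing cutoff points by their effect on the discrimination index C and warns that a naïve search is positively biased. The following is a minimal Python sketch of that naïve idea for a binary outcome and a single cutoff only; it is not the authors' parametric method. The function names `c_index` and `best_cutoff` and the toy data are hypothetical, introduced here purely for illustration. For a binary outcome, C is computed as the share of (event, non-event) pairs in which the event has the higher predictor value, with ties counted as one half, which is the same quantity as the area under the ROC curve.

```python
from itertools import product


def c_index(scores, outcomes):
    """Concordance for a binary outcome (0 = non-event, 1 = event).

    Returns the fraction of (event, non-event) pairs in which the event
    has the higher score; tied scores contribute one half.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    total = 0.0
    for e, n in product(events, non_events):
        if e > n:
            total += 1.0
        elif e == n:
            total += 0.5
    return total / (len(events) * len(non_events))


def best_cutoff(x, outcomes):
    """Naive dichotomization: try every midpoint between adjacent distinct
    values of x and keep the first cutoff with the highest C.

    Assumption: this exhaustive search on the observed data is the kind of
    naive procedure the abstract describes as positively biased.
    """
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        dichotomized = [1 if v > cut else 0 for v in x]
        c = c_index(dichotomized, outcomes)
        if best is None or c > best[1]:
            best = (cut, c)
    return best


# Toy data (hypothetical): predictor values and binary outcomes.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

c_continuous = c_index(x, y)      # C of the original continuous variable
cutoff, c_binary = best_cutoff(x, y)  # C after naive dichotomization
```

On this toy data the continuous variable has C = 0.9375 and the best single cutoff (3.5) gives C = 0.875, illustrating the information lost by dichotomization. Because the cutoff is selected and evaluated on the same data, the reported C for the chosen cutoff tends to be optimistic in small samples, which is the bias the parametric method in the abstract is meant to reduce.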