A boosting method for maximizing the partial area under the ROC curve
BACKGROUND: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the...
Main Authors: | Komori, Osamu; Eguchi, Shinto |
---|---|
Format: | Text |
Language: | English |
Published: |
BioMed Central
2010
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2898798/ https://www.ncbi.nlm.nih.gov/pubmed/20537139 http://dx.doi.org/10.1186/1471-2105-11-314 |
_version_ | 1782183522340962304 |
---|---|
author | Komori, Osamu Eguchi, Shinto |
author_facet | Komori, Osamu Eguchi, Shinto |
author_sort | Komori, Osamu |
collection | PubMed |
description | BACKGROUND: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a suitable range of the false positive rate according to the clinical situation. However, existing pAUC-based methods handle only a few markers and do not take nonlinear combinations of markers into consideration. RESULTS: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined component-wise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to whether the marker values are continuous or discrete. We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with that of other existing methods, and demonstrate its utility using real data sets. The proposed method achieves much better discrimination performance in terms of the pAUC in both simulation studies and real data analysis. CONCLUSIONS: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in high-dimensional settings. Hence, it provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is known to be necessary in general for maximizing the pAUC. The method also emphasizes the accuracy of classification performance as well as the interpretability of the association, by offering simple and smooth resultant score plots for each marker. |
format | Text |
id | pubmed-2898798 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2898798 2010-07-08 A boosting method for maximizing the partial area under the ROC curve Komori, Osamu Eguchi, Shinto BMC Bioinformatics Methodology Article BACKGROUND: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a suitable range of the false positive rate according to the clinical situation. However, existing pAUC-based methods handle only a few markers and do not take nonlinear combinations of markers into consideration. RESULTS: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined component-wise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to whether the marker values are continuous or discrete. We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with that of other existing methods, and demonstrate its utility using real data sets. The proposed method achieves much better discrimination performance in terms of the pAUC in both simulation studies and real data analysis. CONCLUSIONS: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in high-dimensional settings. Hence, it provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is known to be necessary in general for maximizing the pAUC. The method also emphasizes the accuracy of classification performance as well as the interpretability of the association, by offering simple and smooth resultant score plots for each marker. BioMed Central 2010-06-10 /pmc/articles/PMC2898798/ /pubmed/20537139 http://dx.doi.org/10.1186/1471-2105-11-314 Text en Copyright ©2010 Komori and Eguchi; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A boosting method for maximizing the partial area under the ROC curve |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2898798/ https://www.ncbi.nlm.nih.gov/pubmed/20537139 http://dx.doi.org/10.1186/1471-2105-11-314 |
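
The description above centers on the partial AUC (pAUC), the area under the ROC curve restricted to a range of false positive rates. As a rough illustration of that quantity only, the following Python sketch computes an empirical pAUC over the range [0, max_fpr]. It is not the authors' boosting method; the function name `partial_auc`, its parameters, and the toy scores are hypothetical, and the sketch assumes untied scores and a range that starts at a false positive rate of 0.

```python
def partial_auc(controls, cases, max_fpr):
    """Empirical partial AUC over the false positive rate range [0, max_fpr].

    Hypothetical helper for illustration; assumes higher scores indicate
    cases and that there are no tied scores.
    """
    if not controls or not cases:
        raise ValueError("controls and cases must be non-empty")
    if not 0.0 <= max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in [0, 1]")

    n0, n1 = len(controls), len(cases)
    # Number of controls allowed to be false positives within the FPR range.
    k = int(max_fpr * n0 + 1e-9)
    # Only the k highest-scoring controls contribute to the partial area.
    top_controls = sorted(controls, reverse=True)[:k]
    # Count (control, case) pairs in which the case outranks the control.
    wins = sum(1 for x in top_controls for y in cases if y > x)
    return wins / (n0 * n1)


# Toy scores, invented for this example.
controls = [0.1, 0.4, 0.35, 0.8]
cases = [0.9, 0.6, 0.7, 0.3]

print(partial_auc(controls, cases, 0.5))  # 0.25
print(partial_auc(controls, cases, 1.0))  # 0.6875 (the ordinary AUC)
```

With max_fpr set to 1 the function reduces to the ordinary empirical AUC, and its largest attainable value for a given range is max_fpr itself, which is why pAUC values are often reported alongside the range used.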