Regularized binormal ROC method in disease classification using microarray data

BACKGROUND: An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simultaneously identify important biomarkers from such high-t...

Bibliographic Details
Main Authors: Ma, Shuangge; Song, Xiao; Huang, Jian
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1513612/
https://www.ncbi.nlm.nih.gov/pubmed/16684357
http://dx.doi.org/10.1186/1471-2105-7-253
author Ma, Shuangge
Song, Xiao
Huang, Jian
collection PubMed
description BACKGROUND: An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simultaneously identify important biomarkers from such high-throughput genomic data and construct appropriate classification rules. It is also of interest to develop methods for evaluation of classification performance and ranking of identified biomarkers. RESULTS: The ROC (receiver operating characteristic) technique has been widely used in disease classification with low dimensional biomarkers. Compared with the empirical ROC approach, the binormal ROC is computationally more affordable and robust in small sample size cases. We propose using the binormal AUC (area under the ROC curve) as the objective function for two-sample classification, and the scaled threshold gradient directed regularization method for regularized estimation and biomarker selection. Tuning parameter selection is based on V-fold cross validation. We develop Monte Carlo based methods for evaluating the stability of individual biomarkers and overall prediction performance. Extensive simulation studies show that the proposed approach can generate parsimonious models with excellent classification and prediction performance, under most simulated scenarios including model mis-specification. Application of the method to two cancer studies shows that the identified genes are reasonably stable with satisfactory prediction performance and biologically sound implications. The overall classification performance is satisfactory, with small classification errors and large AUCs. CONCLUSION: In comparison to existing methods, the proposed approach is computationally more affordable without losing the optimality possessed by the standard ROC method.
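As a rough illustration of the binormal AUC named in the abstract above: if the classification scores in the diseased and healthy groups are each assumed to be approximately normal, the AUC has the closed form Phi((mu_D - mu_H) / sqrt(var_D + var_H)), where Phi is the standard normal CDF. The sketch below computes that quantity next to the rank-based empirical AUC. The function names and toy scores are hypothetical, and this is not the authors' implementation; it omits the regularization and cross validation steps entirely.

```python
import math
from statistics import mean, variance


def binormal_auc(scores_diseased, scores_healthy):
    """Binormal AUC, assuming scores in each group are roughly normal.

    AUC = Phi((mu_D - mu_H) / sqrt(var_D + var_H)).
    """
    mu_d, mu_h = mean(scores_diseased), mean(scores_healthy)
    var_d, var_h = variance(scores_diseased), variance(scores_healthy)
    z = (mu_d - mu_h) / math.sqrt(var_d + var_h)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def empirical_auc(scores_diseased, scores_healthy):
    """Empirical AUC: fraction of (diseased, healthy) pairs ranked correctly,
    counting ties as half a correct pair."""
    wins = sum(
        (d > h) + 0.5 * (d == h)
        for d in scores_diseased
        for h in scores_healthy
    )
    return wins / (len(scores_diseased) * len(scores_healthy))


if __name__ == "__main__":
    diseased = [3.0, 4.0, 5.0]  # hypothetical scores, for illustration only
    healthy = [0.0, 1.0, 2.0]
    print(binormal_auc(diseased, healthy))
    print(empirical_auc(diseased, healthy))
```

The binormal form needs only two means and two variances, while the empirical form compares every pair of scores, which is one way to read the abstract's remark that the binormal ROC is computationally more affordable.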
format Text
id pubmed-1513612
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1513612 2006-07-24 Regularized binormal ROC method in disease classification using microarray data Ma, Shuangge Song, Xiao Huang, Jian BMC Bioinformatics Methodology Article BioMed Central 2006-05-09 /pmc/articles/PMC1513612/ /pubmed/16684357 http://dx.doi.org/10.1186/1471-2105-7-253 Text en Copyright © 2006 Ma et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Regularized binormal ROC method in disease classification using microarray data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1513612/
https://www.ncbi.nlm.nih.gov/pubmed/16684357
http://dx.doi.org/10.1186/1471-2105-7-253