
Entropy-based gene ranking without selection bias for the predictive classification of microarray data

BACKGROUND: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small...


Bibliographic Details
Main authors:	Furlanello, Cesare, Serafini, Maria, Merler, Stefano, Jurman, Giuseppe
Format:	Text
Language:	English
Published:	BioMed Central 2003
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC293475/ https://www.ncbi.nlm.nih.gov/pubmed/14604446 http://dx.doi.org/10.1186/1471-2105-4-54

author	Furlanello, Cesare Serafini, Maria Merler, Stefano Jurman, Giuseppe
author_facet	Furlanello, Cesare Serafini, Maria Merler, Stefano Jurman, Giuseppe
author_sort	Furlanello, Cesare
collection	PubMed
description	BACKGROUND: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). RESULTS: With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. CONCLUSIONS: Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
format	Text
id	pubmed-293475
institution	National Center for Biotechnology Information
language	English
publishDate	2003
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-293475 2003-12-16 Entropy-based gene ranking without selection bias for the predictive classification of microarray data Furlanello, Cesare Serafini, Maria Merler, Stefano Jurman, Giuseppe BMC Bioinformatics Research Article BioMed Central 2003-11-06 /pmc/articles/PMC293475/ /pubmed/14604446 http://dx.doi.org/10.1186/1471-2105-4-54 Text en Copyright © 2003 Furlanello et al; licensee BioMed Central Ltd.
This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title	Entropy-based gene ranking without selection bias for the predictive classification of microarray data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC293475/ https://www.ncbi.nlm.nih.gov/pubmed/14604446 http://dx.doi.org/10.1186/1471-2105-4-54
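
The description above says that E-RFE eliminates chunks of genes using an entropy measure of the SVM weights distribution. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the weights of an already trained linear SVM are given as a plain list, and the function names (`bin_features`, `weight_entropy`, `chunk_to_eliminate`), the number of bins, the default threshold, and the "drop the whole lowest bin versus half of it" rule are all hypothetical choices made for illustration.

```python
import math


def bin_features(weights, n_bins=10):
    """Group feature indices into equal-width bins of min-max-scaled |weight|."""
    mags = [abs(w) for w in weights]
    lo, hi = min(mags), max(mags)
    span = hi - lo
    bins = [[] for _ in range(n_bins)]
    for i, m in enumerate(mags):
        # If all magnitudes are equal, everything falls into the first bin.
        k = 0 if span == 0 else min(int((m - lo) / span * n_bins), n_bins - 1)
        bins[k].append(i)
    return bins


def weight_entropy(bins):
    """Shannon entropy (in bits) of the bin occupancy distribution."""
    n = sum(len(b) for b in bins)
    return -sum(len(b) / n * math.log2(len(b) / n) for b in bins if b)


def chunk_to_eliminate(weights, n_bins=10, entropy_threshold=None):
    """Return indices of the features to drop in one elimination step.

    Illustrative rule (an assumption, not taken from the record):
    - entropy at or above the threshold: drop every feature in the lowest bin;
    - entropy below the threshold (weights piled up near the minimum):
      drop only the smaller half of the lowest bin, and at least one feature.
    """
    if entropy_threshold is None:
        entropy_threshold = 0.5 * math.log2(n_bins)  # hypothetical default
    bins = bin_features(weights, n_bins)
    # The minimum-magnitude feature always lands in bins[0], so it is non-empty.
    lowest = sorted(bins[0], key=lambda i: abs(weights[i]))
    if weight_entropy(bins) >= entropy_threshold:
        return lowest
    return lowest[:max(1, len(lowest) // 2)]
```

In a full ranking loop, one would retrain the classifier on the surviving genes after each call and repeat until the desired subset size is reached. Removing a chunk per step instead of a single gene is what would reduce the number of retraining rounds relative to standard RFE.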
