Gene selection algorithms for microarray data based on least squares support vector machine

BACKGROUND: In discriminant analysis of microarray data, usually a small number of samples are expressed by a large number of genes. It is not only difficult but also unnecessary to conduct the discriminant analysis with all the genes. Hence, gene selection is usually performed to select important g...

Bibliographic Details
Main authors: Tang, E Ke, Suganthan, PN, Yao, Xin
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1409801/
https://www.ncbi.nlm.nih.gov/pubmed/16504159
http://dx.doi.org/10.1186/1471-2105-7-95
author Tang, E Ke
Suganthan, PN
Yao, Xin
collection PubMed
description BACKGROUND: In discriminant analysis of microarray data, usually a small number of samples are expressed by a large number of genes. It is not only difficult but also unnecessary to conduct the discriminant analysis with all the genes. Hence, gene selection is usually performed to select important genes. RESULTS: A gene selection method searches for an optimal or near-optimal subset of genes with respect to a given evaluation criterion. In this paper, we propose a new evaluation criterion, named the leave-one-out calculation (LOOC; a list of abbreviations appears just above the list of references) measure. A gene selection method, named the leave-one-out calculation sequential forward selection (LOOCSFS) algorithm, is then presented by combining the LOOC measure with the sequential forward selection scheme. Further, a novel gene selection algorithm, the gradient-based leave-one-out gene selection (GLGS) algorithm, is also proposed. Both gene selection algorithms originate from an efficient and exact calculation of the leave-one-out cross-validation error of the least squares support vector machine (LS-SVM). The proposed approaches are applied to two microarray datasets and compared to other well-known gene selection methods using code available from the second author. CONCLUSION: The proposed gene selection approaches can provide gene subsets leading to more accurate classification results, while their computational complexity is comparable to that of the existing methods. The GLGS algorithm also scales better to datasets with a very large number of genes.
format Text
id pubmed-1409801
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1409801 2006-04-21 Gene selection algorithms for microarray data based on least squares support vector machine Tang, E Ke Suganthan, PN Yao, Xin BMC Bioinformatics Research Article BioMed Central 2006-02-27 /pmc/articles/PMC1409801/ /pubmed/16504159 http://dx.doi.org/10.1186/1471-2105-7-95 Text en Copyright © 2006 Tang et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene selection algorithms for microarray data based on least squares support vector machine
topic Research Article
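The abstract describes combining an exact leave-one-out calculation for the LS-SVM with sequential forward selection. The sketch below illustrates that general idea only; it is not the authors' code. It assumes a linear kernel, labels in {-1, +1}, a regularization parameter named gamma, and it uses the sum of squared leave-one-out residuals as a stand-in criterion, since the record does not define the LOOC measure itself. The function names are hypothetical. The closed-form residual used here is the standard identity for LS-SVM: with the system matrix C = [[K + I/gamma, 1], [1^T, 0]], the leave-one-out residual of sample i equals alpha_i divided by the i-th diagonal entry of the inverse of C.

```python
import numpy as np


def lssvm_loo_residuals(K, y, gamma=1.0):
    """Exact leave-one-out residuals y_i - f^(-i)(x_i) of an LS-SVM.

    K is the n x n kernel matrix, y holds labels in {-1, +1}.
    No retraining is done: one matrix inverse yields all n residuals.
    """
    n = len(y)
    # LS-SVM linear system: C @ [alpha; b] = [y; 0]
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = K + np.eye(n) / gamma
    C[:n, n] = 1.0
    C[n, :n] = 1.0
    C_inv = np.linalg.inv(C)
    alpha = C_inv[:n, :n] @ y  # the right-hand side is zero in the last slot
    return alpha / np.diag(C_inv)[:n]


def sfs_by_loo(X, y, n_select, gamma=1.0):
    """Greedy forward selection of columns (genes) of X.

    Assumption: the score is the sum of squared leave-one-out residuals,
    used here as a placeholder for the paper's LOOC measure.
    """
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_select):
        scores = []
        for g in remaining:
            Xs = X[:, selected + [g]]
            r = lssvm_loo_residuals(Xs @ Xs.T, y, gamma)  # linear kernel
            scores.append(float(np.sum(r ** 2)))
        best = remaining[int(np.argmin(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling sfs_by_loo(X, y, n_select=5) on a samples-by-genes matrix X returns the indices of five genes in the order they were added. Each candidate evaluation costs one (n+1) x (n+1) inverse, which stays cheap when the number of samples is small, as is typical for microarray data.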