A stable gene selection in microarray data analysis

BACKGROUND: Microarray data analysis is notorious for involving a huge number of genes compared to a relatively small number of samples. Gene selection is to detect the most significantly differentially expressed genes under different conditions, and it has been a central research focus. In general,...

Bibliographic Details
Main Authors: Yang, Kun; Cai, Zhipeng; Li, Jianzhong; Lin, Guohui
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1524991/
https://www.ncbi.nlm.nih.gov/pubmed/16643657
http://dx.doi.org/10.1186/1471-2105-7-228
author Yang, Kun
Cai, Zhipeng
Li, Jianzhong
Lin, Guohui
author_facet Yang, Kun
Cai, Zhipeng
Li, Jianzhong
Lin, Guohui
author_sort Yang, Kun
collection PubMed
description BACKGROUND: Microarray data analysis is notorious for involving a huge number of genes compared to a relatively small number of samples. Gene selection is to detect the most significantly differentially expressed genes under different conditions, and it has been a central research focus. In general, a better gene selection method can improve the performance of classification significantly. One of the difficulties in gene selection is that the numbers of samples under different conditions vary a lot. RESULTS: Two novel gene selection methods are proposed in this paper, which are not affected by the unbalanced sample class sizes and do not assume any explicit statistical model on the gene expression values. They were evaluated on eight publicly available microarray datasets, using leave-one-out cross-validation and 5-fold cross-validation. The performance is measured by the classification accuracies using the top ranked genes based on the training datasets. CONCLUSION: The experimental results showed that the proposed gene selection methods are efficient, effective, and robust in identifying differentially expressed genes. Adopting the existing SVM-based and KNN-based classifiers, the selected genes by our proposed methods in general give more accurate classification results, typically when the sample class sizes in the training dataset are unbalanced.
format Text
id pubmed-1524991
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1524991 2006-08-01 A stable gene selection in microarray data analysis Yang, Kun Cai, Zhipeng Li, Jianzhong Lin, Guohui BMC Bioinformatics Research Article BioMed Central 2006-04-27 /pmc/articles/PMC1524991/ /pubmed/16643657 http://dx.doi.org/10.1186/1471-2105-7-228 Text en Copyright © 2006 Yang et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A stable gene selection in microarray data analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1524991/
https://www.ncbi.nlm.nih.gov/pubmed/16643657
http://dx.doi.org/10.1186/1471-2105-7-228