A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data
BACKGROUND: Feature selection techniques are critical to the analysis of high-dimensional datasets. This is especially true of gene selection from microarray data, which commonly have an extremely high feature-to-sample ratio. In addition to the essential objectives such as to reduce data noise, to...
Main authors: | Yang, Pengyi; Zhou, Bing B; Zhang, Zili; Zomaya, Albert Y |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009522/ https://www.ncbi.nlm.nih.gov/pubmed/20122224 http://dx.doi.org/10.1186/1471-2105-11-S1-S5 |
_version_ | 1782194698247471104 |
---|---|
author | Yang, Pengyi Zhou, Bing B Zhang, Zili Zomaya, Albert Y |
author_facet | Yang, Pengyi Zhou, Bing B Zhang, Zili Zomaya, Albert Y |
author_sort | Yang, Pengyi |
collection | PubMed |
description | BACKGROUND: Feature selection techniques are critical to the analysis of high-dimensional datasets. This is especially true of gene selection from microarray data, which commonly have an extremely high feature-to-sample ratio. In addition to essential objectives such as reducing data noise, reducing data redundancy, improving sample classification accuracy, and improving model generalization, feature selection also helps biologists focus on the selected genes to further validate their biological hypotheses. RESULTS: In this paper we describe an improved hybrid system for gene selection. It is based on a recently proposed genetic ensemble (GE) system. To enhance the generalization property of the selected genes or gene subsets and to overcome the overfitting problem of the GE system, we devised a mapping strategy to fuse the goodness information of each gene provided by multiple filtering algorithms. This information is then used for the initialization and mutation operations of the genetic ensemble system. CONCLUSION: We used four benchmark microarray datasets (including both binary-class and multi-class classification problems) for proof of concept and model evaluation. The experimental results indicate that the proposed multi-filter enhanced genetic ensemble (MF-GE) system is able to improve sample classification accuracy, generate more compact gene subsets, and converge to the selection results more quickly. The MF-GE system is very flexible, as various combinations of multiple filters and classifiers can be incorporated based on the data characteristics and user preferences. |
format | Text |
id | pubmed-3009522 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3009522 2010-12-23 A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data Yang, Pengyi Zhou, Bing B Zhang, Zili Zomaya, Albert Y BMC Bioinformatics Research
BioMed Central 2010-01-18 /pmc/articles/PMC3009522/ /pubmed/20122224 http://dx.doi.org/10.1186/1471-2105-11-S1-S5 Text en Copyright ©2010 Yang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Yang, Pengyi Zhou, Bing B Zhang, Zili Zomaya, Albert Y A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title | A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title_full | A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title_fullStr | A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title_full_unstemmed | A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title_short | A multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
title_sort | multi-filter enhanced genetic ensemble system for gene selection and sample classification of microarray data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009522/ https://www.ncbi.nlm.nih.gov/pubmed/20122224 http://dx.doi.org/10.1186/1471-2105-11-S1-S5 |
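
The abstract in this record describes fusing per-gene "goodness" scores from several filters and using the fused scores to guide the initialization and mutation steps of a genetic algorithm. The record does not give the actual mapping strategy, so the following is only a minimal sketch under stated assumptions: rank-based normalization, simple averaging as the fusion rule, and weighted sampling for initialization and mutation. All function names (`rank_normalize`, `fuse_filter_scores`, `init_chromosome`, `mutate`) are hypothetical and are not taken from the paper.

```python
import random


def rank_normalize(scores):
    """Map raw filter scores to (0, 1] by rank; the highest score gets 1.0.

    Assumption: higher raw score means a more informative gene.
    Ties are broken by gene index.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    normalized = [0.0] * n
    for rank, gene in enumerate(order, start=1):
        normalized[gene] = rank / n
    return normalized


def fuse_filter_scores(score_lists):
    """Average rank-normalized scores across filters.

    Averaging is an assumed fusion rule, not the paper's mapping strategy.
    """
    normalized = [rank_normalize(scores) for scores in score_lists]
    n_genes = len(score_lists[0])
    n_filters = len(normalized)
    return [sum(f[g] for f in normalized) / n_filters for g in range(n_genes)]


def init_chromosome(weights, size, rng):
    """Sample `size` distinct gene indices, favoring genes with high fused weight."""
    candidates = list(range(len(weights)))
    remaining = list(weights)
    chosen = []
    for _ in range(size):
        pick = rng.choices(range(len(candidates)), weights=remaining, k=1)[0]
        chosen.append(candidates.pop(pick))
        remaining.pop(pick)
    return chosen


def mutate(chromosome, weights, rate, rng):
    """With probability `rate` per position, swap in an unused gene drawn by weight."""
    result = list(chromosome)
    for pos in range(len(result)):
        if rng.random() < rate:
            used = set(result)
            pool = [g for g in range(len(weights)) if g not in used]
            if pool:
                pool_weights = [weights[g] for g in pool]
                result[pos] = rng.choices(pool, weights=pool_weights, k=1)[0]
    return result


# Toy usage: two hypothetical filters scoring five genes.
rng = random.Random(0)
fused = fuse_filter_scores([[0.1, 0.9, 0.5, 0.3, 0.7], [1.0, 3.0, 2.0, 0.5, 4.0]])
population = [init_chromosome(fused, 3, rng) for _ in range(4)]
offspring = [mutate(c, fused, 0.2, rng) for c in population]
```

Rank normalization is used here only so that filters with different score scales contribute equally. Fitness evaluation with a classifier ensemble, crossover, and selection are omitted.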