Cargando…

A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

BACKGROUND: To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects disti...

Descripción completa

Detalles Bibliográficos
Autores principales: Su, Qiang, Wang, Yina, Jiang, Xiaobing, Chen, Fuxue, Lu, Wen-cong
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Hindawi 2017
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5439177/
https://www.ncbi.nlm.nih.gov/pubmed/28567418
http://dx.doi.org/10.1155/2017/1645619
Descripción
Sumario:BACKGROUND: To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects distinguished genes first using the K-S test, and then, it uses CFS to select genes from those selected by the K-S test. RESULTS: We adopted support vector machines (SVM) as the classification tool and used the criteria of accuracy to evaluate the performance of the classifiers on the selected gene subsets. This approach compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms for 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than those of the K-S test, CFS, mRMR, and ReliefF algorithms. CONCLUSIONS: The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.