
A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

BACKGROUND: To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects disti...


Bibliographic Details
Main Authors: Su, Qiang, Wang, Yina, Jiang, Xiaobing, Chen, Fuxue, Lu, Wen-cong
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5439177/
https://www.ncbi.nlm.nih.gov/pubmed/28567418
http://dx.doi.org/10.1155/2017/1645619
author Su, Qiang
Wang, Yina
Jiang, Xiaobing
Chen, Fuxue
Lu, Wen-cong
collection PubMed
description BACKGROUND: To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm first selects distinguished genes using the K-S test and then uses CFS to select a subset from the genes that passed the K-S test. RESULTS: We adopted support vector machines (SVM) as the classification tool and used accuracy as the criterion to evaluate the performance of the classifiers on the selected gene subsets. We compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of these gene selection algorithms on 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than that of the K-S test, CFS, mRMR, and ReliefF algorithms. CONCLUSIONS: The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
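The two-stage selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `top_n` cutoff for the K-S stage, the binary 0/1 class labels, the use of absolute Pearson correlation inside the CFS merit, and the greedy forward search are all assumptions made for illustration. The SVM evaluation step is omitted.

```python
from bisect import bisect_right
from math import sqrt


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(subset, genes, labels):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean |gene-class| correlation and r_ff the mean
    |gene-gene| correlation (Pearson here, which is an assumption).
    """
    k = len(subset)
    r_cf = sum(abs(pearson(genes[g], labels)) for g in subset) / k
    if k == 1:
        return r_cf
    pairs = [(g, h) for i, g in enumerate(subset) for h in subset[i + 1:]]
    r_ff = sum(abs(pearson(genes[g], genes[h])) for g, h in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def ks_cfs_select(genes, labels, top_n):
    """Stage 1: keep the top_n genes by K-S statistic between the classes.
    Stage 2: greedy forward search maximizing the CFS merit."""

    def class_gap(g):
        pos = [v for v, y in zip(genes[g], labels) if y == 1]
        neg = [v for v, y in zip(genes[g], labels) if y == 0]
        return ks_statistic(pos, neg)

    candidates = sorted(genes, key=class_gap, reverse=True)[:top_n]

    selected, best = [], 0.0
    while True:
        scored = [
            (cfs_merit(selected + [g], genes, labels), g)
            for g in candidates
            if g not in selected
        ]
        if not scored:
            break
        merit, gene = max(scored)
        # Stop when no candidate improves the merit (tolerance for rounding).
        if merit <= best + 1e-12:
            break
        selected.append(gene)
        best = merit
    return selected
```

Here `genes` is assumed to be a dict mapping a gene name to its list of expression values (one per sample), and `labels` a list of 0/1 class labels in the same sample order. The K-S stage discards genes whose class-conditional distributions barely differ, and the CFS stage then drops genes that are redundant with ones already chosen.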
format Online
Article
Text
id pubmed-5439177
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-5439177 2017-05-31 Biomed Res Int Research Article Hindawi 2017 2017-05-08 /pmc/articles/PMC5439177/ /pubmed/28567418 http://dx.doi.org/10.1155/2017/1645619 Text en Copyright © 2017 Qiang Su et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Cancer Gene Selection Algorithm Based on the K-S Test and CFS
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5439177/
https://www.ncbi.nlm.nih.gov/pubmed/28567418
http://dx.doi.org/10.1155/2017/1645619