
A new hybrid algorithm for three-stage gene selection based on whale optimization

Bibliographic Details
Main authors: Liu, Junjian; Qu, Chiwen; Zhang, Lupeng; Tang, Yifan; Li, Jinlong; Feng, Huicong; Zeng, Xiaomin; Peng, Xiaoning
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9992521/
https://www.ncbi.nlm.nih.gov/pubmed/36882446
http://dx.doi.org/10.1038/s41598-023-30862-y
Description
Summary: In biomedical data mining, the number of genes is often much larger than the sample size. To address this problem, a feature selection algorithm is needed to select feature gene subsets that correlate strongly with the phenotype, so that subsequent analysis remains accurate. This paper presents a new three-stage hybrid feature gene selection method that combines a variance filter, an extremely randomized tree, and the whale optimization algorithm. First, a variance filter reduces the dimension of the feature gene space; next, an extremely randomized tree further reduces the feature gene set. Finally, the whale optimization algorithm selects the optimal feature gene subset. We evaluate the proposed method with three different classifiers on seven published gene expression profile datasets and compare it with other advanced feature selection algorithms. The results show that the proposed method has significant advantages across a variety of evaluation indicators.
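
The three stages named in the summary can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary gives no thresholds, population sizes, or binarization rule, so every parameter value here is hypothetical. Stage 2 is only a stand-in that ranks features by importance scores supplied by the caller (in the paper these would come from an extremely randomized tree, which is not implemented here). Stage 3 uses the standard whale optimization update rules with a sigmoid transfer function to obtain bit masks, and `toy_fitness` is a made-up objective standing in for classifier-based evaluation.

```python
import math
import random
from statistics import pvariance


def variance_filter(X, threshold):
    """Stage 1: indices of columns whose population variance exceeds threshold."""
    return [j for j in range(len(X[0]))
            if pvariance([row[j] for row in X]) > threshold]


def top_k(candidates, importances, k):
    """Stage 2 stand-in: keep the k candidates with the highest importance.

    `importances` maps feature index -> score and is assumed to be supplied
    by the caller (e.g. from a tree ensemble); no tree model is built here.
    """
    return sorted(candidates, key=lambda j: importances[j], reverse=True)[:k]


def binary_woa(fitness, dim, n_whales=12, n_iter=40, seed=0):
    """Stage 3: whale optimization over bit masks; returns (best_mask, best_fit)."""
    rng = random.Random(seed)

    def to_mask(pos):
        # Sigmoid transfer (an assumption): coordinate -> probability of a 1 bit.
        return [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in pos]

    whales = [[rng.uniform(-4, 4) for _ in range(dim)] for _ in range(n_whales)]
    best_pos, best_mask, best_fit = None, None, -math.inf
    for t in range(n_iter):
        # Evaluate every whale and remember the best mask seen so far.
        for pos in whales:
            mask = to_mask(pos)
            fit = fitness(mask)
            if fit > best_fit:
                best_pos, best_mask, best_fit = pos[:], mask, fit
        a = 2 - 2 * t / n_iter  # decreases linearly from 2 towards 0
        for i, pos in enumerate(whales):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            p, l = rng.random(), rng.uniform(-1, 1)
            if p < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                ref = best_pos if abs(A) < 1 else rng.choice(whales)
                new = [r - A * abs(C * r - x) for r, x in zip(ref, pos)]
            else:
                # Spiral (bubble-net) move around the best whale, spiral constant b = 1.
                new = [abs(b - x) * math.exp(l) * math.cos(2 * math.pi * l) + b
                       for b, x in zip(best_pos, pos)]
            whales[i] = [max(-6.0, min(6.0, v)) for v in new]  # keep sigmoid stable
    return best_mask, best_fit


def toy_fitness(mask):
    """Hypothetical objective: reward bits 0 and 2, penalize subset size."""
    return mask[0] + mask[2] - 0.1 * sum(mask)


if __name__ == "__main__":
    X = [[0.0, 1.0, 5.0, 2.0],
         [0.0, 3.0, 5.0, 2.1],
         [0.0, 5.0, 5.0, 1.9],
         [0.0, 7.0, 5.0, 2.0]]
    print("stage 1 keeps columns:", variance_filter(X, 0.01))
    print("stage 3 result:", binary_woa(toy_fitness, dim=5))
```

In a realistic setting the fitness function would typically score a candidate gene subset by cross-validated classifier accuracy, possibly with a penalty on subset size; that choice, like the clamping of positions to [-6, 6], is a design assumption of this sketch rather than something stated in the record.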