EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm

Feature selection (FS) is a vital step in data mining and machine learning, especially for analyzing the data in high-dimensional feature space. Gene expression data usually consist of a few samples characterized by high-dimensional feature space. As a result, they are not suitable to be processed b...

Bibliographic Details
Main Authors: Huang, Lan, Hu, Xuemei, Wang, Yan, Fu, Yuan
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9322764/
https://www.ncbi.nlm.nih.gov/pubmed/35885095
http://dx.doi.org/10.3390/e24070873
_version_ 1784756385028767744
author Huang, Lan
Hu, Xuemei
Wang, Yan
Fu, Yuan
author_facet Huang, Lan
Hu, Xuemei
Wang, Yan
Fu, Yuan
author_sort Huang, Lan
collection PubMed
description Feature selection (FS) is a vital step in data mining and machine learning, especially for analyzing data in a high-dimensional feature space. Gene expression data usually consist of a few samples characterized by a high-dimensional feature space. As a result, they are not suitable to be processed by simple methods, such as the filter-based method. In this study, we propose a novel feature selection algorithm based on the Explosion Gravitation Field Algorithm, called EGFAFS. To reduce the feature space to an acceptable number of dimensions, we constructed a recommended feature pool by a series of Random Forests based on the Gini index. Furthermore, by paying more attention to the features in the recommended feature pool, we can find the best subset more efficiently. To verify the performance of EGFAFS for FS, we tested EGFAFS on eight gene expression datasets and compared it with four heuristic-based FS methods (GA, PSO, SA, and DE) and four other FS methods (Boruta, HSICLasso, DNN-FS, and EGSG). The results show that, in terms of the evaluation metrics, EGFAFS performs better for FS on gene expression data than the other eight FS algorithms. The genes selected by EGFAFS play an essential role in the differential co-expression network and in some biological functions, which further demonstrates the success of EGFAFS in solving FS problems on gene expression data.
format Online
Article
Text
id pubmed-9322764
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9322764 2022-07-27 EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm Huang, Lan Hu, Xuemei Wang, Yan Fu, Yuan Entropy (Basel) Article Feature selection (FS) is a vital step in data mining and machine learning, especially for analyzing data in a high-dimensional feature space. Gene expression data usually consist of a few samples characterized by a high-dimensional feature space. As a result, they are not suitable to be processed by simple methods, such as the filter-based method. In this study, we propose a novel feature selection algorithm based on the Explosion Gravitation Field Algorithm, called EGFAFS. To reduce the feature space to an acceptable number of dimensions, we constructed a recommended feature pool by a series of Random Forests based on the Gini index. Furthermore, by paying more attention to the features in the recommended feature pool, we can find the best subset more efficiently. To verify the performance of EGFAFS for FS, we tested EGFAFS on eight gene expression datasets and compared it with four heuristic-based FS methods (GA, PSO, SA, and DE) and four other FS methods (Boruta, HSICLasso, DNN-FS, and EGSG). The results show that, in terms of the evaluation metrics, EGFAFS performs better for FS on gene expression data than the other eight FS algorithms. The genes selected by EGFAFS play an essential role in the differential co-expression network and in some biological functions, which further demonstrates the success of EGFAFS in solving FS problems on gene expression data. MDPI 2022-06-25 /pmc/articles/PMC9322764/ /pubmed/35885095 http://dx.doi.org/10.3390/e24070873 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Huang, Lan
Hu, Xuemei
Wang, Yan
Fu, Yuan
EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title_full EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title_fullStr EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title_full_unstemmed EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title_short EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm
title_sort egfafs: a novel feature selection algorithm based on explosion gravitation field algorithm
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9322764/
https://www.ncbi.nlm.nih.gov/pubmed/35885095
http://dx.doi.org/10.3390/e24070873
work_keys_str_mv AT huanglan egfafsanovelfeatureselectionalgorithmbasedonexplosiongravitationfieldalgorithm
AT huxuemei egfafsanovelfeatureselectionalgorithmbasedonexplosiongravitationfieldalgorithm
AT wangyan egfafsanovelfeatureselectionalgorithmbasedonexplosiongravitationfieldalgorithm
AT fuyuan egfafsanovelfeatureselectionalgorithmbasedonexplosiongravitationfieldalgorithm
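
Illustrative note on the description field: the abstract mentions a recommended feature pool built from a series of Random Forests ranked by the Gini index, but the record gives no further detail. The sketch below is only a simplified stand-in for that idea, not the authors' procedure. It replaces full Random Forests with single-threshold splits scored by Gini impurity decrease, averaged over bootstrap samples and random feature subsets. All names (gini, best_gini_gain, recommended_pool) and all parameters (pool_size, n_rounds, features_per_round, seed) are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_gain(values, labels):
    """Largest impurity decrease from one threshold split on a single feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini([y for _, y in pairs])
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def recommended_pool(X, y, pool_size, n_rounds=20, features_per_round=None, seed=0):
    """Rank features by mean Gini gain over bootstrap rounds; return top indices.

    Assumption: each round mimics one tree of a forest only loosely, using a
    bootstrap sample of rows and a random subset of feature columns.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    k = features_per_round or max(1, int(n_features ** 0.5))
    totals = [0.0] * n_features
    counts = [0] * n_features
    for _ in range(n_rounds):
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]  # bootstrap
        cols = rng.sample(range(n_features), k)  # random feature subset
        ys = [y[r] for r in rows]
        for j in cols:
            totals[j] += best_gini_gain([X[r][j] for r in rows], ys)
            counts[j] += 1
    scores = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:pool_size]
```

Under these assumptions, a search heuristic would then sample candidate subsets preferentially from the returned indices instead of from all features, which is one way to read "paying more attention to the features in the recommended feature pool".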