A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure
The identification of discriminative features from information-rich data with the goal of clinical diagnosis is crucial in the field of biomedical science. In this context, many machine-learning techniques have been widely applied and achieved remarkable results. However, disease, especially cancer,...
Main Authors: | Mao, Yifei; Yang, Yuansheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6885241/ https://www.ncbi.nlm.nih.gov/pubmed/31828154 http://dx.doi.org/10.1155/2019/9864213 |
_version_ | 1783474700117082112 |
---|---|
author | Mao, Yifei Yang, Yuansheng |
author_facet | Mao, Yifei Yang, Yuansheng |
author_sort | Mao, Yifei |
collection | PubMed |
description | The identification of discriminative features from information-rich data with the goal of clinical diagnosis is crucial in the field of biomedical science. In this context, many machine-learning techniques have been widely applied and have achieved remarkable results. However, disease, especially cancer, is often caused by a group of features with complex interactions. Unlike traditional feature selection methods, which focus only on finding single discriminative features, a multilayer feature subset selection method (MLFSSM), which employs randomized search and a multilayer structure to select a discriminative subset, is proposed herein. In each layer of this method, many feature subsets are generated to ensure the diversity of the combinations, and the weights of features are evaluated from the performance of the subsets. The weight of a feature increases if the feature is selected into more well-performing subsets than other features on the current layer. In this manner, the values of feature weights are revised layer by layer; the precision of feature weights is constantly improved; and better subsets are repeatedly constructed from the features with higher weights. Finally, the topmost feature subset of the last layer is returned. The experimental results based on five public gene datasets showed that the subsets selected by MLFSSM were more discriminative than those selected by traditional feature selection methods including LVW (a feature subset selection method that uses the Las Vegas method as its randomized search strategy), GAANN (a feature subset selection method based on a genetic algorithm (GA)), and support vector machine recursive feature elimination (SVM-RFE). Furthermore, MLFSSM showed higher classification performance than some state-of-the-art methods that select feature pairs or groups, including top scoring pair (TSP), k-top scoring pairs (K-TSP), and relative simplicity-based direct classifier (RS-DC). |
format | Online Article Text |
id | pubmed-6885241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-6885241 2019-12-11 A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure Mao, Yifei Yang, Yuansheng Biomed Res Int Research Article The identification of discriminative features from information-rich data with the goal of clinical diagnosis is crucial in the field of biomedical science. In this context, many machine-learning techniques have been widely applied and have achieved remarkable results. However, disease, especially cancer, is often caused by a group of features with complex interactions. Unlike traditional feature selection methods, which focus only on finding single discriminative features, a multilayer feature subset selection method (MLFSSM), which employs randomized search and a multilayer structure to select a discriminative subset, is proposed herein. In each layer of this method, many feature subsets are generated to ensure the diversity of the combinations, and the weights of features are evaluated from the performance of the subsets. The weight of a feature increases if the feature is selected into more well-performing subsets than other features on the current layer. In this manner, the values of feature weights are revised layer by layer; the precision of feature weights is constantly improved; and better subsets are repeatedly constructed from the features with higher weights. Finally, the topmost feature subset of the last layer is returned. The experimental results based on five public gene datasets showed that the subsets selected by MLFSSM were more discriminative than those selected by traditional feature selection methods including LVW (a feature subset selection method that uses the Las Vegas method as its randomized search strategy), GAANN (a feature subset selection method based on a genetic algorithm (GA)), and support vector machine recursive feature elimination (SVM-RFE). Furthermore, MLFSSM showed higher classification performance than some state-of-the-art methods that select feature pairs or groups, including top scoring pair (TSP), k-top scoring pairs (K-TSP), and relative simplicity-based direct classifier (RS-DC). Hindawi 2019-11-04 /pmc/articles/PMC6885241/ /pubmed/31828154 http://dx.doi.org/10.1155/2019/9864213 Text en Copyright © 2019 Yifei Mao and Yuansheng Yang. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Mao, Yifei Yang, Yuansheng A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title | A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title_full | A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title_fullStr | A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title_full_unstemmed | A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title_short | A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure |
title_sort | wrapper feature subset selection method based on randomized search and multilayer structure |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6885241/ https://www.ncbi.nlm.nih.gov/pubmed/31828154 http://dx.doi.org/10.1155/2019/9864213 |
work_keys_str_mv | AT maoyifei awrapperfeaturesubsetselectionmethodbasedonrandomizedsearchandmultilayerstructure AT yangyuansheng awrapperfeaturesubsetselectionmethodbasedonrandomizedsearchandmultilayerstructure AT maoyifei wrapperfeaturesubsetselectionmethodbasedonrandomizedsearchandmultilayerstructure AT yangyuansheng wrapperfeaturesubsetselectionmethodbasedonrandomizedsearchandmultilayerstructure |
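
The description field in this record outlines a layered, randomized wrapper search: sample many feature subsets per layer, score each subset with a classifier, raise the weights of features that appear in better-performing subsets, and return the top subset of the last layer. The following Python sketch illustrates that general idea only; it is not the authors' MLFSSM implementation. The leave-one-out nearest-centroid scorer, the weighted sampling scheme, the weight-update rule (add the subset's score excess over the layer mean), the toy data, and all function and parameter names are assumptions introduced here for illustration.

```python
import random


def loo_centroid_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset`.

    Assumption: the record does not name the wrapped classifier; this one
    is chosen only because it needs no external libraries.
    """
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j in range(len(X)):
            if j == i:
                continue  # hold out sample i
            s = sums.setdefault(y[j], [0.0] * len(subset))
            for k, f in enumerate(subset):
                s[k] += X[j][f]
            counts[y[j]] = counts.get(y[j], 0) + 1
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            dist = sum((X[i][f] - s[k] / counts[label]) ** 2
                       for k, f in enumerate(subset))
            if dist < best_dist:
                best_label, best_dist = label, dist
        if best_label == y[i]:
            correct += 1
    return correct / len(X)


def weighted_sample(weights, k, rng):
    """Draw k distinct feature indices, favouring larger (positive) weights."""
    keyed = [(rng.random() ** (1.0 / w), f) for f, w in enumerate(weights)]
    keyed.sort(reverse=True)
    return sorted(f for _, f in keyed[:k])


def multilayer_subset_search(X, y, subset_size=2, n_subsets=30,
                             n_layers=4, seed=0):
    """Layer-by-layer randomized subset search (illustrative sketch)."""
    rng = random.Random(seed)
    weights = [1.0] * len(X[0])  # start with uniform feature weights
    scored = []
    for _ in range(n_layers):
        scored = []
        for _ in range(n_subsets):
            subset = weighted_sample(weights, subset_size, rng)
            scored.append((loo_centroid_accuracy(X, y, subset), subset))
        mean_score = sum(score for score, _ in scored) / len(scored)
        # Assumed update rule: reward features in above-average subsets.
        for score, subset in scored:
            gain = score - mean_score
            if gain > 0:
                for f in subset:
                    weights[f] += gain
    # Return the top-scoring subset of the last layer.
    best_score, best_subset = max(scored, key=lambda t: t[0])
    return best_subset, best_score, weights


def make_toy_data(n_per_class=10, n_features=6, seed=1):
    """Hypothetical two-class data in which only feature 0 carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.random() for _ in range(n_features)]
            row[0] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
subset, score, weights = multilayer_subset_search(X, y)
print("selected subset:", subset, "accuracy:", score)
```

Because later layers sample from the updated weights, features that repeatedly appear in strong subsets are drawn more often, which is the layer-by-layer refinement the description refers to. A real study would replace the toy scorer with cross-validated accuracy of the classifier of interest.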