
A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure

The identification of discriminative features from information-rich data with the goal of clinical diagnosis is crucial in the field of biomedical science. In this context, many machine-learning techniques have been widely applied and achieved remarkable results. However, disease, especially cancer,...


Bibliographic Details
Main Authors: Mao, Yifei, Yang, Yuansheng
Format: Online Article Text
Language: English
Published: Hindawi 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6885241/
https://www.ncbi.nlm.nih.gov/pubmed/31828154
http://dx.doi.org/10.1155/2019/9864213
_version_ 1783474700117082112
author Mao, Yifei
Yang, Yuansheng
collection PubMed
description The identification of discriminative features from information-rich data for clinical diagnosis is crucial in biomedical science. In this context, many machine-learning techniques have been widely applied and have achieved remarkable results. However, disease, especially cancer, is often caused by a group of features with complex interactions. Unlike traditional feature selection methods, which focus only on finding single discriminative features, the multilayer feature subset selection method (MLFSSM) proposed herein employs randomized search and a multilayer structure to select a discriminative subset. In each layer of this method, many feature subsets are generated to ensure the diversity of the combinations, and the weights of features are evaluated based on the performance of the subsets. The weight of a feature increases if, compared with other features on the current layer, the feature is selected into more subsets with better performance. In this manner, the feature weights are revised layer by layer, their precision is steadily improved, and better subsets are repeatedly constructed from the features with higher weights. Finally, the topmost feature subset of the last layer is returned. Experimental results on five public gene datasets showed that the subsets selected by MLFSSM were more discriminative than those selected by traditional feature selection methods, including LVW (a feature subset selection method that uses the Las Vegas method as its randomized search strategy), GAANN (a feature subset selection method based on a genetic algorithm (GA)), and support vector machine recursive feature elimination (SVM-RFE). Furthermore, MLFSSM showed higher classification performance than some state-of-the-art methods that select feature pairs or groups, including top scoring pair (TSP), k-top scoring pairs (K-TSP), and relative simplicity-based direct classifier (RS-DC).
format Online
Article
Text
id pubmed-6885241
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-6885241 2019-12-11 Biomed Res Int Research Article Hindawi 2019-11-04 /pmc/articles/PMC6885241/ /pubmed/31828154 http://dx.doi.org/10.1155/2019/9864213 Text en Copyright © 2019 Yifei Mao and Yuansheng Yang. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Wrapper Feature Subset Selection Method Based on Randomized Search and Multilayer Structure
topic Research Article
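The abstract in this record describes the MLFSSM procedure only in outline: in each layer many random feature subsets are generated, feature weights are revised according to how the subsets perform, and the top subset of the last layer is returned. The following is a minimal sketch of that general idea, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the nearest-centroid evaluator, the hold-out validation split, the fixed subset size, and the weight update rule (counting how often a feature appears in the best-scoring quarter of a layer's subsets and averaging that with the previous weights).

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val):
    """Hold-out accuracy of a nearest-centroid classifier (assumed stand-in evaluator)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every validation sample to every centroid.
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())


def multilayer_subset_search(X_train, y_train, X_val, y_val, n_layers=4,
                             n_subsets=200, subset_size=3, top_fraction=0.25,
                             seed=0):
    """Hypothetical layered randomized wrapper search over feature subsets."""
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    weights = np.full(n_features, 1.0 / n_features)  # start from uniform weights
    n_top = max(1, int(top_fraction * n_subsets))
    for _ in range(n_layers):
        # Draw many subsets; features with higher weight are drawn more often.
        subsets = np.stack([
            rng.choice(n_features, size=subset_size, replace=False, p=weights)
            for _ in range(n_subsets)
        ])
        # Wrapper step: score each subset with the classifier.
        scores = np.array([
            nearest_centroid_accuracy(X_train[:, s], y_train, X_val[:, s], y_val)
            for s in subsets
        ])
        order = np.argsort(scores)[::-1]  # best-scoring subsets first
        # Assumed update rule: count appearances in the top subsets of this
        # layer (plus 1 so no weight reaches zero), then blend with old weights.
        top_counts = np.bincount(subsets[order[:n_top]].ravel(),
                                 minlength=n_features)
        layer_weights = (top_counts + 1.0) / (top_counts + 1.0).sum()
        weights = 0.5 * weights + 0.5 * layer_weights
    best = order[0]  # top subset of the last layer
    return np.sort(subsets[best]), float(scores[best]), weights


def make_toy_data(n_samples=300, n_features=30, seed=0):
    """Synthetic two-class data in which only features 0 and 1 are informative."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_features))
    X[:, :2] += 3.0 * (2 * y[:, None] - 1)  # shift the two informative features
    return X, y


X, y = make_toy_data()
subset, score, weights = multilayer_subset_search(X[:200], y[:200], X[200:], y[200:])
print("selected subset:", subset, "validation accuracy:", score)
```

On the toy data the informative features should accumulate weight across layers, which is the behavior the abstract attributes to the layer-by-layer revision. Scoring on a single hold-out split keeps the sketch short; cross-validation would be a more careful choice for small gene-expression datasets.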