
Rough sets and Laplacian score based cost-sensitive feature selection

Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. O...
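As a rough illustration of the idea in the abstract, the sketch below computes a standard Laplacian score for each feature and then picks a fixed number of features by trading that score against a per-feature test cost. It is not the authors' algorithm: the rough-set importance term is omitted, and the function names, the `lam` trade-off weight, the utility formula, and the uniformly drawn integer costs are all assumptions made for this example.

```python
import numpy as np

def laplacian_scores(X, k=5, t=1.0):
    """Laplacian score per feature; lower means better locality preservation."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    # Indices of the k nearest neighbours of every sample (requires k < n).
    nbrs = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    # Heat-kernel weights on the k-nearest-neighbour graph, then symmetrise.
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)
    d = S.sum(axis=1)      # degree of each sample
    L = np.diag(d) - S     # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()   # remove the degree-weighted mean
        denom = f_t @ (d * f_t)
        scores[r] = (f_t @ L @ f_t) / denom if denom > 0 else np.inf
    return scores

def select_features(X, costs, n_select, lam=0.5, k=5, t=1.0):
    """Hypothetical trade-off: prefer a low Laplacian score and a low cost."""
    ls = laplacian_scores(X, k=k, t=t)
    rel_cost = np.asarray(costs, dtype=float) / np.max(costs)  # assumes costs > 0
    utility = -ls - lam * rel_cost
    return np.argsort(-utility)[:n_select]

# Toy data: feature 0 separates two clusters, features 1 and 2 are noise.
rng = np.random.default_rng(0)
labels = np.repeat([0.0, 1.0], 20)
X = np.column_stack([
    10.0 * labels + 0.1 * rng.standard_normal(40),
    rng.standard_normal(40),
    rng.standard_normal(40),
])
costs = rng.integers(1, 11, size=3).astype(float)  # assumed uniform test costs
print(laplacian_scores(X))
print(select_features(X, costs, n_select=2))
```

Picking all `n_select` features in one ranking step mirrors the abstract's point about selecting a predetermined number of features at once rather than one by one; the linear utility is only one of many possible ways to combine importance and cost.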


Bibliographic Details
Main authors: Yu, Shenglong; Zhao, Hong
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6005488/
https://www.ncbi.nlm.nih.gov/pubmed/29912884
http://dx.doi.org/10.1371/journal.pone.0197564
_version_ 1783332689970987008
author Yu, Shenglong
Zhao, Hong
author_facet Yu, Shenglong
Zhao, Hong
author_sort Yu, Shenglong
collection PubMed
description Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
format Online
Article
Text
id pubmed-6005488
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-60054882018-06-25 Rough sets and Laplacian score based cost-sensitive feature selection Yu, Shenglong Zhao, Hong PLoS One Research Article Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. 
Public Library of Science 2018-06-18 /pmc/articles/PMC6005488/ /pubmed/29912884 http://dx.doi.org/10.1371/journal.pone.0197564 Text en © 2018 Yu, Zhao http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Yu, Shenglong
Zhao, Hong
Rough sets and Laplacian score based cost-sensitive feature selection
title Rough sets and Laplacian score based cost-sensitive feature selection
title_full Rough sets and Laplacian score based cost-sensitive feature selection
title_fullStr Rough sets and Laplacian score based cost-sensitive feature selection
title_full_unstemmed Rough sets and Laplacian score based cost-sensitive feature selection
title_short Rough sets and Laplacian score based cost-sensitive feature selection
title_sort rough sets and laplacian score based cost-sensitive feature selection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6005488/
https://www.ncbi.nlm.nih.gov/pubmed/29912884
http://dx.doi.org/10.1371/journal.pone.0197564
work_keys_str_mv AT yushenglong roughsetsandlaplacianscorebasedcostsensitivefeatureselection
AT zhaohong roughsetsandlaplacianscorebasedcostsensitivefeatureselection