
AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity

BACKGROUND: The Receiver Operator Characteristic (ROC) curve is widely used to evaluate classification performance in the biomedical field. Owing to its advantages in handling imbalanced and cost-sensitive data, the ROC curve has become a popular metric for evaluating and identifying disease-...


Bibliographic Details
Main authors: Sun, Lei; Wang, Jun; Wei, Jinmao
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374660/
https://www.ncbi.nlm.nih.gov/pubmed/28361689
http://dx.doi.org/10.1186/s12859-017-1468-4
_version_ 1782518936726667264
author Sun, Lei
Wang, Jun
Wei, Jinmao
author_facet Sun, Lei
Wang, Jun
Wei, Jinmao
author_sort Sun, Lei
collection PubMed
description BACKGROUND: The Receiver Operator Characteristic (ROC) curve is widely used to evaluate classification performance in the biomedical field. Owing to its advantages in handling imbalanced and cost-sensitive data, the ROC curve has become a popular metric for evaluating and identifying disease-related genes (features). Existing ROC-based feature selection approaches are simple and effective at evaluating individual features. However, they may fail to find the true target feature subset because they lack an effective means of reducing redundancy between features, which is essential in machine learning. RESULTS: In this paper, we propose to assess feature complementarity by measuring the distances between misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach based on ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. Experimental results on a broad range of microarray data sets show that classifiers built on the feature subset selected by our approach achieve the minimal balanced error rate with a small number of significant features. CONCLUSIONS: Compared with other ROC-based feature selection approaches, our new approach selects fewer features and effectively improves classification performance.
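The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' AVC algorithm: the abstract does not give the exact formulas, so the one-feature threshold rule in `misclassified`, the averaging in `complementarity`, and the additive score in `greedy_select` are all assumptions made for illustration, and every function name here is hypothetical. Only the general recipe follows the text: rank single features by AUC, find instances a feature misclassifies, locate each one's nearest miss (nearest instance of the opposite class) on that feature, and reward a second feature that places the pair far apart.

```python
import numpy as np

def feature_auc(x, y):
    """AUC of one feature used directly as a score (labels are 0/1, ties count 1/2)."""
    pos, neg = x[y == 1], x[y == 0]
    diff = pos[:, None] - neg[None, :]
    auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))
    return max(auc, 1.0 - auc)  # ignore the direction of the feature

def misclassified(x, y):
    """Assumption: a one-feature rule thresholding at the midpoint of the class means."""
    m1, m0 = x[y == 1].mean(), x[y == 0].mean()
    thr = (m1 + m0) / 2.0
    pred = (x > thr).astype(int) if m1 >= m0 else (x <= thr).astype(int)
    return np.flatnonzero(pred != y)

def complementarity(X, y, i, j):
    """Mean distance on feature j between each instance misclassified on
    feature i and its nearest miss on feature i (illustrative definition)."""
    idx = misclassified(X[:, i], y)
    if idx.size == 0:
        return 0.0
    total = 0.0
    for k in idx:
        other = np.flatnonzero(y != y[k])  # instances of the opposite class
        miss = other[np.argmin(np.abs(X[other, i] - X[k, i]))]
        total += abs(X[k, j] - X[miss, j])
    return total / idx.size

def greedy_select(X, y, n_features):
    """Greedy forward search; the score AUC + mean complementarity is an assumption."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)  # scale to [0, 1]
    aucs = np.array([feature_auc(X[:, f], y) for f in range(X.shape[1])])
    selected = [int(np.argmax(aucs))]
    while len(selected) < n_features:
        best, best_score = None, -np.inf
        for f in range(X.shape[1]):
            if f in selected:
                continue
            comp = np.mean([complementarity(X, y, s, f) for s in selected])
            score = aucs[f] + comp
            if score > best_score:
                best, best_score = f, score
        selected.append(int(best))
    return selected

# Toy data: feature 1 duplicates feature 0 (redundant); feature 2 is weaker on
# its own but separates exactly the two instances that feature 0 gets wrong.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
f0 = np.array([0.0, 0.1, 0.15, 0.6, 0.4, 0.9, 0.95, 1.0])
f2 = np.array([0.5, 0.5, 0.5, 0.0, 1.0, 0.5, 0.5, 0.5])
X = np.column_stack([f0, f0.copy(), f2])
print(greedy_select(X, y, 2))  # [0, 2]
```

In the toy data a ranking by AUC alone would pick features 0 and 1, which carry identical information; the complementarity term steers the second pick to feature 2 instead, which is the kind of redundancy reduction the abstract describes.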
format Online
Article
Text
id pubmed-5374660
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5374660 2017-04-03 AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity Sun, Lei Wang, Jun Wei, Jinmao BMC Bioinformatics Research BACKGROUND: The Receiver Operator Characteristic (ROC) curve is widely used to evaluate classification performance in the biomedical field. Owing to its advantages in handling imbalanced and cost-sensitive data, the ROC curve has become a popular metric for evaluating and identifying disease-related genes (features). Existing ROC-based feature selection approaches are simple and effective at evaluating individual features. However, they may fail to find the true target feature subset because they lack an effective means of reducing redundancy between features, which is essential in machine learning. RESULTS: In this paper, we propose to assess feature complementarity by measuring the distances between misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach based on ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. Experimental results on a broad range of microarray data sets show that classifiers built on the feature subset selected by our approach achieve the minimal balanced error rate with a small number of significant features. CONCLUSIONS: Compared with other ROC-based feature selection approaches, our new approach selects fewer features and effectively improves classification performance.
BioMed Central 2017-03-14 /pmc/articles/PMC5374660/ /pubmed/28361689 http://dx.doi.org/10.1186/s12859-017-1468-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Sun, Lei
Wang, Jun
Wei, Jinmao
AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity
title AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374660/
https://www.ncbi.nlm.nih.gov/pubmed/28361689
http://dx.doi.org/10.1186/s12859-017-1468-4