A Kernel-Based Multivariate Feature Selection Method for Microarray Data Classification

High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performan...

Bibliographic Details
Main Authors:	Sun, Shiquan, Peng, Qinke, Shakoor, Adnan
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2014
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4105478/ https://www.ncbi.nlm.nih.gov/pubmed/25048512 http://dx.doi.org/10.1371/journal.pone.0102541

author	Sun, Shiquan Peng, Qinke Shakoor, Adnan
collection	PubMed
description	High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore, a feature selection technique should be applied prior to data classification to enhance prediction performance. In general, filter methods can be considered as a principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies among features. Although a few publications have sought to reveal the relationships among features with multivariate-based methods, these methods describe those relationships only linearly, and simple linear combinations restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared its results with those of other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that our method performs better than the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.
format	Online Article Text
id	pubmed-4105478
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4105478 2014-07-23 A Kernel-Based Multivariate Feature Selection Method for Microarray Data Classification Sun, Shiquan Peng, Qinke Shakoor, Adnan PLoS One Research Article Public Library of Science 2014-07-21 /pmc/articles/PMC4105478/ /pubmed/25048512 http://dx.doi.org/10.1371/journal.pone.0102541 Text en © 2014 Sun et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	A Kernel-Based Multivariate Feature Selection Method for Microarray Data Classification
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4105478/ https://www.ncbi.nlm.nih.gov/pubmed/25048512 http://dx.doi.org/10.1371/journal.pone.0102541