
Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles

Background. Precisely predicting cancer is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on the genome-wide scale. Gene expression data analysis, however, is confronted with enormous challenges for its characteristics, such as h...


Bibliographic Details
Main Authors: Yang, Liying, Liu, Zhimin, Yuan, Xiguo, Wei, Jianhua, Zhang, Junying
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5143691/
https://www.ncbi.nlm.nih.gov/pubmed/27999797
http://dx.doi.org/10.1155/2016/4596326
author Yang, Liying
Liu, Zhimin
Yuan, Xiguo
Wei, Jianhua
Zhang, Junying
author_facet Yang, Liying
Liu, Zhimin
Yuan, Xiguo
Wei, Jianhua
Zhang, Junying
author_sort Yang, Liying
collection PubMed
description Background. Precisely predicting cancer is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on the genome-wide scale. Gene expression data analysis, however, is confronted with enormous challenges for its characteristics, such as high dimensionality, small sample size, and low Signal-to-Noise Ratio. Results. This paper proposes a method, termed RS_SVM, to predict gene expression profiles via aggregating SVM trained on random subspaces. After choosing gene features through statistical analysis, RS_SVM randomly selects feature subsets to yield random subspaces and training SVM classifiers accordingly and then aggregates SVM classifiers to capture the advantage of ensemble learning. Experiments on eight real gene expression datasets are performed to validate the RS_SVM method. Experimental results show that RS_SVM achieved better classification accuracy and generalization performance in contrast with single SVM, K-nearest neighbor, decision tree, Bagging, AdaBoost, and the state-of-the-art methods. Experiments also explored the effect of subspace size on prediction performance. Conclusions. The proposed RS_SVM method yielded superior performance in analyzing gene expression profiles, which demonstrates that RS_SVM provides a good channel for such biological data.
format Online
Article
Text
id pubmed-5143691
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-51436912016-12-20 Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles Yang, Liying Liu, Zhimin Yuan, Xiguo Wei, Jianhua Zhang, Junying Biomed Res Int Research Article Background. Precisely predicting cancer is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on the genome-wide scale. Gene expression data analysis, however, is confronted with enormous challenges for its characteristics, such as high dimensionality, small sample size, and low Signal-to-Noise Ratio. Results. This paper proposes a method, termed RS_SVM, to predict gene expression profiles via aggregating SVM trained on random subspaces. After choosing gene features through statistical analysis, RS_SVM randomly selects feature subsets to yield random subspaces and training SVM classifiers accordingly and then aggregates SVM classifiers to capture the advantage of ensemble learning. Experiments on eight real gene expression datasets are performed to validate the RS_SVM method. Experimental results show that RS_SVM achieved better classification accuracy and generalization performance in contrast with single SVM, K-nearest neighbor, decision tree, Bagging, AdaBoost, and the state-of-the-art methods. Experiments also explored the effect of subspace size on prediction performance. Conclusions. The proposed RS_SVM method yielded superior performance in analyzing gene expression profiles, which demonstrates that RS_SVM provides a good channel for such biological data. Hindawi Publishing Corporation 2016 2016-11-24 /pmc/articles/PMC5143691/ /pubmed/27999797 http://dx.doi.org/10.1155/2016/4596326 Text en Copyright © 2016 Liying Yang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Yang, Liying
Liu, Zhimin
Yuan, Xiguo
Wei, Jianhua
Zhang, Junying
Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title_full Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title_fullStr Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title_full_unstemmed Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title_short Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
title_sort random subspace aggregation for cancer prediction with gene expression profiles
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5143691/
https://www.ncbi.nlm.nih.gov/pubmed/27999797
http://dx.doi.org/10.1155/2016/4596326
work_keys_str_mv AT yangliying randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles
AT liuzhimin randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles
AT yuanxiguo randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles
AT weijianhua randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles
AT zhangjunying randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles