Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles
Background. Accurate cancer prediction is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on a genome-wide scale. Gene expression data analysis, however, faces substantial challenges because of its characteristics, such as h...
Main authors: | Yang, Liying; Liu, Zhimin; Yuan, Xiguo; Wei, Jianhua; Zhang, Junying |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5143691/ https://www.ncbi.nlm.nih.gov/pubmed/27999797 http://dx.doi.org/10.1155/2016/4596326 |
_version_ | 1782472978135515136 |
---|---|
author | Yang, Liying; Liu, Zhimin; Yuan, Xiguo; Wei, Jianhua; Zhang, Junying |
author_facet | Yang, Liying; Liu, Zhimin; Yuan, Xiguo; Wei, Jianhua; Zhang, Junying |
author_sort | Yang, Liying |
collection | PubMed |
description | Background. Accurate cancer prediction is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on a genome-wide scale. Gene expression data analysis, however, faces substantial challenges because of its characteristics, such as high dimensionality, small sample size, and low signal-to-noise ratio. Results. This paper proposes a method, termed RS_SVM, that predicts cancer from gene expression profiles by aggregating SVMs trained on random subspaces. After choosing gene features through statistical analysis, RS_SVM randomly selects feature subsets to form random subspaces, trains an SVM classifier on each subspace, and then aggregates the SVM classifiers to capture the advantage of ensemble learning. Experiments on eight real gene expression datasets were performed to validate the RS_SVM method. The results show that RS_SVM achieved better classification accuracy and generalization performance than a single SVM, K-nearest neighbor, decision tree, Bagging, AdaBoost, and state-of-the-art methods. The experiments also explored the effect of subspace size on prediction performance. Conclusions. The proposed RS_SVM method yielded superior performance in analyzing gene expression profiles, which suggests that RS_SVM is a useful approach for such biological data. |
format | Online Article Text |
id | pubmed-5143691 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-5143691 2016-12-20 Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles Yang, Liying; Liu, Zhimin; Yuan, Xiguo; Wei, Jianhua; Zhang, Junying Biomed Res Int Research Article Hindawi Publishing Corporation 2016 2016-11-24 /pmc/articles/PMC5143691/ /pubmed/27999797 http://dx.doi.org/10.1155/2016/4596326 Text en Copyright © 2016 Liying Yang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Yang, Liying Liu, Zhimin Yuan, Xiguo Wei, Jianhua Zhang, Junying Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles |
title | Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles |
title_sort | random subspace aggregation for cancer prediction with gene expression profiles |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5143691/ https://www.ncbi.nlm.nih.gov/pubmed/27999797 http://dx.doi.org/10.1155/2016/4596326 |
work_keys_str_mv | AT yangliying randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles AT liuzhimin randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles AT yuanxiguo randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles AT weijianhua randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles AT zhangjunying randomsubspaceaggregationforcancerpredictionwithgeneexpressionprofiles |
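
The pipeline outlined in the description field (statistical feature selection, random feature subsets, one SVM per subset, aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions because the record does not specify them: the t-statistic ranking, the linear SVM trained by subgradient descent on the hinge loss, the majority-vote aggregation, the synthetic data, and every function name and parameter value (`t_scores`, `train_linear_svm`, `fit_rs_svm`, `predict`, `make_toy_data`, `top_k`, `subspace_size`, `n_models`) are hypothetical choices made for illustration.

```python
import random
import statistics


def t_scores(X, y):
    """Absolute two-sample t-like score per feature (assumed selection statistic)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == -1]
        spread = (statistics.pvariance(pos) / len(pos)
                  + statistics.pvariance(neg) / len(neg)) ** 0.5
        scores.append(abs(statistics.mean(pos) - statistics.mean(neg)) / (spread + 1e-12))
    return scores


def train_linear_svm(X, y, rng, lam=0.01, lr=0.01, epochs=100):
    """Linear soft-margin SVM fitted by subgradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):  # one shuffled pass over the data
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 regularization shrinkage
            if y[i] * score < 1.0:  # sample violates the margin: hinge-loss step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def fit_rs_svm(X, y, top_k, subspace_size, n_models, seed=0):
    """Keep the top_k scored features, then train one SVM per random feature subset."""
    rng = random.Random(seed)
    scores = t_scores(X, y)
    selected = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]
    models = []
    for _ in range(n_models):
        subspace = rng.sample(selected, subspace_size)  # random subspace of features
        X_sub = [[row[j] for j in subspace] for row in X]
        models.append((subspace, train_linear_svm(X_sub, y, rng)))
    return selected, models


def predict(models, x):
    """Aggregate the base classifiers by majority vote (assumed aggregation rule)."""
    votes = 0
    for subspace, (w, b) in models:
        score = sum(wj * x[j] for wj, j in zip(w, subspace)) + b
        votes += 1 if score >= 0 else -1
    return 1 if votes >= 0 else -1


def make_toy_data(n_per_class=10, n_features=20, n_informative=5, seed=1):
    """Synthetic stand-in for expression data: few samples, mostly noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            row = [rng.gauss(2.0 * label, 0.3) for _ in range(n_informative)]
            row += [rng.gauss(0.0, 1.0) for _ in range(n_features - n_informative)]
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
selected, models = fit_rs_svm(X, y, top_k=5, subspace_size=3, n_models=7)
train_accuracy = sum(predict(models, row) == label for row, label in zip(X, y)) / len(y)
print("selected features:", sorted(selected))
print("training accuracy:", train_accuracy)
```

An odd `n_models` avoids tied votes. The `subspace_size` argument corresponds to the subspace size whose effect on prediction performance the description says was explored; the values used here are arbitrary.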