Identifying cancer biomarkers by network-constrained support vector machines
BACKGROUND: One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individu...
Main authors: | Chen, Li; Xuan, Jianhua; Riggins, Rebecca B; Clarke, Robert; Wang, Yue |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3214162/ https://www.ncbi.nlm.nih.gov/pubmed/21992556 http://dx.doi.org/10.1186/1752-0509-5-161 |
_version_ | 1782216212764164096 |
---|---|
author | Chen, Li Xuan, Jianhua Riggins, Rebecca B Clarke, Robert Wang, Yue |
author_facet | Chen, Li Xuan, Jianhua Riggins, Rebecca B Clarke, Robert Wang, Yue |
author_sort | Chen, Li |
collection | PubMed |
description | BACKGROUND: One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers, thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers. RESULTS: We developed an integrated approach, namely network-constrained support vector machine (netSVM), for cancer biomarker identification with an improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. More importantly, several novel hub genes, biologically important with many interactions in the PPI network but often showing little change in expression as compared with their downstream genes, were also identified as network biomarkers; the genes were enriched in signaling pathways such as TGF-beta signaling pathway, MAPK signaling pathway, and JAK-STAT signaling pathway. These signaling pathways may provide new insight into the underlying mechanism of breast cancer metastasis. CONCLUSIONS: We have developed a network-based approach for cancer biomarker identification, netSVM, resulting in an improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis, and help improve the prediction performance across independent data sets. |
format | Online Article Text |
id | pubmed-3214162 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3214162 2011-11-14 Identifying cancer biomarkers by network-constrained support vector machines Chen, Li Xuan, Jianhua Riggins, Rebecca B Clarke, Robert Wang, Yue BMC Syst Biol Methodology Article BioMed Central 2011-10-12 /pmc/articles/PMC3214162/ /pubmed/21992556 http://dx.doi.org/10.1186/1752-0509-5-161 Text en Copyright ©2011 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Chen, Li Xuan, Jianhua Riggins, Rebecca B Clarke, Robert Wang, Yue Identifying cancer biomarkers by network-constrained support vector machines |
title | Identifying cancer biomarkers by network-constrained support vector machines |
title_full | Identifying cancer biomarkers by network-constrained support vector machines |
title_fullStr | Identifying cancer biomarkers by network-constrained support vector machines |
title_full_unstemmed | Identifying cancer biomarkers by network-constrained support vector machines |
title_short | Identifying cancer biomarkers by network-constrained support vector machines |
title_sort | identifying cancer biomarkers by network-constrained support vector machines |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3214162/ https://www.ncbi.nlm.nih.gov/pubmed/21992556 http://dx.doi.org/10.1186/1752-0509-5-161 |
work_keys_str_mv | AT chenli identifyingcancerbiomarkersbynetworkconstrainedsupportvectormachines AT xuanjianhua identifyingcancerbiomarkersbynetworkconstrainedsupportvectormachines AT rigginsrebeccab identifyingcancerbiomarkersbynetworkconstrainedsupportvectormachines AT clarkerobert identifyingcancerbiomarkersbynetworkconstrainedsupportvectormachines AT wangyue identifyingcancerbiomarkersbynetworkconstrainedsupportvectormachines |
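
The description in this record says that netSVM integrates gene expression data with protein-protein interaction data, but it does not give the objective function. The sketch below is therefore only an illustration of one common way to constrain a linear SVM with a network, assuming a graph-Laplacian penalty added to the hinge loss and plain subgradient descent as the solver. The function names (`normalized_laplacian`, `objective`, `fit_network_svm`), the parameters `lam`, `lr`, and `epochs`, and the toy data are all hypothetical and are not taken from the article.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} of an undirected graph.

    Isolated nodes (degree 0) keep a diagonal entry of 1 in this sketch.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    connected = deg > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    return np.eye(len(deg)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def objective(w, b, X, y, lap, lam):
    """Mean hinge loss plus the (assumed) network penalty lam * w^T L w."""
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins).mean()
    return hinge + lam * (w @ lap @ w)


def fit_network_svm(X, y, adj, lam=0.1, lr=0.05, epochs=500):
    """Fit a linear classifier by subgradient descent on the objective above."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    lap = normalized_laplacian(adj)
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0  # samples that currently violate the margin
        # Subgradient of the mean hinge loss over the active samples.
        grad_w = -(y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        # Gradient of the Laplacian penalty (L is symmetric).
        grad_w = grad_w + 2.0 * lam * (lap @ w)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data (hypothetical): genes 0 and 1 interact and both carry signal,
# gene 2 is isolated in the network and is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([1.0, -1.0], 20)
X = rng.normal(scale=0.3, size=(40, 3))
X[:, 0] += y
X[:, 1] += y
adj = np.array([[0, 1, 0],
                [1, 0, 0],
                [0, 0, 0]])

w, b = fit_network_svm(X, y, adj)
accuracy = np.mean(np.sign(X @ w + b) == y)
print("weights:", w, "bias:", b, "training accuracy:", accuracy)
```

Under this assumed penalty, connected genes are pushed toward similar weights, which is one way a hub gene with little expression change could still receive a non-zero weight through its neighbors. Whether the published method uses this exact form should be checked against the article itself.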