Identifying cancer biomarkers by network-constrained support vector machines

BACKGROUND: One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individu...

Bibliographic Details
Main authors: Chen, Li; Xuan, Jianhua; Riggins, Rebecca B; Clarke, Robert; Wang, Yue
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3214162/
https://www.ncbi.nlm.nih.gov/pubmed/21992556
http://dx.doi.org/10.1186/1752-0509-5-161
author Chen, Li
Xuan, Jianhua
Riggins, Rebecca B
Clarke, Robert
Wang, Yue
collection PubMed
description BACKGROUND: One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers. RESULTS: We developed an integrated approach, namely network-constrained support vector machine (netSVM), for cancer biomarker identification with an improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. 
More importantly, several novel hub genes, which are biologically important and have many interactions in the PPI network but often show little change in expression compared with their downstream genes, were also identified as network biomarkers; these genes were enriched in signaling pathways such as the TGF-beta, MAPK, and JAK-STAT signaling pathways. These signaling pathways may provide new insight into the underlying mechanism of breast cancer metastasis. CONCLUSIONS: We have developed a network-based approach for cancer biomarker identification, netSVM, which yields improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis and help improve prediction performance across independent data sets.
format Online
Article
Text
id pubmed-3214162
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3214162 2011-11-14 Identifying cancer biomarkers by network-constrained support vector machines Chen, Li Xuan, Jianhua Riggins, Rebecca B Clarke, Robert Wang, Yue BMC Syst Biol Methodology Article BioMed Central 2011-10-12 /pmc/articles/PMC3214162/ /pubmed/21992556 http://dx.doi.org/10.1186/1752-0509-5-161 Text en Copyright ©2011 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying cancer biomarkers by network-constrained support vector machines
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3214162/
https://www.ncbi.nlm.nih.gov/pubmed/21992556
http://dx.doi.org/10.1186/1752-0509-5-161