Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data

MOTIVATION: Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene...

Bibliographic Details
Main Authors: Zhang, Junhua, Zhang, Shihua, Wang, Yong, Zhang, Xiang-Sun
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851989/
https://www.ncbi.nlm.nih.gov/pubmed/24565034
http://dx.doi.org/10.1186/1752-0509-7-S2-S4
_version_ 1782294391392567296
author Zhang, Junhua
Zhang, Shihua
Wang, Yong
Zhang, Xiang-Sun
author_facet Zhang, Junhua
Zhang, Shihua
Wang, Yong
Zhang, Xiang-Sun
author_sort Zhang, Junhua
collection PubMed
description MOTIVATION: Understanding the molecular mechanisms underlying cancer is an important step toward the effective diagnosis and treatment of cancer patients. With the huge volume of data from large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers, which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need. RESULTS: In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expression. First, the somatic mutation and CNV datasets are merged to obtain a mutation matrix, from which a weighted mutation network is constructed; the vertex weight corresponds to gene coverage and the edge weight to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix, where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expression, respectively. An integrative network is then obtained by combining these two networks, and the most coherent subnetworks are identified using an optimization model. Finally, the core modules for tumors are obtained by filtering with significance and exclusivity tests.
We applied iMCMC to The Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. For comparison, we also ran iMCMC on two of the three kinds of data at a time: first on somatic mutations combined with CNVs, and second on somatic mutations combined with gene expression. The results indicate that gene expression or CNVs indeed provide useful additional information beyond the original data for identifying core modules in cancer. CONCLUSIONS: This study demonstrates the utility of iMCMC, which integrates multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further study.
format Online
Article
Text
id pubmed-3851989
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-38519892013-12-20 Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data Zhang, Junhua Zhang, Shihua Wang, Yong Zhang, Xiang-Sun BMC Syst Biol Research MOTIVATION: Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need. RESULTS: In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expressions. Firstly, the first two datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. 
Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests. We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., the datasets combining somatic mutations with CNVs and secondly the datasets combining somatic mutations with gene expressions. The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer. CONCLUSIONS: This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies. BioMed Central 2013-10-14 /pmc/articles/PMC3851989/ /pubmed/24565034 http://dx.doi.org/10.1186/1752-0509-7-S2-S4 Text en Copyright © 2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Zhang, Junhua
Zhang, Shihua
Wang, Yong
Zhang, Xiang-Sun
Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title_full Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title_fullStr Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title_full_unstemmed Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title_short Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
title_sort identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851989/
https://www.ncbi.nlm.nih.gov/pubmed/24565034
http://dx.doi.org/10.1186/1752-0509-7-S2-S4