
Integrating mutation and gene expression cross-sectional data to infer cancer progression

BACKGROUND: A major problem in identifying the best therapeutic targets for cancer is the molecular heterogeneity of the disease. Cancer is often caused by an accumulation of mutations which produce irreversible damage to the cell’s control mechanisms of survival and proliferation. Different mutatio...


Bibliographic Details
Main authors: Fleck, Julia L., Pavel, Ana B., Cassandras, Christos G.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4727329/
https://www.ncbi.nlm.nih.gov/pubmed/26810975
http://dx.doi.org/10.1186/s12918-016-0255-6
_version_ 1782411947708252160
author Fleck, Julia L.
Pavel, Ana B.
Cassandras, Christos G.
author_facet Fleck, Julia L.
Pavel, Ana B.
Cassandras, Christos G.
author_sort Fleck, Julia L.
collection PubMed
description BACKGROUND: A major problem in identifying the best therapeutic targets for cancer is the molecular heterogeneity of the disease. Cancer is often caused by an accumulation of mutations which produce irreversible damage to the cell’s control mechanisms of survival and proliferation. Different mutations may affect these cellular mechanisms through a combination of molecular interactions which may be dynamically changing during cancer progression. It has been previously shown that cancer accumulates mutations over time. In this paper we address the problem of cancer heterogeneity by modeling cancer progression using somatic mutation and gene expression cross-sectional data. RESULTS: We propose a novel formulation of integrating somatic mutation and gene expression data to infer the temporal sequence of events from cross-sectional data. Using a mixed integer linear program we model the interaction between groups of different mutated genes and the resulting modifications at the gene expression level. Our approach identifies a partition of mutation events which gradually produce gene expression changes to a partition of genes over time. The proposed formulation is tested using both simulated data and real breast cancer data with matched somatic mutations and gene expression measurements from The Cancer Genome Atlas. First, we classify the genes as oncogenes or tumor suppressors based on the frequency of driver mutations. As expected, the most frequently mutated genes in breast cancer are PIK3CA and TP53 genes. Then, we select those genes with most frequent driver mutations and a set of genes known to play roles in cancer development. Furthermore, we apply the proposed mixed integer linear program to identify the temporal order in which genes mutate and, simultaneously, the changes they produce at the gene expression level during cancer progression.
In addition, we are able to identify known causal relationships between mutations and gene expression changes in PI3K/AKT and TP53 pathways. CONCLUSIONS: This paper proposes a new model to infer the temporal sequence in which mutations occur and lead to changes at the gene expression level during cancer progression. The approach is general and can be applied to any data sets with available somatic mutations and gene expression measurements. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-016-0255-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4727329
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-47273292016-01-27 Integrating mutation and gene expression cross-sectional data to infer cancer progression Fleck, Julia L. Pavel, Ana B. Cassandras, Christos G. BMC Syst Biol Methodology Article BACKGROUND: A major problem in identifying the best therapeutic targets for cancer is the molecular heterogeneity of the disease. Cancer is often caused by an accumulation of mutations which produce irreversible damage to the cell’s control mechanisms of survival and proliferation. Different mutations may affect these cellular mechanisms through a combination of molecular interactions which may be dynamically changing during cancer progression. It has been previously shown that cancer accumulates mutations over time. In this paper we address the problem of cancer heterogeneity by modeling cancer progression using somatic mutation and gene expression cross-sectional data. RESULTS: We propose a novel formulation of integrating somatic mutation and gene expression data to infer the temporal sequence of events from cross-sectional data. Using a mixed integer linear program we model the interaction between groups of different mutated genes and the resulting modifications at the gene expression level. Our approach identifies a partition of mutation events which gradually produce gene expression changes to a partition of genes over time. The proposed formulation is tested using both simulated data and real breast cancer data with matched somatic mutations and gene expression measurements from The Cancer Genome Atlas. First, we classify the genes as oncogenes or tumor suppressors based on the frequency of driver mutations. As expected, the most frequently mutated genes in breast cancer are PIK3CA and TP53 genes. Then, we select those genes with most frequent driver mutations and a set of genes known to play roles in cancer development.
Furthermore, we apply the proposed mixed integer linear program to identify the temporal order in which genes mutate and, simultaneously, the changes they produce at the gene expression level during cancer progression. In addition, we are able to identify known causal relationships between mutations and gene expression changes in PI3K/AKT and TP53 pathways. CONCLUSIONS: This paper proposes a new model to infer the temporal sequence in which mutations occur and lead to changes at the gene expression level during cancer progression. The approach is general and can be applied to any data sets with available somatic mutations and gene expression measurements. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-016-0255-6) contains supplementary material, which is available to authorized users. BioMed Central 2016-01-25 /pmc/articles/PMC4727329/ /pubmed/26810975 http://dx.doi.org/10.1186/s12918-016-0255-6 Text en © Fleck et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Fleck, Julia L.
Pavel, Ana B.
Cassandras, Christos G.
Integrating mutation and gene expression cross-sectional data to infer cancer progression
title Integrating mutation and gene expression cross-sectional data to infer cancer progression
title_full Integrating mutation and gene expression cross-sectional data to infer cancer progression
title_fullStr Integrating mutation and gene expression cross-sectional data to infer cancer progression
title_full_unstemmed Integrating mutation and gene expression cross-sectional data to infer cancer progression
title_short Integrating mutation and gene expression cross-sectional data to infer cancer progression
title_sort integrating mutation and gene expression cross-sectional data to infer cancer progression
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4727329/
https://www.ncbi.nlm.nih.gov/pubmed/26810975
http://dx.doi.org/10.1186/s12918-016-0255-6