Integrative analysis of survival-associated gene sets in breast cancer

BACKGROUND: Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a...

Bibliographic Details
Main Authors: Varn, Frederick S, Ung, Matthew H, Lou, Shao Ke, Cheng, Chao
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4359519/
https://www.ncbi.nlm.nih.gov/pubmed/25881247
http://dx.doi.org/10.1186/s12920-015-0086-0
author Varn, Frederick S
Ung, Matthew H
Lou, Shao Ke
Cheng, Chao
collection PubMed
description BACKGROUND: Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient’s cancer. Identifying robust gene sets that are consistently predictive of a patient’s clinical outcome has become one of the main challenges in the field.
METHODS: We applied our previously established BASE algorithm to patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set’s activity level in a given patient. We used this metric, along with patient time-to-event data, to perform survival analyses and identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network’s topology and applied the GSAS metric to characterize its role in patient survival.
RESULTS: Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated with patient survival when used alone. Most interestingly, removal of the genes in this gene set from the MSigDB gene pool resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression.
CONCLUSIONS: The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used this metric to identify predictive gene sets and to construct a novel gene set containing genes heavily involved in cancer progression.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12920-015-0086-0) contains supplementary material, which is available to authorized users.
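The abstract mentions a gene set network built from component gene overlap but does not state how overlap was scored. The sketch below is only an illustration of the general idea, not the authors' method: it assumes the Jaccard index as the overlap measure, an arbitrary edge threshold of 0.25, and made-up gene set and gene names. The helper names `jaccard`, `overlap_network`, and `shared_genes` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_network(gene_sets, threshold=0.25):
    """Return edges (name1, name2, score) for pairs with overlap >= threshold.

    The threshold is an assumed, illustrative cutoff.
    """
    edges = []
    named = sorted(gene_sets.items(), key=lambda kv: kv[0])
    for (n1, g1), (n2, g2) in combinations(named, 2):
        score = jaccard(g1, g2)
        if score >= threshold:
            edges.append((n1, n2, score))
    return edges


def shared_genes(gene_sets, min_sets=2):
    """Return the genes that appear in at least `min_sets` gene sets."""
    counts = {}
    for genes in gene_sets.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n >= min_sets}


# Toy input with placeholder names (not real MSigDB content).
toy_sets = {
    "SET_A": {"G1", "G2", "G3", "G4"},
    "SET_B": {"G2", "G3", "G4", "G5"},
    "SET_C": {"G6", "G7"},
}
edges = overlap_network(toy_sets)
core = shared_genes(toy_sets)
```

With this toy input, only SET_A and SET_B are connected, and the genes they share form the candidate "core" set. A real analysis would load gene sets from MSigDB files and would likely need a principled choice of overlap measure and cutoff.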
format Online
Article
Text
id pubmed-4359519
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4359519 2015-03-15 BMC Med Genomics Research Article BioMed Central 2015-03-12 /pmc/articles/PMC4359519/ /pubmed/25881247 http://dx.doi.org/10.1186/s12920-015-0086-0 Text en © Varn et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Integrative analysis of survival-associated gene sets in breast cancer
topic Research Article