Sparse canonical correlation to identify breast cancer related genes regulated by copy number aberrations

BACKGROUND: Copy number aberrations (CNAs) in cancer affect disease outcomes by regulating molecular phenotypes, such as gene expressions, that drive important biological processes. To gain comprehensive insights into molecular biomarkers for cancer, it is critical to identify key groups of CNAs, th...

Bibliographic Details
Main Authors: Dutta, Diptavo, Sen, Ananda, Satagopan, Jaya
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803132/
https://www.ncbi.nlm.nih.gov/pubmed/36584096
http://dx.doi.org/10.1371/journal.pone.0276886
author Dutta, Diptavo
Sen, Ananda
Satagopan, Jaya
collection PubMed
description BACKGROUND: Copy number aberrations (CNAs) in cancer affect disease outcomes by regulating molecular phenotypes, such as gene expressions, that drive important biological processes. To gain comprehensive insights into molecular biomarkers for cancer, it is critical to identify key groups of CNAs, the associated gene modules, regulatory modules, and their downstream effect on outcomes. METHODS: In this paper, we demonstrate an innovative use of sparse canonical correlation analysis (sCCA) to effectively identify the ensemble of CNAs, and gene modules in the context of binary and censored disease endpoints. Our approach detects potentially orthogonal gene expression modules which are highly correlated with sets of CNA and then identifies the genes within these modules that are associated with the outcome. RESULTS: Analyzing clinical and genomic data on 1,904 breast cancer patients from the METABRIC study, we found 14 gene modules to be regulated by groups of proximally located CNA sites. We validated this finding using an independent set of 1,077 breast invasive carcinoma samples from The Cancer Genome Atlas (TCGA). Our analysis of 7 clinical endpoints identified several novel and interpretable regulatory associations, highlighting the role of CNAs in key biological pathways and processes for breast cancer. Genes significantly associated with the outcomes were enriched for early estrogen response pathway, DNA repair pathways as well as targets of transcription factors such as E2F4, MYC, and ETS1 that have recognized roles in tumor characteristics and survival. Subsequent meta-analysis across the endpoints further identified several genes through the aggregation of weaker associations. CONCLUSIONS: Our findings suggest that sCCA analysis can aggregate weaker associations to identify interpretable and important genes, modules, and clinically consequential pathways.
format Online
Article
Text
id pubmed-9803132
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9803132 2022-12-31 PLoS One Research Article Public Library of Science 2022-12-30 /pmc/articles/PMC9803132/ /pubmed/36584096 http://dx.doi.org/10.1371/journal.pone.0276886 Text en © 2022 Dutta et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Sparse canonical correlation to identify breast cancer related genes regulated by copy number aberrations
topic Research Article