Motif-guided sparse decomposition of gene expression data for regulatory module identification

Bibliographic Details
Main Authors: Gong, Ting; Xuan, Jianhua; Chen, Li; Riggins, Rebecca B; Li, Huai; Hoffman, Eric P; Clarke, Robert; Wang, Yue
Format: Text
Language: English
Published: BioMed Central 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3072956/
https://www.ncbi.nlm.nih.gov/pubmed/21426557
http://dx.doi.org/10.1186/1471-2105-12-82
collection PubMed
description BACKGROUND: Genes work coordinately as gene modules or gene networks. Various computational approaches have been proposed to find gene modules based on gene expression data; for example, gene clustering is a popular method for grouping genes with similar gene expression patterns. However, traditional gene clustering often yields unsatisfactory results for regulatory module identification because the resulting gene clusters are co-expressed but not necessarily co-regulated.
RESULTS: We propose a novel approach, motif-guided sparse decomposition (mSD), to identify gene regulatory modules by integrating gene expression data and DNA sequence motif information. The mSD approach is implemented as a two-step algorithm comprising estimates of (1) transcription factor activity and (2) the strength of the predicted gene regulation event(s). Specifically, a motif-guided clustering method is first developed to estimate the transcription factor activity of a gene module; sparse component analysis is then applied to estimate the regulation strength, and so predict the target genes of the transcription factors. The mSD approach was first tested for its improved performance in finding regulatory modules using simulated and real yeast data, revealing functionally distinct gene modules enriched with biologically validated transcription factors. We then demonstrated the efficacy of the mSD approach on breast cancer cell line data and uncovered several important gene regulatory modules related to endocrine therapy of breast cancer.
CONCLUSION: We have developed a new integrated strategy, namely motif-guided sparse decomposition (mSD) of gene expression data, for regulatory module identification. The mSD method features a novel motif-guided clustering method for transcription factor activity estimation by finding a balance between co-regulation and co-expression. The mSD method further utilizes a sparse decomposition method for regulation strength estimation. The experimental results show that such a motif-guided strategy can provide context-specific regulatory modules in both yeast and breast cancer studies.
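The two-step idea in the abstract (estimate transcription factor activity from motif-bearing genes, then estimate sparse regulation strengths) can be illustrated with a minimal sketch. This is not the authors' mSD implementation: the weighting rule in `estimate_tf_activity`, the L1-penalized least-squares solver (ISTA) in `estimate_regulation_strength`, and all function names and parameters (`alpha`, `lam`, `n_iter`) are assumptions chosen only to show the general shape of such a pipeline.

```python
import numpy as np


def soft_threshold(Z, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)


def estimate_tf_activity(X, M, alpha=0.5, n_iter=5):
    """Step 1 (illustrative): one activity profile per transcription factor.

    X: genes x samples expression matrix.
    M: genes x TFs matrix of non-negative motif scores.
    Each gene is weighted by its motif score (co-regulation evidence),
    scaled by how well it correlates with the current module centroid
    (co-expression evidence). The weighting rule is an assumption.
    """
    Xc = X - X.mean(axis=1, keepdims=True)          # center each gene
    norms = np.linalg.norm(Xc, axis=1) + 1e-12
    A = np.zeros((M.shape[1], X.shape[1]))
    for k in range(M.shape[1]):
        motif_w = M[:, k] / (M[:, k].max() + 1e-12)
        w = motif_w.copy()
        for _ in range(n_iter):
            centroid = w @ Xc / (w.sum() + 1e-12)
            corr = Xc @ centroid / (norms * (np.linalg.norm(centroid) + 1e-12))
            w = motif_w * (alpha + (1.0 - alpha) * np.clip(corr, 0.0, None))
        A[k] = w @ Xc / (w.sum() + 1e-12)
    return A


def estimate_regulation_strength(X, A, lam=0.5, n_iter=200):
    """Step 2 (illustrative): sparse S with X ~ S @ A.

    Minimizes 0.5 * ||Xc - S A||_F^2 + lam * ||S||_1 by ISTA
    (proximal gradient descent). S is genes x TFs; non-zero entries
    are read as predicted regulation events.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    L = np.linalg.norm(A @ A.T, 2) + 1e-12          # Lipschitz constant of the gradient
    S = np.zeros((X.shape[0], A.shape[0]))
    for _ in range(n_iter):
        grad = (S @ A - Xc) @ A.T
        S = soft_threshold(S - grad / L, lam / L)
    return S


if __name__ == "__main__":
    # Synthetic data: two modules of 20 genes each plus 20 unregulated genes.
    rng = np.random.default_rng(0)
    true_A = rng.normal(size=(2, 30))
    X = 0.1 * rng.normal(size=(60, 30))
    X[:20] += true_A[0]
    X[20:40] += true_A[1]
    M = np.zeros((60, 2))
    M[:20, 0] = 1.0
    M[20:40, 1] = 1.0
    M[40:45, 0] = 1.0                               # false-positive motif hits
    A = estimate_tf_activity(X, M)
    S = estimate_regulation_strength(X, A)
    print("non-zero strengths per TF:", (np.abs(S) > 1e-6).sum(axis=0))
```

In this sketch the motif matrix restricts which genes can shape an activity profile, while the L1 penalty drives most regulation strengths to exactly zero, so the target genes of each factor can be read off the non-zero entries of `S`.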
format Text
id pubmed-3072956
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3072956 2011-04-09 BMC Bioinformatics Methodology Article BioMed Central 2011-03-22 /pmc/articles/PMC3072956/ /pubmed/21426557 http://dx.doi.org/10.1186/1471-2105-12-82 Text en Copyright ©2011 Gong et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Motif-guided sparse decomposition of gene expression data for regulatory module identification
topic Methodology Article