Integrative genome-wide chromatin signature analysis using finite mixture models
Regulation of gene expression has been shown to involve not only the binding of transcription factors at target gene promoters but also the characterization of the histones around which DNA is wrapped. Some histone modifications, for example di-methylated histone H3 at lysine 4 (H3K4me2), have been s...
Main Authors: | Taslim, Cenny; Lin, Shili; Huang, Kun; Huang, Tim Hui-Ming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481451/ https://www.ncbi.nlm.nih.gov/pubmed/23134707 http://dx.doi.org/10.1186/1471-2164-13-S6-S3 |
_version_ | 1782247741565435904 |
---|---|
author | Taslim, Cenny; Lin, Shili; Huang, Kun; Huang, Tim Hui-Ming |
author_facet | Taslim, Cenny; Lin, Shili; Huang, Kun; Huang, Tim Hui-Ming |
author_sort | Taslim, Cenny |
collection | PubMed |
description | Regulation of gene expression has been shown to involve not only the binding of transcription factors at target gene promoters but also the characterization of the histones around which DNA is wrapped. Some histone modifications, for example di-methylated histone H3 at lysine 4 (H3K4me2), have been shown to bind to promoters and activate target genes. However, no clear pattern has been shown to predict human promoters. This paper proposes a novel quantitative approach to characterize patterns of promoter regions and predict novel and alternative promoters. We utilized high-throughput data generated using chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) on RNA Polymerase II (Pol-II) and H3K4me2. Common patterns of promoter regions are modeled using a mixture model involving double-exponential and uniform distributions. The fitted model was then used to search the entire genome for regions displaying similar patterns in order to find novel and alternative promoters. Regions that correlate highly with the common patterns are identified as putative novel promoters. We used the proposed algorithm, RNA-seq data and several transcript databases to find alternative promoters in the MCF7 (normal breast cancer) cell line. We found 7,235 high-confidence regions that display the identified promoter patterns. Of these, 4,167 regions (58%) can be mapped to RefSeq regions. 2,444 regions are in a gene body or overlap with transcripts (non-coding RNAs, ESTs, and transcripts that are predicted by RNA-seq data). Some of these may be potential alternative promoters. We also found 193 regions that map to enhancer regions (represented by androgen and estrogen receptor binding sites) and other regulatory regions such as CTCF (CCCTC binding factor) binding sites and CpG islands. Around 5% (431 regions) of these correlated regions do not overlap with any transcripts or regulatory regions, suggesting that these might be potential new promoters or markers for other annotations that are currently undiscovered. |
format | Online Article Text |
id | pubmed-3481451 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3481451 2012-11-02 Integrative genome-wide chromatin signature analysis using finite mixture models Taslim, Cenny Lin, Shili Huang, Kun Huang, Tim Hui-Ming BMC Genomics Research BioMed Central 2012-10-26 /pmc/articles/PMC3481451/ /pubmed/23134707 http://dx.doi.org/10.1186/1471-2164-13-S6-S3 Text en Copyright ©2012 Taslim et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Taslim, Cenny Lin, Shili Huang, Kun Huang, Tim Hui-Ming Integrative genome-wide chromatin signature analysis using finite mixture models |
title | Integrative genome-wide chromatin signature analysis using finite mixture models |
title_full | Integrative genome-wide chromatin signature analysis using finite mixture models |
title_fullStr | Integrative genome-wide chromatin signature analysis using finite mixture models |
title_full_unstemmed | Integrative genome-wide chromatin signature analysis using finite mixture models |
title_short | Integrative genome-wide chromatin signature analysis using finite mixture models |
title_sort | integrative genome-wide chromatin signature analysis using finite mixture models |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3481451/ https://www.ncbi.nlm.nih.gov/pubmed/23134707 http://dx.doi.org/10.1186/1471-2164-13-S6-S3 |
work_keys_str_mv | AT taslimcenny integrativegenomewidechromatinsignatureanalysisusingfinitemixturemodels AT linshili integrativegenomewidechromatinsignatureanalysisusingfinitemixturemodels AT huangkun integrativegenomewidechromatinsignatureanalysisusingfinitemixturemodels AT huangtimhuiming integrativegenomewidechromatinsignatureanalysisusingfinitemixturemodels |
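
The description above mentions a mixture of double-exponential (Laplace) and uniform distributions, followed by a correlation-based search for similar regions. The following is only a minimal sketch of that general idea, not the authors' implementation: the record does not say how the model was fitted, so the EM procedure, the simulated read positions, the window of ±1000, and every function name here (`fit_laplace_uniform`, `mixture_density`, `profile_correlation`) are assumptions made for illustration.

```python
import numpy as np


def laplace_pdf(x, mu, b):
    """Density of a double-exponential (Laplace) distribution."""
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)


def weighted_median(x, w):
    """Smallest x whose cumulative weight reaches half of the total weight."""
    order = np.argsort(x)
    cum = np.cumsum(w[order])
    return x[order][np.searchsorted(cum, 0.5 * cum[-1])]


def fit_laplace_uniform(x, lo, hi, n_iter=500, tol=1e-8):
    """EM fit of pi * Laplace(mu, b) + (1 - pi) * Uniform(lo, hi).

    Assumption: the uniform component is fixed on the window [lo, hi].
    """
    x = np.asarray(x, dtype=float)
    unif = np.where((x >= lo) & (x <= hi), 1.0 / (hi - lo), 0.0)
    mu = np.median(x)
    b = np.mean(np.abs(x - mu)) + 1e-9
    pi = 0.5
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of the Laplace component for each point.
        p_lap = pi * laplace_pdf(x, mu, b)
        total = p_lap + (1.0 - pi) * unif
        r = p_lap / total
        # M-step: weighted maximum-likelihood updates.
        pi = r.mean()
        mu = weighted_median(x, r)
        b = max(np.sum(r * np.abs(x - mu)) / np.sum(r), 1e-9)
        # Log-likelihood under the parameters used in this E-step.
        ll = np.sum(np.log(total))
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, b


def mixture_density(grid, pi, mu, b, lo, hi):
    """Fitted mixture density evaluated on a grid inside [lo, hi]."""
    return pi * laplace_pdf(grid, mu, b) + (1.0 - pi) / (hi - lo)


def profile_correlation(profile, template):
    """Pearson correlation between a binned signal and the fitted pattern."""
    return float(np.corrcoef(profile, template)[0, 1])


# Simulated positions relative to a hypothetical promoter centre (not real data).
rng = np.random.default_rng(0)
n = 5000
is_peak = rng.random(n) < 0.7
x = np.where(is_peak,
             rng.laplace(0.0, 100.0, n),
             rng.uniform(-1000.0, 1000.0, n))

pi_hat, mu_hat, b_hat = fit_laplace_uniform(x, -1000.0, 1000.0)

# Compare a binned profile of the same region with the fitted template.
edges = np.linspace(-1000.0, 1000.0, 41)
centers = 0.5 * (edges[:-1] + edges[1:])
counts, _ = np.histogram(x, bins=edges)
template = mixture_density(centers, pi_hat, mu_hat, b_hat, -1000.0, 1000.0)
corr = profile_correlation(counts, template)
print(pi_hat, mu_hat, b_hat, corr)
```

In a genome-wide scan one would presumably slide such a template over binned coverage and keep windows whose correlation exceeds a chosen cutoff; the cutoff and bin width are not given in this record and would need to be taken from the full article.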