Cargando…
Multiple novel promoter-architectures revealed by decoding the hidden heterogeneity within the genome
An important question in biology is how different promoter-architectures contribute to the diversity in regulation of transcription initiation. A step forward has been the production of genome-wide maps of transcription start sites (TSSs) using high-throughput sequencing. However, the subsequent ste...
Autor principal: | |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227772/ https://www.ncbi.nlm.nih.gov/pubmed/25326324 http://dx.doi.org/10.1093/nar/gku924 |
Sumario: | An important question in biology is how different promoter-architectures contribute to the diversity in regulation of transcription initiation. A step forward has been the production of genome-wide maps of transcription start sites (TSSs) using high-throughput sequencing. However, the subsequent step of characterizing promoters and their functions is still largely done on the basis of previously established promoter-elements like the TATA-box in eukaryotes or the -10 box in bacteria. Unfortunately, a majority of promoters and their activities cannot be explained by these few elements. Traditional motif discovery methods that identify novel elements also fail here, because TSS neighborhoods are often highly heterogeneous containing no overrepresented motif. We present a new, organism-independent method that explicitly models this heterogeneity while unraveling different promoter-architectures. For example, in five bacteria, we detect the presence of a pyrimidine preceding the TSS under very specific circumstances. In tuberculosis, we show for the first time that the spacing between the bacterial 10-motif and TSS is utilized by the pathogen for dynamic gene-regulation. In eukaryotes, we identify several new elements that are important for development. Identified promoter-architectures show differential patterns of evolution, chromatin structure and TSS spread, suggesting distinct regulatory functions. This work highlights the importance of characterizing heterogeneity within high-throughput genomic data rather than analyzing average patterns of nucleotide composition. |
---|