Structure, clustering and functional insights of repeats configurations in the upstream promoter region of the human coding genes
BACKGROUND: Repetitive DNA sequences (Repeats) are significant regions in the human genome that have a specific genomic distribution, structure, and several binding sites for genome architecture and function. In consequence, the possible configurations of Repeats in specific and dynamic regions like...
Main authors: | Tobar-Tosse, Fabian; Veléz, Patricia E.; Ocampo-Toro, Eliana; Moreno, Pedro A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288848/ https://www.ncbi.nlm.nih.gov/pubmed/30537933 http://dx.doi.org/10.1186/s12864-018-5196-6 |
_version_ | 1783379869412884480 |
---|---|
author | Tobar-Tosse, Fabian; Veléz, Patricia E.; Ocampo-Toro, Eliana; Moreno, Pedro A. |
author_sort | Tobar-Tosse, Fabian |
collection | PubMed |
description | BACKGROUND: Repetitive DNA sequences (Repeats) are significant regions of the human genome that have a specific genomic distribution and structure, and several binding sites relevant to genome architecture and function. Consequently, the possible configurations of Repeats in specific and dynamic regions such as gene promoters could define footprints for molecular mechanisms, pathways, and cell function beyond their density in the genome. Here we explored the distribution of Repeats in the upstream promoter region of human coding genes, aiming to identify specific configurations, clusters, and the functional meaning of those elements. Our method includes structural descriptions, hierarchical clustering, pathway association, and functional enrichment analysis. RESULTS: We report several configurations of Repeats in the upstream promoter region (UPR), which define 2729 patterns for 80% of human coding genes. There are 47 types of Repeats in these configurations; the most frequent were Alu, Low_complexity, MIR, Simple_repeat, LINE/L2, LINE/L1, hAT-Charlie, and ERV1. The distribution, length, and high frequency of Repeats in the UPR define several patterns and clusters, where the minimum frequency of configuration among Repeats was higher than 0.7. We found those clusters to be associated with cellular pathways and ontologies; thus, it was plausible to link groups of Repeats to specific functional insights. For example, pathways for Genetic Information Processing or Metabolism show particular groups of Repeats with specific configurations. CONCLUSION: Based on these findings, we propose that specific configurations of repetitive elements describe frequent patterns in the upstream promoter for sets of human coding genes, which are correlated with specific and essential cell pathways and functions. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5196-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6288848 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6288848 2018-12-14 Structure, clustering and functional insights of repeats configurations in the upstream promoter region of the human coding genes. Tobar-Tosse, Fabian; Veléz, Patricia E.; Ocampo-Toro, Eliana; Moreno, Pedro A. BMC Genomics. Research. BioMed Central 2018-12-11 /pmc/articles/PMC6288848/ /pubmed/30537933 http://dx.doi.org/10.1186/s12864-018-5196-6 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Structure, clustering and functional insights of repeats configurations in the upstream promoter region of the human coding genes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288848/ https://www.ncbi.nlm.nih.gov/pubmed/30537933 http://dx.doi.org/10.1186/s12864-018-5196-6 |