Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes
In this study, we pairwise-compared multiple genome regions, including genes, exons, coding DNA sequences (CDS), introns, and intergenic regions of 39 Animalia genomes, including Deuterostomia (27 species) and Protostomia (12 species), by applying established k-mer-based (alignment-free) comparison...
Main authors: | Sievers, Aaron, Wenz, Frederik, Hausmann, Michael, Hildenbrand, Georg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211125/ https://www.ncbi.nlm.nih.gov/pubmed/30287792 http://dx.doi.org/10.3390/genes9100482 |
_version_ | 1783367275151097856 |
---|---|
author | Sievers, Aaron Wenz, Frederik Hausmann, Michael Hildenbrand, Georg |
author_facet | Sievers, Aaron Wenz, Frederik Hausmann, Michael Hildenbrand, Georg |
author_sort | Sievers, Aaron |
collection | PubMed |
description | In this study, we pairwise-compared multiple genome regions, including genes, exons, coding DNA sequences (CDS), introns, and intergenic regions of 39 Animalia genomes, including Deuterostomia (27 species) and Protostomia (12 species), by applying established k-mer-based (alignment-free) comparison methods. We found strong correlations between the sequence structure of introns and intergenic regions, individual organisms, and within wider phylogenetical ranges, indicating the conservation of certain structures over the full range of analyzed organisms. We analyzed these sequence structures by quantifying the contribution of different sets of DNA words to the average correlation value by decomposing the correlation coefficients with respect to these word sets. We found that the conserved structures within introns, intergenic regions, and between the two were mainly a result of conserved tandem repeats with repeat units ≤ 2 bp (e.g., (AT)(n)), while other conserved sequence structures, such as those found between exons and CDS, were dominated by tandem repeats with repeat unit sizes of 3 bp in length and more complex DNA word patterns. We conclude that the conservation between intron and intergenic regions indicates a shared function of these sequence structures. Also, the similar differences in conserved structures with known origin, especially to the conservation between exons and CDS resulting from DNA codons, indicate that k-mer composition-based functional properties of introns and intergenic regions may differ from those of exons and CDS. |
format | Online Article Text |
id | pubmed-6211125 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-62111252018-11-02 Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes Sievers, Aaron Wenz, Frederik Hausmann, Michael Hildenbrand, Georg Genes (Basel) Article In this study, we pairwise-compared multiple genome regions, including genes, exons, coding DNA sequences (CDS), introns, and intergenic regions of 39 Animalia genomes, including Deuterostomia (27 species) and Protostomia (12 species), by applying established k-mer-based (alignment-free) comparison methods. We found strong correlations between the sequence structure of introns and intergenic regions, individual organisms, and within wider phylogenetical ranges, indicating the conservation of certain structures over the full range of analyzed organisms. We analyzed these sequence structures by quantifying the contribution of different sets of DNA words to the average correlation value by decomposing the correlation coefficients with respect to these word sets. We found that the conserved structures within introns, intergenic regions, and between the two were mainly a result of conserved tandem repeats with repeat units ≤ 2 bp (e.g., (AT)(n)), while other conserved sequence structures, such as those found between exons and CDS, were dominated by tandem repeats with repeat unit sizes of 3 bp in length and more complex DNA word patterns. We conclude that the conservation between intron and intergenic regions indicates a shared function of these sequence structures. Also, the similar differences in conserved structures with known origin, especially to the conservation between exons and CDS resulting from DNA codons, indicate that k-mer composition-based functional properties of introns and intergenic regions may differ from those of exons and CDS. MDPI 2018-10-04 /pmc/articles/PMC6211125/ /pubmed/30287792 http://dx.doi.org/10.3390/genes9100482 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Sievers, Aaron Wenz, Frederik Hausmann, Michael Hildenbrand, Georg Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title | Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title_full | Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title_fullStr | Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title_full_unstemmed | Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title_short | Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes |
title_sort | conservation of k-mer composition and correlation contribution between introns and intergenic regions of animalia genomes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211125/ https://www.ncbi.nlm.nih.gov/pubmed/30287792 http://dx.doi.org/10.3390/genes9100482 |
work_keys_str_mv | AT sieversaaron conservationofkmercompositionandcorrelationcontributionbetweenintronsandintergenicregionsofanimaliagenomes AT wenzfrederik conservationofkmercompositionandcorrelationcontributionbetweenintronsandintergenicregionsofanimaliagenomes AT hausmannmichael conservationofkmercompositionandcorrelationcontributionbetweenintronsandintergenicregionsofanimaliagenomes AT hildenbrandgeorg conservationofkmercompositionandcorrelationcontributionbetweenintronsandintergenicregionsofanimaliagenomes |
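
The description field above outlines a k-mer-based comparison in which correlation coefficients between regions are decomposed with respect to sets of DNA words. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a Pearson correlation over relative k-mer frequencies and splits it into additive per-word terms. The function names, the toy sequences, and the choice of k = 2 are all hypothetical and chosen only for illustration; the record does not specify the exact normalization the article uses.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of every overlapping k-mer over the alphabet ACGT."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return {w: (c / total if total else 0.0) for w, c in counts.items()}


def word_contributions(freq_a, freq_b):
    """Per-word terms whose sum equals the Pearson correlation of two spectra."""
    words = sorted(freq_a)
    n = len(words)
    mean_a = sum(freq_a[w] for w in words) / n
    mean_b = sum(freq_b[w] for w in words) / n
    dev_a = {w: freq_a[w] - mean_a for w in words}
    dev_b = {w: freq_b[w] - mean_b for w in words}
    norm = sqrt(sum(d * d for d in dev_a.values())
                * sum(d * d for d in dev_b.values()))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    return {w: dev_a[w] * dev_b[w] / norm for w in words}


def set_contribution(contributions, word_set):
    """Share of the correlation coefficient attributable to one word set."""
    return sum(contributions[w] for w in word_set)


# Toy example (assumed sequences, not data from the article).
intron_like = "ATATATATATGCATATATAT"
intergenic_like = "TATATATATACGTATATATA"
contrib = word_contributions(kmer_frequencies(intron_like, 2),
                             kmer_frequencies(intergenic_like, 2))
r = sum(contrib.values())                       # full Pearson correlation
at_repeat = set_contribution(contrib, {"AT", "TA"})  # words of the (AT)n repeat
print(f"r = {r:.3f}, contribution of AT/TA words = {at_repeat:.3f}")
```

Because the per-word terms are additive, the contributions of any partition of the word space sum to the full coefficient, which is what makes it possible to attribute part of a correlation to, for example, short tandem-repeat words.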