
Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes

In this study, we pairwise-compared multiple genome regions, including genes, exons, coding DNA sequences (CDS), introns, and intergenic regions of 39 Animalia genomes, including Deuterostomia (27 species) and Protostomia (12 species), by applying established k-mer-based (alignment-free) comparison...


Bibliographic Details
Main authors: Sievers, Aaron; Wenz, Frederik; Hausmann, Michael; Hildenbrand, Georg
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211125/
https://www.ncbi.nlm.nih.gov/pubmed/30287792
http://dx.doi.org/10.3390/genes9100482
_version_ 1783367275151097856
author Sievers, Aaron
Wenz, Frederik
Hausmann, Michael
Hildenbrand, Georg
collection PubMed
description In this study, we pairwise-compared multiple genome regions, including genes, exons, coding DNA sequences (CDS), introns, and intergenic regions, of 39 Animalia genomes from Deuterostomia (27 species) and Protostomia (12 species), by applying established k-mer-based (alignment-free) comparison methods. We found strong correlations between the sequence structures of introns and intergenic regions, both within individual organisms and across wider phylogenetic ranges, indicating the conservation of certain structures over the full range of analyzed organisms. We analyzed these sequence structures by decomposing the correlation coefficients with respect to different sets of DNA words, which quantifies the contribution of each word set to the average correlation value. We found that the conserved structures within introns, within intergenic regions, and between the two were mainly a result of conserved tandem repeats with repeat units ≤ 2 bp (e.g., (AT)n), while other conserved sequence structures, such as those found between exons and CDS, were dominated by tandem repeats with repeat units of 3 bp and by more complex DNA word patterns. We conclude that the conservation between introns and intergenic regions indicates a shared function of these sequence structures. Moreover, the differences from conserved structures of known origin, in particular from the codon-driven conservation between exons and CDS, indicate that the k-mer composition-based functional properties of introns and intergenic regions may differ from those of exons and CDS.
format Online
Article
Text
id pubmed-6211125
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6211125 2018-11-02 Genes (Basel) Article MDPI 2018-10-04 /pmc/articles/PMC6211125/ /pubmed/30287792 http://dx.doi.org/10.3390/genes9100482 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes
topic Article
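
The description field of this record summarizes two steps: comparing regions by their k-mer composition, and splitting a correlation coefficient into contributions from sets of DNA words. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a Pearson correlation over relative k-mer frequencies (the record does not state which coefficient was used), and the function names, the toy sequences, and the `short_repeats` word set are hypothetical illustrations.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_spectrum(seq, k):
    """Relative frequency of every k-mer over ACGT, using overlapping windows."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)  # windows with other symbols are ignored
    return {w: counts[w] / total for w in words}


def correlation_contributions(spec_a, spec_b, word_sets):
    """Pearson r of two spectra, split into additive per-word-set parts.

    Assumption: the sets in word_sets are disjoint; words in no set go to "other".
    """
    words = list(spec_a)
    mean_a = sum(spec_a.values()) / len(words)
    mean_b = sum(spec_b.values()) / len(words)
    norm = sqrt(sum((spec_a[w] - mean_a) ** 2 for w in words)
                * sum((spec_b[w] - mean_b) ** 2 for w in words))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    # Each word adds one term to the numerator of r, so r is a sum over words.
    term = {w: (spec_a[w] - mean_a) * (spec_b[w] - mean_b) / norm for w in words}
    parts = {name: sum(term[w] for w in ws) for name, ws in word_sets.items()}
    assigned = set().union(*word_sets.values())
    parts["other"] = sum(v for w, v in term.items() if w not in assigned)
    return sum(term.values()), parts


def is_short_repeat(word):
    """True if the word is a tandem repeat with a repeat unit of at most 2 bp."""
    return word == (word[:2] * len(word))[:len(word)]


# Toy sequences standing in for an intron and an intergenic region (made up).
k = 4
intron = kmer_spectrum("ATATATATATGCATATATATCGATATAT", k)
intergenic = kmer_spectrum("TATATATATAGGCTATATATATACGTAT", k)
short_repeats = {w for w in intron if is_short_repeat(w)}

r, parts = correlation_contributions(intron, intergenic,
                                     {"short_repeats": short_repeats})
print(f"r = {r:.3f}")
for name, value in parts.items():
    print(f"  {name}: {value:.3f}")
```

Because the numerator of the Pearson coefficient is a sum over words, the per-set parts add up to r exactly, which is what makes a statement like "mainly a result of tandem repeats with repeat units ≤ 2 bp" quantifiable in this kind of analysis.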