
Evolution of genomic sequence inhomogeneity at mid-range scales

BACKGROUND: Mid-range inhomogeneity or MRI is the significant enrichment of particular nucleotides in genomic sequences extending from 30 up to several thousands of nucleotides. The best-known manifestation of MRI is CpG islands representing CG-rich regions. Recently it was demonstrated that MRI cou...

Bibliographic Details
Main authors: Prakash, Ashwin, Shepard, Samuel S, He, Jie, Hart, Benjamin, Chen, Miao, Amarachintha, Surya P, Mileyeva-Biebesheimer, Olga, Bechtel, Jason, Fedorov, Alexei
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2779198/
https://www.ncbi.nlm.nih.gov/pubmed/19891785
http://dx.doi.org/10.1186/1471-2164-10-513
author Prakash, Ashwin
Shepard, Samuel S
He, Jie
Hart, Benjamin
Chen, Miao
Amarachintha, Surya P
Mileyeva-Biebesheimer, Olga
Bechtel, Jason
Fedorov, Alexei
collection PubMed
description BACKGROUND: Mid-range inhomogeneity or MRI is the significant enrichment of particular nucleotides in genomic sequences extending from 30 up to several thousands of nucleotides. The best-known manifestation of MRI is CpG islands representing CG-rich regions. Recently it was demonstrated that MRI could be observed not only for G+C content but also for all other nucleotide pairings (e.g. A+G and G+T) as well as for individual bases. Various types of MRI regions are 4-20 times enriched in mammalian genomes compared to their occurrences in random models.
RESULTS: This paper explores how different types of mutations change MRI regions. Human, chimpanzee and Macaca mulatta genomes were aligned to study the projected effects of substitutions and indels on human sequence evolution within both MRI regions and control regions of average nucleotide composition. Over 18.8 million fixed point substitutions, 3.9 million SNPs, and indels spanning 6.9 Mb were procured and evaluated in human. They include 1.8 Mb substitutions and 1.9 Mb indels within MRI regions. Ancestral and mutant (derived) alleles for substitutions have been determined. Substitutions were grouped according to their fixation within human populations: fixed substitutions (from the human-chimp-macaca alignment), major SNPs (>80% mutant allele frequency within humans), medium SNPs (20%-80% mutant allele frequency), minor SNPs (3%-20%), and rare SNPs (<3%). Data on short (<3 bp) and medium-length (3-50 bp) insertions and deletions within MRI regions and appropriate control regions were analyzed for the effect of indels on the expansion or diminution of such regions as well as on changing nucleotide composition.
CONCLUSION: MRI regions have comparable levels of de novo mutations to the control genomic sequences with average base composition. De novo substitutions rapidly erode MRI regions, bringing their nucleotide composition toward genome-average levels. However, those substitutions that favor the maintenance of MRI properties have a higher chance to spread through the entire population. Indels have a clear tendency to maintain MRI features yet they have a smaller impact than substitutions. All in all, the observed fixation bias for mutations helps to preserve MRI regions during evolution.
format Text
id pubmed-2779198
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2779198 2009-11-19 Evolution of genomic sequence inhomogeneity at mid-range scales BMC Genomics Research article BioMed Central 2009-11-05 /pmc/articles/PMC2779198/ /pubmed/19891785 http://dx.doi.org/10.1186/1471-2164-10-513 Text en Copyright ©2009 Prakash et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evolution of genomic sequence inhomogeneity at mid-range scales
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2779198/
https://www.ncbi.nlm.nih.gov/pubmed/19891785
http://dx.doi.org/10.1186/1471-2164-10-513