
The Evolution of Word Composition in Metazoan Promoter Sequence

The field of molecular evolution provides many examples of the principle that molecular differences between species contain information about evolutionary history. One surprising case can be found in the frequency of short words in DNA: more closely related species have more similar word composition...


Bibliographic Details
Main Authors: Bush, Eliot C, Lahn, Bruce T
Format: Text
Language: English
Published: Public Library of Science 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1630712/
https://www.ncbi.nlm.nih.gov/pubmed/17083273
http://dx.doi.org/10.1371/journal.pcbi.0020150
_version_ 1782130632459026432
author Bush, Eliot C
Lahn, Bruce T
author_facet Bush, Eliot C
Lahn, Bruce T
author_sort Bush, Eliot C
collection PubMed
description The field of molecular evolution provides many examples of the principle that molecular differences between species contain information about evolutionary history. One surprising case can be found in the frequency of short words in DNA: more closely related species have more similar word compositions. Interest in this has often focused on its utility in deducing phylogenetic relationships. However, it is also of interest because of the opportunity it provides for studying the evolution of genome function. Word-frequency differences between species change too slowly to be purely the result of random mutational drift. Rather, their slow pattern of change reflects the direct or indirect action of purifying selection and the presence of functional constraints. Many such constraints are likely to exist, and an important challenge is to distinguish them. Here we develop a method to do so by isolating the effects acting at different word sizes. We apply our method to 2-, 4-, and 8-base-pair (bp) words across several classes of noncoding sequence. Our major result is that similarities in 8-bp word frequencies scale with evolutionary time for regions immediately upstream of genes. This association is present although weaker in intronic sequence, but cannot be detected in intergenic sequence using our method. In contrast, 2-bp and 4-bp word frequencies scale with time in all classes of noncoding sequence. These results suggest that different genomic processes are involved at different word sizes. The pattern in 2-bp and 4-bp words may be due to evolutionary changes in processes such as DNA replication and repair, as has been suggested before. The pattern in 8-bp words may reflect evolutionary changes in gene-regulatory machinery, such as changes in the frequencies of transcription-factor binding sites, or in the affinity of transcription factors for particular sequences.
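The abstract compares the frequencies of 2-, 4-, and 8-bp words between species but does not spell out the computation. As a rough illustration only, the sketch below counts overlapping k-bp words in a sequence and measures how far apart two frequency profiles are. The function names, the choice of overlapping windows, the skipping of words containing non-ACGT characters, and the Euclidean distance are all assumptions made for this example; it is not the authors' method, and in particular it does not attempt their step of isolating effects acting at different word sizes.

```python
from collections import Counter
from itertools import product
import math


def word_frequencies(seq, k):
    """Return the relative frequency of every k-bp word in seq.

    Assumption: words are counted in overlapping windows, and any
    window containing a character other than A, C, G, or T is skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Enumerate all 4**k possible words so every profile has the same keys.
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def frequency_distance(freq_a, freq_b):
    """Euclidean distance between two word-frequency profiles.

    Assumption: both profiles were built with the same k, so they
    share the same set of keys.
    """
    return math.sqrt(sum((freq_a[w] - freq_b[w]) ** 2 for w in freq_a))
```

Under these assumptions, one would call `word_frequencies` with k set to 2, 4, or 8 on the same class of sequence (for example, upstream regions) from two species and pass the two profiles to `frequency_distance`; a smaller value means more similar word composition.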
format Text
id pubmed-1630712
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1630712 2006-11-06 The Evolution of Word Composition in Metazoan Promoter Sequence Bush, Eliot C Lahn, Bruce T PLoS Comput Biol Research Article Public Library of Science 2006-11 2006-11-03 /pmc/articles/PMC1630712/ /pubmed/17083273 http://dx.doi.org/10.1371/journal.pcbi.0020150 Text en © 2006 Bush and Lahn. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Bush, Eliot C
Lahn, Bruce T
The Evolution of Word Composition in Metazoan Promoter Sequence
title The Evolution of Word Composition in Metazoan Promoter Sequence
title_full The Evolution of Word Composition in Metazoan Promoter Sequence
title_fullStr The Evolution of Word Composition in Metazoan Promoter Sequence
title_full_unstemmed The Evolution of Word Composition in Metazoan Promoter Sequence
title_short The Evolution of Word Composition in Metazoan Promoter Sequence
title_sort evolution of word composition in metazoan promoter sequence
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1630712/
https://www.ncbi.nlm.nih.gov/pubmed/17083273
http://dx.doi.org/10.1371/journal.pcbi.0020150
work_keys_str_mv AT busheliotc theevolutionofwordcompositioninmetazoanpromotersequence
AT lahnbrucet theevolutionofwordcompositioninmetazoanpromotersequence
AT busheliotc evolutionofwordcompositioninmetazoanpromotersequence
AT lahnbrucet evolutionofwordcompositioninmetazoanpromotersequence