Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences
Comparison of polymorphism at synonymous and non-synonymous sites in protein-coding DNA can provide evidence for selective constraint. Non-coding DNA that forms part of the regulatory landscape presents more of a challenge since there is not such a clear-cut distinction between sites under stronger...
Main authors: | De Silva, Dilrini R.; Nichols, Richard; Elgar, Greg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4111549/ https://www.ncbi.nlm.nih.gov/pubmed/25062004 http://dx.doi.org/10.1371/journal.pone.0103357 |
_version_ | 1782328098061025280 |
---|---|
author | De Silva, Dilrini R. Nichols, Richard Elgar, Greg |
author_sort | De Silva, Dilrini R. |
collection | PubMed |
description | Comparison of polymorphism at synonymous and non-synonymous sites in protein-coding DNA can provide evidence for selective constraint. Non-coding DNA that forms part of the regulatory landscape presents more of a challenge since there is not such a clear-cut distinction between sites under stronger and weaker selective constraint. Here, we consider putative regulatory elements termed Conserved Non-coding Elements (CNEs) defined by their high level of sequence identity across all vertebrates. Some mutations in these regions have been implicated in developmental disorders; we analyse CNE polymorphism data to investigate whether such deleterious effects are widespread in humans. Single nucleotide variants from the HapMap and 1000 Genomes Projects were mapped across nearly 2000 CNEs. In the 1000 Genomes data we find a significant excess of rare derived alleles in CNEs relative to coding sequences; this pattern is absent in HapMap data, apparently obscured by ascertainment bias. The distribution of polymorphism within CNEs is not uniform; we could identify two categories of sites by exploiting deep vertebrate alignments: stretches that are non-variant, and those that have at least one substitution. The conserved category has fewer polymorphic sites and a greater excess of rare derived alleles, which can be explained by a large proportion of sites under strong purifying selection within humans – higher than that for non-synonymous sites in most protein coding regions, and comparable to that at the strongly conserved trans-dev genes. Conversely, the more evolutionarily labile CNE sites have an allele frequency distribution not significantly different from non-synonymous sites. Future studies should exploit genome-wide re-sequencing to obtain better coverage in selected non-coding regions, given the likelihood that mutations in evolutionarily conserved enhancer sequences are deleterious. 
Discovery pipelines should validate non-coding variants to aid in identifying causal and risk-enhancing variants in complex disorders, in contrast to the current focus on exome sequencing. |
format | Online Article Text |
id | pubmed-4111549 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4111549 2014-07-29 Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences De Silva, Dilrini R. Nichols, Richard Elgar, Greg PLoS One Research Article Public Library of Science 2014-07-25 /pmc/articles/PMC4111549/ /pubmed/25062004 http://dx.doi.org/10.1371/journal.pone.0103357 Text en © 2014 De Silva et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4111549/ https://www.ncbi.nlm.nih.gov/pubmed/25062004 http://dx.doi.org/10.1371/journal.pone.0103357 |