
Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences

Bibliographic Details
Main Authors: De Silva, Dilrini R., Nichols, Richard, Elgar, Greg
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4111549/
https://www.ncbi.nlm.nih.gov/pubmed/25062004
http://dx.doi.org/10.1371/journal.pone.0103357
collection PubMed
description Comparison of polymorphism at synonymous and non-synonymous sites in protein-coding DNA can provide evidence for selective constraint. Non-coding DNA that forms part of the regulatory landscape presents more of a challenge since there is not such a clear-cut distinction between sites under stronger and weaker selective constraint. Here, we consider putative regulatory elements termed Conserved Non-coding Elements (CNEs) defined by their high level of sequence identity across all vertebrates. Some mutations in these regions have been implicated in developmental disorders; we analyse CNE polymorphism data to investigate whether such deleterious effects are widespread in humans. Single nucleotide variants from the HapMap and 1000 Genomes Projects were mapped across nearly 2000 CNEs. In the 1000 Genomes data we find a significant excess of rare derived alleles in CNEs relative to coding sequences; this pattern is absent in HapMap data, apparently obscured by ascertainment bias. The distribution of polymorphism within CNEs is not uniform; we could identify two categories of sites by exploiting deep vertebrate alignments: stretches that are non-variant, and those that have at least one substitution. The conserved category has fewer polymorphic sites and a greater excess of rare derived alleles, which can be explained by a large proportion of sites under strong purifying selection within humans, higher than that for non-synonymous sites in most protein-coding regions, and comparable to that at the strongly conserved trans-dev genes. Conversely, the more evolutionarily labile CNE sites have an allele frequency distribution not significantly different from non-synonymous sites. Future studies should exploit genome-wide re-sequencing to obtain better coverage in selected non-coding regions, given the likelihood that mutations in evolutionarily conserved enhancer sequences are deleterious. Discovery pipelines should validate non-coding variants to aid in identifying causal and risk-enhancing variants in complex disorders, in contrast to the current focus on exome sequencing.
format Online
Article
Text
id pubmed-4111549
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4111549 2014-07-29 Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences De Silva, Dilrini R. Nichols, Richard Elgar, Greg PLoS One Research Article Public Library of Science 2014-07-25 /pmc/articles/PMC4111549/ /pubmed/25062004 http://dx.doi.org/10.1371/journal.pone.0103357 Text en © 2014 De Silva et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article