Analysis of Sequence Conservation at Nucleotide Resolution

Bibliographic Details
Main Authors: Asthana, Saurabh; Roytberg, Mikhail; Stamatoyannopoulos, John; Sunyaev, Shamil
Format: Text
Language: English
Published: Public Library of Science 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2230682/
https://www.ncbi.nlm.nih.gov/pubmed/18166073
http://dx.doi.org/10.1371/journal.pcbi.0030254
_version_ 1782150239797379072
author Asthana, Saurabh
Roytberg, Mikhail
Stamatoyannopoulos, John
Sunyaev, Shamil
collection PubMed
description One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes are available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans as evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved “chunks.” Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.
format Text
id pubmed-2230682
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2230682 2008-02-05 Analysis of Sequence Conservation at Nucleotide Resolution Asthana, Saurabh Roytberg, Mikhail Stamatoyannopoulos, John Sunyaev, Shamil PLoS Comput Biol Research Article Public Library of Science 2007-12 2007-12-28 /pmc/articles/PMC2230682/ /pubmed/18166073 http://dx.doi.org/10.1371/journal.pcbi.0030254 Text en © 2007 Asthana et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Analysis of Sequence Conservation at Nucleotide Resolution
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2230682/
https://www.ncbi.nlm.nih.gov/pubmed/18166073
http://dx.doi.org/10.1371/journal.pcbi.0030254