
CEGA—a catalog of conserved elements from genomic alignments


Bibliographic Details
Main Authors: Dousse, Aline; Junier, Thomas; Zdobnov, Evgeny M.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects: Database Issue
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702837/
https://www.ncbi.nlm.nih.gov/pubmed/26527719
http://dx.doi.org/10.1093/nar/gkv1163
author Dousse, Aline
Junier, Thomas
Zdobnov, Evgeny M.
collection PubMed
description By identifying genomic sequence regions conserved among several species, comparative genomics offers opportunities to discover putatively functional elements without any prior knowledge of what these functions might be. Comparative analyses across mammals estimated 4–5% of the human genome to be functionally constrained, a much larger fraction than the 1–2% occupied by annotated protein-coding or RNA genes. Such functionally constrained yet unannotated regions have been referred to as conserved non-coding sequences (CNCs) or ultra-conserved elements (UCEs), which remain largely uncharacterized but probably form a highly heterogeneous group of elements including enhancers, promoters, motifs, and others. To facilitate the study of such CNCs/UCEs, we present our resource of Conserved Elements from Genomic Alignments (CEGA), accessible from http://cega.ezlab.org. Harnessing the power of multiple species comparisons to detect genomic elements under purifying selection, CEGA provides a comprehensive set of CNCs identified at different radiations along the vertebrate lineage. Evolutionary constraint is identified using threshold-free phylogenetic modeling of unbiased and sensitive global alignments of genomic synteny blocks identified using protein orthology. We identified CNCs independently for five vertebrate clades, each referring to a different last common ancestor and therefore to an overlapping but varying set of CNCs with 24 488 in vertebrates, 241 575 in amniotes, 709 743 in Eutheria, 642 701 in Boreoeutheria and 612 364 in Euarchontoglires, spanning from 6 Mbp in vertebrates to 119 Mbp in Euarchontoglires. The dynamic CEGA web interface displays alignments, genomic locations, as well as biologically relevant data to help prioritize and select CNCs of interest for further functional investigations.
format Online
Article
Text
id pubmed-4702837
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4702837 2016-01-07
Nucleic Acids Res, Database Issue
Oxford University Press 2016-01-04 2015-11-02
/pmc/articles/PMC4702837/ /pubmed/26527719 http://dx.doi.org/10.1093/nar/gkv1163
Text en
© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title CEGA—a catalog of conserved elements from genomic alignments
topic Database Issue