The global landscape of sequence diversity

BACKGROUND: Systematic comparisons between genomic sequence datasets have revealed a wide spectrum of sequence specificity from sequences that are highly conserved to those that are specific to individual species. Due to the limited number of fully sequenced eukaryotic genomes, analyses of this spec...

Bibliographic Details
Main Authors: Peregrín-Álvarez, José Manuel; Parkinson, John
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2258180/
https://www.ncbi.nlm.nih.gov/pubmed/17996061
http://dx.doi.org/10.1186/gb-2007-8-11-r238
author Peregrín-Álvarez, José Manuel
Parkinson, John
collection PubMed
description BACKGROUND: Systematic comparisons between genomic sequence datasets have revealed a wide spectrum of sequence specificity, from sequences that are highly conserved to those that are specific to individual species. Due to the limited number of fully sequenced eukaryotic genomes, analyses of this spectrum have largely focused on prokaryotes. Combining existing genomic datasets with the partial genomes of 193 eukaryotes derived from collections of expressed sequence tags, we performed a quantitative analysis of the sequence specificity spectrum to provide a global view of the origins and extent of sequence diversity across the three domains of life. RESULTS: Comparisons with prokaryotic datasets reveal a greater genetic diversity within eukaryotes that may be related to differences in modes of genetic inheritance. Mapping this diversity within a phylogenetic framework revealed that the majority of sequences are either highly conserved or specific to the species or taxon from which they derive. Between these two extremes, several evolutionary landmarks consisting of large numbers of sequences conserved within specific taxonomic groups were identified. For example, 8% of sequences derived from metazoan species are specific and conserved within the metazoan lineage. Many of these sequences likely mediate metazoan-specific functions, such as cell-cell communication and differentiation. CONCLUSION: Through the use of partial genome datasets, this study provides a unique perspective on sequence conservation across the three domains of life. The provision of taxon-restricted sequences should prove valuable for future computational and biochemical analyses aimed at understanding evolutionary and functional relationships.
format Text
id pubmed-2258180
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2258180 2008-02-28 The global landscape of sequence diversity Peregrín-Álvarez, José Manuel Parkinson, John Genome Biol Research
BioMed Central 2007 2007-11-08 /pmc/articles/PMC2258180/ /pubmed/17996061 http://dx.doi.org/10.1186/gb-2007-8-11-r238 Text en Copyright © 2007 Peregrín-Álvarez and Parkinson; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The global landscape of sequence diversity
topic Research