Genome Trees from Conservation Profiles

The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data...

Bibliographic Details
Main authors: Tekaia, Fredj; Yeramian, Edouard
Format: Text
Language: English
Published: Public Library of Science, 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1314884/
https://www.ncbi.nlm.nih.gov/pubmed/16362074
http://dx.doi.org/10.1371/journal.pcbi.0010075
_version_ 1782126348532187136
author Tekaia, Fredj
Yeramian, Edouard
author_facet Tekaia, Fredj
Yeramian, Edouard
author_sort Tekaia, Fredj
collection PubMed
description The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data it becomes of increasing importance to develop accurate methods for grasping global trends for the phylogenetic signals that mutually link the various genomes. We therefore derive here the methodological concept of genome trees based on protein conservation profiles in multiple species. The basic idea in this derivation is that the multi-component “presence-absence” protein conservation profiles permit tracking of common evolutionary histories of genes across multiple genomes. We show that a significant reduction in informational redundancy is achieved by considering only the subset of distinct conservation profiles. Beyond these basic ideas, we point out various pitfalls and limitations associated with the data handling, paving the way for further improvements. As an illustration for the methods, we analyze a genome tree based on the above principles, along with a series of other trees derived from the same data and based on pair-wise comparisons (ancestral duplication-conservation and shared orthologs). In all trees we observe a sharp discrimination between the three primary domains of life: Bacteria, Archaea, and Eukarya. The new genome tree, based on conservation profiles, displays a significant correspondence with classically recognized taxonomical groupings, along with a series of departures from such conventional clusterings.
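The description above outlines the core idea: each protein gets a multi-component presence-absence profile across genomes, and only the distinct profiles are kept to reduce redundancy. A minimal Python sketch of that idea on made-up data follows. The genome names, protein names, and the use of a Jaccard distance between genomes are assumptions for illustration only; this record does not state which distance or tree-building method the authors used.

```python
from itertools import combinations

# Hypothetical toy data: each protein maps to a presence (1) / absence (0)
# profile across four made-up genomes, in the order given by GENOMES.
GENOMES = ["gA", "gB", "gC", "gD"]
PROFILES = {
    "p1": (1, 1, 0, 0),
    "p2": (1, 1, 0, 0),  # same profile as p1, so redundant
    "p3": (1, 1, 1, 0),
    "p4": (0, 0, 1, 1),
    "p5": (0, 0, 1, 1),  # same profile as p4, so redundant
    "p6": (1, 1, 1, 1),
}


def distinct_profiles(profiles):
    """Collapse identical presence-absence profiles into one copy each."""
    return sorted(set(profiles.values()))


def jaccard_distance(col_a, col_b):
    """1 - |shared| / |union| over two 0/1 vectors (assumed metric)."""
    both = sum(1 for a, b in zip(col_a, col_b) if a and b)
    either = sum(1 for a, b in zip(col_a, col_b) if a or b)
    return 1.0 - both / either if either else 0.0


def genome_distances(profiles, genomes):
    """Pairwise genome distances computed over distinct profiles only."""
    rows = distinct_profiles(profiles)
    # Transpose: one column of 0/1 values per genome.
    columns = list(zip(*rows))
    return {
        (genomes[i], genomes[j]): jaccard_distance(columns[i], columns[j])
        for i, j in combinations(range(len(genomes)), 2)
    }


if __name__ == "__main__":
    print(len(PROFILES), "profiles,", len(distinct_profiles(PROFILES)), "distinct")
    for pair, dist in genome_distances(PROFILES, GENOMES).items():
        print(pair, round(dist, 3))
```

The resulting pairwise distances could be passed to any standard clustering or tree-building routine; that step is left out here.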
format Text
id pubmed-1314884
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1314884 2005-12-16 Genome Trees from Conservation Profiles Tekaia, Fredj Yeramian, Edouard PLoS Comput Biol Research Article Public Library of Science 2005-12 2005-12-16 /pmc/articles/PMC1314884/ /pubmed/16362074 http://dx.doi.org/10.1371/journal.pcbi.0010075 Text en Copyright: © 2005 Tekaia and Yeramian.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Tekaia, Fredj
Yeramian, Edouard
Genome Trees from Conservation Profiles
title Genome Trees from Conservation Profiles
title_full Genome Trees from Conservation Profiles
title_fullStr Genome Trees from Conservation Profiles
title_full_unstemmed Genome Trees from Conservation Profiles
title_short Genome Trees from Conservation Profiles
title_sort genome trees from conservation profiles
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1314884/
https://www.ncbi.nlm.nih.gov/pubmed/16362074
http://dx.doi.org/10.1371/journal.pcbi.0010075
work_keys_str_mv AT tekaiafredj genometreesfromconservationprofiles
AT yeramianedouard genometreesfromconservationprofiles