
Evolutionary sequence analysis of complete eukaryote genomes

BACKGROUND: Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene con...


Bibliographic Details
Main authors: Blair, Jaime E; Shah, Prachi; Hedges, S Blair
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1274250/
https://www.ncbi.nlm.nih.gov/pubmed/15762985
http://dx.doi.org/10.1186/1471-2105-6-53
author Blair, Jaime E
Shah, Prachi
Hedges, S Blair
collection PubMed
description BACKGROUND: Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation. RESULTS: Despite the conservative criterion used, 753 panorthologs (proteins) were identified for evolutionary analysis with four genomes, resulting in a single alignment of 287,000 amino acids. With this data set, we estimate that the divergence between deuterostomes and arthropods took place in the Precambrian, approximately 400 million years before the first appearance of animals in the fossil record. Additional analyses were performed with seven, 12, and 15 eukaryote genomes resulting in similar divergence time estimates and phylogenies. CONCLUSION: Our results with available eukaryote genomes agree with previous results using conventional methods of sequence data assembly from genomes. They show that large sequence data sets can be generated relatively quickly and efficiently for evolutionary analyses of complete genomes.
format Text
id pubmed-1274250
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1274250 2005-10-29 Evolutionary sequence analysis of complete eukaryote genomes Blair, Jaime E Shah, Prachi Hedges, S Blair BMC Bioinformatics Research Article
BioMed Central 2005-03-11 /pmc/articles/PMC1274250/ /pubmed/15762985 http://dx.doi.org/10.1186/1471-2105-6-53 Text en Copyright © 2005 Blair et al; licensee BioMed Central Ltd.
title Evolutionary sequence analysis of complete eukaryote genomes
topic Research Article
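The description field of this record says the approach keeps only single-copy genes present in every genome (panorthologs), so that the resulting gene matrix is complete and free of duplicates. The record does not describe the authors' actual pipeline, so the following is only a minimal sketch of that selection criterion. It assumes gene clusters have already been built by some other method. The function name, the cluster layout, and all genome and gene identifiers are hypothetical and do not come from the paper.

```python
from collections import Counter


def select_single_copy_clusters(clusters, genomes):
    """Keep clusters that contain exactly one gene from every genome.

    clusters: dict mapping a (hypothetical) cluster id to a list of
              (genome, gene_id) pairs.
    genomes:  iterable of genome names that must all be represented.
    """
    required = set(genomes)
    selected = {}
    for cluster_id, members in clusters.items():
        counts = Counter(genome for genome, _ in members)
        # Complete matrix: every genome is present.
        # Single copy: each genome contributes exactly one gene.
        if set(counts) == required and all(n == 1 for n in counts.values()):
            selected[cluster_id] = dict(members)
    return selected


# Toy input with invented identifiers (not data from the paper).
clusters = {
    "c1": [("human", "h1"), ("fly", "f1"), ("worm", "w1"), ("yeast", "y1")],
    # Rejected: two human genes, so a duplication is involved.
    "c2": [("human", "h2"), ("human", "h3"), ("fly", "f2"),
           ("worm", "w2"), ("yeast", "y2")],
    # Rejected: no worm gene, so the matrix would be incomplete.
    "c3": [("human", "h4"), ("fly", "f3"), ("yeast", "y3")],
}
genomes = ["human", "fly", "worm", "yeast"]

panorthologs = select_single_copy_clusters(clusters, genomes)
print(sorted(panorthologs))  # prints ['c1']
```

A filter this strict discards any cluster touched by duplication or loss in even one genome, which fits the description's remark that the criterion is conservative.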