
On homology searches by protein Blast and the characterization of the age of genes

BACKGROUND: It has been shown in a variety of organisms, including mammals, that genes that appeared recently in evolution, for example orphan genes, evolve faster than older genes. Low functional constraints at the time of origin of novel genes may explain these results. However, this observation h...


Bibliographic Details
Main Authors: Albà, M Mar, Castresana, Jose
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1855329/
https://www.ncbi.nlm.nih.gov/pubmed/17408474
http://dx.doi.org/10.1186/1471-2148-7-53
author Albà, M Mar
Castresana, Jose
collection PubMed
description BACKGROUND: It has been shown in a variety of organisms, including mammals, that genes that appeared recently in evolution, for example orphan genes, evolve faster than older genes. Low functional constraints at the time of origin of novel genes may explain these results. However, this observation has recently been attributed to an artifact caused by the inability of Blast to detect the fastest-evolving genes in different eukaryotic genomes. Distinguishing between these two possible explanations would be of great importance for any study dealing with the taxon distribution of proteins and the origin of novel genes. RESULTS: Here we used simulations of protein sequences to examine the capacity of Blast to detect proteins of diverse evolutionary rates in the different species of a eukaryotic phylogenetic tree that included metazoans, fungi and plants. We simulated the evolution of protein genes with the same evolutionary rates as those observed in functional mammalian genes and with among-site rate heterogeneity. Under these conditions, we found that only a very small percentage of simulated ancestral eukaryotic proteins was affected by the Blast artifact. We show that the good detectability of Blast is due to the heterogeneity of protein evolutionary rates at different sites, since a small conserved motif in a sequence suffices to detect its homologues. Our results indicate that Blast, at least when applied within eukaryotes, only misses homologues of extremely fast-evolving sequences, which are rare in the mammalian genome, as well as those of homogeneously evolving sequences or pseudogenes. CONCLUSION: Although great care should be exercised in the recognition of remote homologues, most functional mammalian genes can be detected in eukaryotic genomes by Blast. That is, the majority of functional mammalian genes do not evolve so fast that they would escape detection in other metazoans, fungi or plants, had they been present in these organisms. Thus, the correlation previously found between age and rate seems not to be due to a pure Blast artifact, at least for mammals. This may have important implications for understanding the mechanisms by which novel genes originate.
format Text
id pubmed-1855329
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1855329 2007-04-25 On homology searches by protein Blast and the characterization of the age of genes Albà, M Mar Castresana, Jose BMC Evol Biol Correspondence BioMed Central 2007-04-04 /pmc/articles/PMC1855329/ /pubmed/17408474 http://dx.doi.org/10.1186/1471-2148-7-53 Text en Copyright © 2007 Albà and Castresana; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title On homology searches by protein Blast and the characterization of the age of genes
topic Correspondence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1855329/
https://www.ncbi.nlm.nih.gov/pubmed/17408474
http://dx.doi.org/10.1186/1471-2148-7-53