Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models
BACKGROUND: The size distribution of gene families in a broad range of genomes is well approximated by a generalized Pareto function. Evolution of ensembles of gene families can be described with Birth, Death, and Innovation Models (BDIMs). Analysis of the properties of different versions of BDIMs h...
Main authors: | Karev, Georgy P, Wolf, Yuri I, Berezovskaya, Faina S, Koonin, Eugene V |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2004 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC523855/ https://www.ncbi.nlm.nih.gov/pubmed/15357876 http://dx.doi.org/10.1186/1471-2148-4-32 |
_version_ | 1782121877405171712 |
---|---|
author | Karev, Georgy P Wolf, Yuri I Berezovskaya, Faina S Koonin, Eugene V |
author_facet | Karev, Georgy P Wolf, Yuri I Berezovskaya, Faina S Koonin, Eugene V |
author_sort | Karev, Georgy P |
collection | PubMed |
description | BACKGROUND: The size distribution of gene families in a broad range of genomes is well approximated by a generalized Pareto function. Evolution of ensembles of gene families can be described with Birth, Death, and Innovation Models (BDIMs). Analysis of the properties of different versions of BDIMs has the potential of revealing important features of genome evolution. RESULTS: In this work, we extend our previous analysis of stochastic BDIMs. In addition to the previously examined rational BDIMs, we introduce potentially more realistic logistic BDIMs, in which birth/death rates are limited for the largest families, and show that their properties are similar to those of models that include no such limitation. We show that the mean time required for the formation of the largest gene families detected in eukaryotic genomes is limited by the mean number of duplications per gene and does not increase indefinitely with the model degree. Instead, this time reaches a minimum value, which corresponds to a non-linear rational BDIM with the degree of approximately 2.7. Even for this BDIM, the mean time of the largest family formation is orders of magnitude greater than any realistic estimates based on the timescale of life's evolution. We employed the embedding chains technique to estimate the expected number of elementary evolutionary events (gene duplications and deletions) preceding the formation of gene families of the observed size and found that the mean number of events exceeds the family size by orders of magnitude, suggesting a highly dynamic process of genome evolution. The variance of the time required for the formation of the largest families was found to be extremely large, with the coefficient of variation >> 1. This indicates that some gene families might grow much faster than the mean rate such that the minimal time required for family formation is more relevant for a realistic representation of genome evolution than the mean time. We determined this minimal time using Monte Carlo simulations of family growth from an ensemble of simultaneously evolving singletons. In these simulations, the time elapsed before the formation of the largest family was much shorter than the estimated mean time and was compatible with the timescale of evolution of eukaryotes. CONCLUSIONS: The analysis of stochastic BDIMs presented here shows that non-linear versions of such models can well approximate not only the size distribution of gene families but also the dynamics of their formation during genome evolution. The fact that only higher degree BDIMs are compatible with the observed characteristics of genome evolution suggests that the growth of gene families is self-accelerating, which might reflect differential selective pressure acting on different genes. |
format | Text |
id | pubmed-523855 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2004 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-523855 2004-10-22 Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models Karev, Georgy P Wolf, Yuri I Berezovskaya, Faina S Koonin, Eugene V BMC Evol Biol Research Article BioMed Central 2004-09-09 /pmc/articles/PMC523855/ /pubmed/15357876 http://dx.doi.org/10.1186/1471-2148-4-32 Text en Copyright © 2004 Karev et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Karev, Georgy P Wolf, Yuri I Berezovskaya, Faina S Koonin, Eugene V Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title | Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title_full | Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title_fullStr | Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title_full_unstemmed | Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title_short | Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
title_sort | gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC523855/ https://www.ncbi.nlm.nih.gov/pubmed/15357876 http://dx.doi.org/10.1186/1471-2148-4-32 |
work_keys_str_mv | AT karevgeorgyp genefamilyevolutionanindepththeoreticalandsimulationanalysisofnonlinearbirthdeathinnovationmodels AT wolfyurii genefamilyevolutionanindepththeoreticalandsimulationanalysisofnonlinearbirthdeathinnovationmodels AT berezovskayafainas genefamilyevolutionanindepththeoreticalandsimulationanalysisofnonlinearbirthdeathinnovationmodels AT koonineugenev genefamilyevolutionanindepththeoreticalandsimulationanalysisofnonlinearbirthdeathinnovationmodels |
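
The description in this record mentions Monte Carlo simulations of family growth from an ensemble of singletons, and contrasts the minimal formation time with the mean time. The following is a minimal sketch of that idea and not the authors' code. Several things here are assumptions made only for illustration: the power-law rate functions passed as `birth` and `death`, the function names `time_to_size` and `formation_times`, the omission of the innovation term, and the treatment of families as independent (so that simulating them one after another is equivalent to simulating them simultaneously).

```python
import random
import statistics


def time_to_size(target, birth, death, rng, max_events=1_000_000):
    """Simulate one gene family starting from a singleton.

    `birth(i)` and `death(i)` are hypothetical per-family duplication and
    deletion rates at family size i. Returns the time at which the family
    first reaches `target` genes, or None if it dies out (or the event
    budget is exhausted) before that.
    """
    if target <= 1:
        return 0.0
    size, t = 1, 0.0
    for _ in range(max_events):
        b, d = birth(size), death(size)
        total = b + d
        if total <= 0:
            return None                      # no further events possible
        t += rng.expovariate(total)          # waiting time to the next event
        size += 1 if rng.random() < b / total else -1
        if size == 0:
            return None                      # family lost
        if size >= target:
            return t
    return None


def formation_times(n_families, target, birth, death, seed=0):
    """Formation times of all families (out of n_families independent
    singletons) that reach `target` before going extinct."""
    rng = random.Random(seed)
    runs = (time_to_size(target, birth, death, rng) for _ in range(n_families))
    return [t for t in runs if t is not None]


if __name__ == "__main__":
    degree = 2                               # assumed model degree, illustrative only
    rate = lambda i: float(i) ** degree      # assumed balanced birth and death rates
    times = formation_times(2000, 20, rate, rate, seed=1)
    mean_t = statistics.mean(times)
    cv = statistics.pstdev(times) / mean_t   # coefficient of variation
    print("families reaching target:", len(times))
    print("minimal time:", min(times), "mean time:", mean_t, "CV:", cv)
```

In a sketch like this, the minimum over the ensemble can never exceed the mean, which is the sense in which the first family to form may appear much earlier than the mean formation time suggests. The rates, target size, and ensemble size above are placeholders and are not taken from the article.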