Estimation of prokaryotic supergenome size and composition from gene frequency distributions
BACKGROUND: Because prokaryotic genomes experience a rapid flux of genes, selection may act at a higher level than an individual genome. We explore a quantitative model of the distributed genome whereby groups of genomes evolve by acquiring genes from a fixed reservoir which we denote as supergenome...
Main Authors: | Lobkovsky, Alexander E, Wolf, Yuri I, Koonin, Eugene V |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240607/ https://www.ncbi.nlm.nih.gov/pubmed/25572821 http://dx.doi.org/10.1186/1471-2164-15-S6-S14 |
_version_ | 1782345742770241536 |
---|---|
author | Lobkovsky, Alexander E Wolf, Yuri I Koonin, Eugene V |
author_facet | Lobkovsky, Alexander E Wolf, Yuri I Koonin, Eugene V |
author_sort | Lobkovsky, Alexander E |
collection | PubMed |
description | BACKGROUND: Because prokaryotic genomes experience a rapid flux of genes, selection may act at a higher level than an individual genome. We explore a quantitative model of the distributed genome whereby groups of genomes evolve by acquiring genes from a fixed reservoir which we denote as supergenome. Previous attempts to understand the nature of the supergenome treated genomes as random, independent collections of genes and assumed that the supergenome consists of a small number of homogeneous sub-reservoirs. Here we explore the consequences of relaxing both assumptions. RESULTS: We surveyed several methods for estimating the size and composition of the supergenome. The methods assumed that genomes were either random, independent samples of the supergenome or that they evolved from a common ancestor along a known tree via stochastic sampling from the reservoir. The reservoir was assumed to be either a collection of homogeneous sub-reservoirs or alternatively composed of genes with Gamma distributed gain probabilities. Empirical gene frequencies were used to either compute the likelihood of the data directly or first to reconstruct the history of gene gains and then compute the likelihood of the reconstructed numbers of gains. CONCLUSIONS: Supergenome size estimates using the empirical gene frequencies directly are not robust with respect to the choice of the model. By contrast, using the gene frequencies and the phylogenetic tree to reconstruct multiple gene gains produces reliable estimates of the supergenome size and indicates that a homogeneous supergenome is more consistent with the data than a supergenome with Gamma distributed gain probabilities. |
format | Online Article Text |
id | pubmed-4240607 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4240607 2014-11-25 Estimation of prokaryotic supergenome size and composition from gene frequency distributions Lobkovsky, Alexander E Wolf, Yuri I Koonin, Eugene V BMC Genomics Research BioMed Central 2014-10-17 /pmc/articles/PMC4240607/ /pubmed/25572821 http://dx.doi.org/10.1186/1471-2164-15-S6-S14 Text en Copyright © 2014 Lobkovsky et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Lobkovsky, Alexander E Wolf, Yuri I Koonin, Eugene V Estimation of prokaryotic supergenome size and composition from gene frequency distributions |
title | Estimation of prokaryotic supergenome size and composition from gene frequency distributions |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240607/ https://www.ncbi.nlm.nih.gov/pubmed/25572821 http://dx.doi.org/10.1186/1471-2164-15-S6-S14 |
work_keys_str_mv | AT lobkovskyalexandere estimationofprokaryoticsupergenomesizeandcompositionfromgenefrequencydistributions AT wolfyurii estimationofprokaryoticsupergenomesizeandcompositionfromgenefrequencydistributions AT koonineugenev estimationofprokaryoticsupergenomesizeandcompositionfromgenefrequencydistributions |
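
The description above mentions a model in which genomes are random, independent samples of a supergenome made of homogeneous sub-reservoirs. The following is a minimal illustrative sketch of that idea, not the authors' code: the sub-reservoir sizes, the sampling probabilities, the number of genomes, and the function names are all hypothetical. It assumes every gene in a sub-reservoir is present in each genome independently with the same probability, so the expected number of genes found in exactly j of G genomes is a sum of binomial terms, and the j = 0 term is the part of the supergenome that no sampled genome reveals.

```python
import math
import random
from collections import Counter


def expected_spectrum(sizes, probs, n_genomes):
    """Expected number of genes present in exactly j genomes, j = 0..n_genomes.

    sizes[k] is the number of genes in sub-reservoir k and probs[k] is the
    probability that any one of its genes is present in a given genome
    (both hypothetical inputs).
    """
    spectrum = [0.0] * (n_genomes + 1)
    for size, p in zip(sizes, probs):
        for j in range(n_genomes + 1):
            binom = math.comb(n_genomes, j) * p**j * (1 - p) ** (n_genomes - j)
            spectrum[j] += size * binom
    return spectrum


def simulate_spectrum(sizes, probs, n_genomes, rng):
    """Sample genomes independently and count in how many genomes each gene occurs."""
    counts = Counter()
    for size, p in zip(sizes, probs):
        for _ in range(size):
            present = sum(rng.random() < p for _ in range(n_genomes))
            counts[present] += 1
    return [counts.get(j, 0) for j in range(n_genomes + 1)]


# Hypothetical example: a large, rarely sampled sub-reservoir and a small,
# almost always sampled one, observed through 10 genomes.
sizes = [2000, 500]
probs = [0.05, 0.9]
n_genomes = 10

expected = expected_spectrum(sizes, probs, n_genomes)
observed = simulate_spectrum(sizes, probs, n_genomes, random.Random(0))

# Genes seen in zero genomes are invisible in real data; under this model the
# supergenome size is the observed gene count plus the expected unseen count.
observed_genes = sum(observed[1:])
estimated_size = observed_genes + expected[0]
```

In practice the sizes and probabilities would be fitted to an empirical gene frequency distribution rather than fixed in advance; the sketch only shows how the frequency spectrum and the unseen fraction follow from the independent-sampling assumption.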