Generalization of the Ewens sampling formula to arbitrary fitness landscapes
In considering evolution of transcribed regions, regulatory sequences, and other genomic loci, we are often faced with a situation in which the number of allelic states greatly exceeds the size of the population. In this limit, the population eventually adopts a steady state characterized by mutatio...
Main authors: | Khromov, Pavel; Malliaris, Constantin D.; Morozov, Alexandre V. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5764269/ https://www.ncbi.nlm.nih.gov/pubmed/29324850 http://dx.doi.org/10.1371/journal.pone.0190186 |
_version_ | 1783292027994112000 |
---|---|
author | Khromov, Pavel Malliaris, Constantin D. Morozov, Alexandre V. |
author_facet | Khromov, Pavel Malliaris, Constantin D. Morozov, Alexandre V. |
author_sort | Khromov, Pavel |
collection | PubMed |
description | In considering evolution of transcribed regions, regulatory sequences, and other genomic loci, we are often faced with a situation in which the number of allelic states greatly exceeds the size of the population. In this limit, the population eventually adopts a steady state characterized by mutation-selection-drift balance. Although new alleles continue to be explored through mutation, the statistics of the population, and in particular the probabilities of seeing specific allelic configurations in samples taken from the population, do not change with time. In the absence of selection, the probabilities of allelic configurations are given by the Ewens sampling formula, widely used in population genetics to detect deviations from neutrality. Here we develop an extension of this formula to arbitrary fitness distributions. Although our approach is general, we focus on the class of fitness landscapes, inspired by recent high-throughput genotype-phenotype maps, in which alleles can be in several distinct phenotypic states. This class of landscapes yields sampling probabilities that are computationally more tractable and can form a basis for inference of selection signatures from genomic data. Using an efficient numerical implementation of the sampling probabilities, we demonstrate that, for a sizable range of mutation rates and selection coefficients, the steady-state allelic diversity is not neutral. Therefore, it may be used to infer selection coefficients, as well as other evolutionary parameters from population data. We also carry out numerical simulations to challenge various approximations involved in deriving our sampling formulas, such as the infinite-allele limit and the “full connectivity” assumption inherent in the Ewens theory, in which each allele can mutate into any other allele. We find that, at least for the specific numerical examples studied, our theory remains sufficiently accurate even if these assumptions are relaxed. Thus our framework establishes both theoretical and practical foundations for inferring selection signatures from population-level genomic sequence samples. |
format | Online Article Text |
id | pubmed-5764269 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5764269 2018-01-23 Generalization of the Ewens sampling formula to arbitrary fitness landscapes Khromov, Pavel Malliaris, Constantin D. Morozov, Alexandre V. PLoS One Research Article In considering evolution of transcribed regions, regulatory sequences, and other genomic loci, we are often faced with a situation in which the number of allelic states greatly exceeds the size of the population. In this limit, the population eventually adopts a steady state characterized by mutation-selection-drift balance. Although new alleles continue to be explored through mutation, the statistics of the population, and in particular the probabilities of seeing specific allelic configurations in samples taken from the population, do not change with time. In the absence of selection, the probabilities of allelic configurations are given by the Ewens sampling formula, widely used in population genetics to detect deviations from neutrality. Here we develop an extension of this formula to arbitrary fitness distributions. Although our approach is general, we focus on the class of fitness landscapes, inspired by recent high-throughput genotype-phenotype maps, in which alleles can be in several distinct phenotypic states. This class of landscapes yields sampling probabilities that are computationally more tractable and can form a basis for inference of selection signatures from genomic data. Using an efficient numerical implementation of the sampling probabilities, we demonstrate that, for a sizable range of mutation rates and selection coefficients, the steady-state allelic diversity is not neutral. Therefore, it may be used to infer selection coefficients, as well as other evolutionary parameters from population data. We also carry out numerical simulations to challenge various approximations involved in deriving our sampling formulas, such as the infinite-allele limit and the “full connectivity” assumption inherent in the Ewens theory, in which each allele can mutate into any other allele. We find that, at least for the specific numerical examples studied, our theory remains sufficiently accurate even if these assumptions are relaxed. Thus our framework establishes both theoretical and practical foundations for inferring selection signatures from population-level genomic sequence samples. Public Library of Science 2018-01-11 /pmc/articles/PMC5764269/ /pubmed/29324850 http://dx.doi.org/10.1371/journal.pone.0190186 Text en © 2018 Khromov et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Khromov, Pavel Malliaris, Constantin D. Morozov, Alexandre V. Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title | Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title_full | Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title_fullStr | Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title_full_unstemmed | Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title_short | Generalization of the Ewens sampling formula to arbitrary fitness landscapes |
title_sort | generalization of the ewens sampling formula to arbitrary fitness landscapes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5764269/ https://www.ncbi.nlm.nih.gov/pubmed/29324850 http://dx.doi.org/10.1371/journal.pone.0190186 |