
How Good Are Statistical Models at Approximating Complex Fitness Landscapes?

Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world quest...


Bibliographic Details
Main Authors: du Plessis, Louis; Leventhal, Gabriel E.; Bonhoeffer, Sebastian
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989103/
https://www.ncbi.nlm.nih.gov/pubmed/27189564
http://dx.doi.org/10.1093/molbev/msw097
author du Plessis, Louis
Leventhal, Gabriel E.
Bonhoeffer, Sebastian
collection PubMed
description Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world questions, as the high dimensionality of sequence spaces makes it impossible to exhaustively measure the fitness of all variants of biologically meaningful sequences. We must therefore revert to statistical descriptions of fitness landscapes that are based on a sparse sample of fitness measurements. It remains unclear, however, how much data are required for such statistical descriptions to be useful. Here, we assess the ability of regression models accounting for single and pairwise mutations to correctly approximate a complex quasi-empirical fitness landscape. We compare approximations based on various sampling regimes of an RNA landscape and find that the sampling regime strongly influences the quality of the regression. On the one hand it is generally impossible to generate sufficient samples to achieve a good approximation of the complete fitness landscape, and on the other hand systematic sampling schemes can only provide a good description of the immediate neighborhood of a sequence of interest. Nevertheless, we obtain a remarkably good and unbiased fit to the local landscape when using sequences from a population that has evolved under strong selection. Thus, current statistical methods can provide a good approximation to the landscape of naturally evolving populations.
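As a rough illustration of what "regression models accounting for single and pairwise mutations" can mean, the sketch below fits an intercept, one coefficient per mutated site, and one coefficient per pair of mutated sites by ordinary least squares. It is a minimal sketch under stated assumptions, not the authors' implementation: the binary three-site genotypes, the made-up `toy_fitness` values, and the helper names (`features`, `solve`, `fit`, `predict`) are all hypothetical, and the record does not describe how the RNA landscape was encoded or fitted.

```python
from itertools import combinations, product


def features(genotype):
    """Encode a 0/1 genotype as: intercept, single-site terms, pairwise terms."""
    singles = list(genotype)
    pairs = [genotype[i] * genotype[j]
             for i, j in combinations(range(len(genotype)), 2)]
    return [1.0] + [float(v) for v in singles + pairs]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(genotypes, fitness):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    X = [features(g) for g in genotypes]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, fitness)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, genotype):
    """Predicted fitness: dot product of coefficients and encoded genotype."""
    return sum(b * f for b, f in zip(beta, features(genotype)))


def toy_fitness(g):
    """Made-up landscape with single-site and pairwise effects only (assumption)."""
    return (1.0 + 0.5 * g[0] - 0.25 * g[1] + 0.125 * g[2]
            + 0.75 * g[0] * g[1] - 0.5 * g[1] * g[2])


# Fit on a sample of the landscape and predict a genotype that was not measured.
all_genotypes = list(product((0, 1), repeat=3))
held_out = (1, 1, 1)
training = [g for g in all_genotypes if g != held_out]
beta = fit(training, [toy_fitness(g) for g in training])
prediction = predict(beta, held_out)
```

The toy landscape has no interactions beyond pairs, so the held-out genotype is predicted essentially exactly. A real landscape with higher-order interactions and sparse sampling would not behave this cleanly, which is the situation the abstract examines.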
format Online
Article
Text
id pubmed-4989103
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4989103 2016-08-19 Mol Biol Evol Methods Oxford University Press 2016-09 2016-06-14 /pmc/articles/PMC4989103/ /pubmed/27189564 http://dx.doi.org/10.1093/molbev/msw097 Text en
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title How Good Are Statistical Models at Approximating Complex Fitness Landscapes?
topic Methods