
Interpretation and approximation tools for big, dense Markov chain transition matrices in population genetics

BACKGROUND: Markov chains are a common framework for individual-based state and time discrete models in evolution. Though they played an important role in the development of basic population genetic theory, the analysis of more complex evolutionary scenarios typically involves approximation with oth...


Bibliographic Details
Main authors: Reichel, Katja, Bahier, Valentin, Midoux, Cédric, Parisey, Nicolas, Masson, Jean-Pierre, Stoeckel, Solenn
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4696214/
https://www.ncbi.nlm.nih.gov/pubmed/26719759
http://dx.doi.org/10.1186/s13015-015-0061-5
author Reichel, Katja
Bahier, Valentin
Midoux, Cédric
Parisey, Nicolas
Masson, Jean-Pierre
Stoeckel, Solenn
collection PubMed
description BACKGROUND: Markov chains are a common framework for individual-based, state- and time-discrete models in evolution. Though they played an important role in the development of basic population genetic theory, the analysis of more complex evolutionary scenarios typically involves approximation with other types of models. As the number of states increases, the big, dense transition matrices involved become increasingly unwieldy. However, advances in computational technology continue to reduce the challenges of “big data”, thus giving new potential to state-rich Markov chains in theoretical population genetics.
RESULTS: Using a population genetic model based on genotype frequencies as an example, we propose a set of methods to assist in the computation and interpretation of big, dense Markov chain transition matrices. With the help of network analysis, we demonstrate how they can be transformed into clear and easily interpretable graphs, providing a new perspective even on the classic case of a randomly mating, finite population with mutation. Moreover, we describe an algorithm to save computer memory by substituting the original matrix with a sparse approximation while preserving its mathematically important properties, including a closely corresponding dominant (normalized) eigenvector. A global sensitivity analysis of the approximation results in our example shows that a size reduction of more than 90% is possible without significantly affecting the basic model results. Sample implementations of our methods are collected in the Python module mamoth.
CONCLUSION: Our methods help to make stochastic population genetic models involving big, dense transition matrices computationally feasible. Our visualization techniques provide new ways to explore such models and concisely present the results. Thus, our methods will help establish state-rich Markov chains as a valuable supplement to the diversity of population genetic models currently employed, providing interesting new details about evolution, e.g. under non-standard reproductive systems such as partial clonality.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13015-015-0061-5) contains supplementary material, which is available to authorized users.
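The sparse-approximation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the mamoth implementation and not necessarily the authors' algorithm: the function names (`wright_fisher`, `sparsify_rows`, `stationary`), the per-row truncation rule, the `keep_mass` parameter, and the allele-count Wright-Fisher example (the paper uses a genotype-frequency model) are all assumptions made here for illustration. The sketch zeroes the smallest entries of each row, renormalizes so the matrix stays row-stochastic, and compares the dominant left eigenvectors (stationary distributions) before and after.

```python
from math import comb

import numpy as np


def wright_fisher(n=40, u=0.01):
    """Dense transition matrix of a toy Wright-Fisher chain with symmetric
    mutation rate u; states are allele counts 0..n (illustrative model only)."""
    j = np.arange(n + 1)
    binom = np.array([comb(n, int(k)) for k in j], dtype=float)
    p = j / n
    p = p * (1 - u) + (1 - p) * u  # allele frequency after mutation
    # Entry [i, k] is the binomial probability of k copies given frequency p[i].
    return binom * p[:, None] ** j * (1 - p[:, None]) ** (n - j)


def sparsify_rows(P, keep_mass=0.999):
    """Hypothetical truncation rule: per row, keep the largest entries until
    they hold at least keep_mass of the probability, zero the rest, and
    renormalize so every row still sums to 1."""
    P = np.asarray(P, dtype=float)
    Q = np.zeros_like(P)
    for i, row in enumerate(P):
        order = np.argsort(row)[::-1]  # column indices, largest entry first
        cum = np.cumsum(row[order])
        k = int(np.searchsorted(cum, keep_mass)) + 1  # number of entries kept
        kept = order[:k]
        Q[i, kept] = row[kept]
        Q[i] /= Q[i].sum()
    return Q


def stationary(P, tol=1e-12, max_iter=100_000):
    """Dominant left eigenvector (normalized to sum 1) by power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


P = wright_fisher()
Q = sparsify_rows(P)
pi_P = stationary(P)
pi_Q = stationary(Q)
tv = 0.5 * np.abs(pi_P - pi_Q).sum()  # total variation distance

print("nonzero entries:", np.count_nonzero(P), "->", np.count_nonzero(Q))
print("total variation distance between stationary distributions:", tv)
```

Here `Q` is still stored as a dense array to keep the sketch dependency-free; an actual memory saving would require converting it to a sparse storage format. How much mass can be dropped without distorting the results depends on the model, which is what the sensitivity analysis described in the abstract addresses.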
format Online
Article
Text
id pubmed-4696214
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4696214 2015-12-31 Interpretation and approximation tools for big, dense Markov chain transition matrices in population genetics Reichel, Katja Bahier, Valentin Midoux, Cédric Parisey, Nicolas Masson, Jean-Pierre Stoeckel, Solenn Algorithms Mol Biol Software Article BioMed Central 2015-12-30 /pmc/articles/PMC4696214/ /pubmed/26719759 http://dx.doi.org/10.1186/s13015-015-0061-5 Text en © Reichel et al. 2015
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Interpretation and approximation tools for big, dense Markov chain transition matrices in population genetics
topic Software Article