
A Faster and More Accurate Algorithm for Calculating Population Genetics Statistics Requiring Sums of Stirling Numbers of the First Kind

Ewens’s sampling formula is a foundational theoretical result that connects probability and number theory with molecular genetics and molecular evolution; it was the analytical result required for testing the neutral theory of evolution, and has since been directly or indirectly utilized in a number...


Bibliographic Details
Main Authors: Chen, Swaine L., Temme, Nico M.
Format: Online Article Text
Language: English
Published: Genetics Society of America 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7642932/
https://www.ncbi.nlm.nih.gov/pubmed/32900901
http://dx.doi.org/10.1534/g3.120.401575
author Chen, Swaine L.
Temme, Nico M.
author_facet Chen, Swaine L.
Temme, Nico M.
author_sort Chen, Swaine L.
collection PubMed
description Ewens’s sampling formula is a foundational theoretical result that connects probability and number theory with molecular genetics and molecular evolution; it was the analytical result required for testing the neutral theory of evolution, and has since been directly or indirectly utilized in a number of population genetics statistics. Ewens’s sampling formula, in turn, is deeply connected to Stirling numbers of the first kind. Here, we explore the cumulative distribution function of these Stirling numbers, which enables a single direct estimate of the sum, using representations in terms of the incomplete beta function. This estimator enables an improved method for calculating an asymptotic estimate for one useful statistic, Fu’s Fs. By reducing the calculation from a sum of terms involving Stirling numbers to a single estimate, we simultaneously improve accuracy and dramatically increase speed.
format Online
Article
Text
id pubmed-7642932
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-76429322020-11-13 A Faster and More Accurate Algorithm for Calculating Population Genetics Statistics Requiring Sums of Stirling Numbers of the First Kind Chen, Swaine L. Temme, Nico M. G3 (Bethesda) Software and Data Resources Ewens’s sampling formula is a foundational theoretical result that connects probability and number theory with molecular genetics and molecular evolution; it was the analytical result required for testing the neutral theory of evolution, and has since been directly or indirectly utilized in a number of population genetics statistics. Ewens’s sampling formula, in turn, is deeply connected to Stirling numbers of the first kind. Here, we explore the cumulative distribution function of these Stirling numbers, which enables a single direct estimate of the sum, using representations in terms of the incomplete beta function. This estimator enables an improved method for calculating an asymptotic estimate for one useful statistic, Fu’s Fs. By reducing the calculation from a sum of terms involving Stirling numbers to a single estimate, we simultaneously improve accuracy and dramatically increase speed. Genetics Society of America 2020-09-08 /pmc/articles/PMC7642932/ /pubmed/32900901 http://dx.doi.org/10.1534/g3.120.401575 Text en Copyright © 2020 Chen, Temme http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software and Data Resources
Chen, Swaine L.
Temme, Nico M.
A Faster and More Accurate Algorithm for Calculating Population Genetics Statistics Requiring Sums of Stirling Numbers of the First Kind
title A Faster and More Accurate Algorithm for Calculating Population Genetics Statistics Requiring Sums of Stirling Numbers of the First Kind
title_sort faster and more accurate algorithm for calculating population genetics statistics requiring sums of stirling numbers of the first kind
topic Software and Data Resources
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7642932/
https://www.ncbi.nlm.nih.gov/pubmed/32900901
http://dx.doi.org/10.1534/g3.120.401575
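The description above refers to "a sum of terms involving Stirling numbers" that the article replaces with a single estimate. As a rough illustration of that baseline sum only (not the article's incomplete-beta estimator), the sketch below computes the tail probability under the Ewens sampling formula by direct summation and, from it, Fu's Fs. It assumes the commonly cited definitions: S' = P(K >= k_obs | n, theta) = sum over k >= k_obs of c(n, k) theta^k / (theta (theta + 1) ... (theta + n - 1)), where c(n, k) is the unsigned Stirling number of the first kind, and Fs = ln(S' / (1 - S')). The function names are hypothetical and chosen for this example.

```python
from fractions import Fraction
from math import log


def unsigned_stirling_first_row(n):
    """Return [c(n, 0), ..., c(n, n)], the unsigned Stirling numbers of the first kind."""
    row = [1]  # c(0, 0) = 1
    for m in range(n):
        # Recurrence: c(m + 1, k) = c(m, k - 1) + m * c(m, k)
        nxt = [0] * (m + 2)
        for k, value in enumerate(row):
            nxt[k] += m * value
            nxt[k + 1] += value
        row = nxt
    return row


def tail_probability(n, k_obs, theta):
    """P(K >= k_obs) for sample size n under the Ewens sampling formula (exact arithmetic)."""
    theta = Fraction(theta)
    row = unsigned_stirling_first_row(n)
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1)
    denom = Fraction(1)
    for i in range(n):
        denom *= theta + i
    numer = sum(row[k] * theta**k for k in range(k_obs, n + 1))
    return numer / denom


def fu_fs(n, k_obs, theta):
    """Fu's Fs = ln(S' / (1 - S')); requires 0 < S' < 1."""
    s = tail_probability(n, k_obs, theta)
    odds = s / (1 - s)
    # Take logs of the integer numerator and denominator separately,
    # which stays finite even when the integers are very large.
    return log(odds.numerator) - log(odds.denominator)


if __name__ == "__main__":
    print(unsigned_stirling_first_row(4))
    print(tail_probability(4, 3, 1))
    print(fu_fs(4, 3, 1))
```

The exact integers in this direct sum grow very quickly with n, and floating-point versions of the same sum tend to overflow, which is the cost the article's single-estimate approach is described as avoiding.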