The Relationship Between F(ST) and the Frequency of the Most Frequent Allele
F(ST) is frequently used as a summary of genetic differentiation among groups. It has been suggested that F(ST) depends on the allele frequencies at a locus, as it exhibits a variety of peculiar properties related to genetic diversity: higher values for biallelic single-nucleotide polymorphisms (SNP...
Main Authors: | Jakobsson, Mattias; Edge, Michael D.; Rosenberg, Noah A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2013 |
Subjects: | Investigations |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3567740/ https://www.ncbi.nlm.nih.gov/pubmed/23172852 http://dx.doi.org/10.1534/genetics.112.144758 |
_version_ | 1782258718575951872 |
---|---|
author | Jakobsson, Mattias Edge, Michael D. Rosenberg, Noah A. |
author_facet | Jakobsson, Mattias Edge, Michael D. Rosenberg, Noah A. |
author_sort | Jakobsson, Mattias |
collection | PubMed |
description | F(ST) is frequently used as a summary of genetic differentiation among groups. It has been suggested that F(ST) depends on the allele frequencies at a locus, as it exhibits a variety of peculiar properties related to genetic diversity: higher values for biallelic single-nucleotide polymorphisms (SNPs) than for multiallelic microsatellites, low values among high-diversity populations viewed as substantially distinct, and low values for populations that differ primarily in their profiles of rare alleles. A full mathematical understanding of the dependence of F(ST) on allele frequencies, however, has been elusive. Here, we examine the relationship between F(ST) and the frequency of the most frequent allele, demonstrating that the range of values that F(ST) can take is restricted considerably by the allele-frequency distribution. For a two-population model, we derive strict bounds on F(ST) as a function of the frequency M of the allele with highest mean frequency between the pair of populations. Using these bounds, we show that for a value of M chosen uniformly between 0 and 1 at a multiallelic locus whose number of alleles is left unspecified, the mean maximum F(ST) is ∼0.3585. Further, F(ST) is restricted to values much less than 1 when M is low or high, and the contribution to the maximum F(ST) made by the most frequent allele is on average ∼0.4485. Using bounds on homozygosity that we have previously derived as functions of M, we describe strict bounds on F(ST) in terms of the homozygosity of the total population, finding that the mean maximum F(ST) given this homozygosity is 1 − ln 2 ≈ 0.3069. Our results provide a conceptual basis for understanding the dependence of F(ST) on allele frequencies and genetic diversity and for interpreting the roles of these quantities in computations of F(ST) from population-genetic data. 
Further, our analysis suggests that many unusual observations of F(ST), including the relatively low F(ST) values in high-diversity human populations from Africa and the relatively low estimates of F(ST) for microsatellites compared to SNPs, can be understood not as biological phenomena associated with different groups of populations or classes of markers but rather as consequences of the intrinsic mathematical dependence of F(ST) on the properties of allele-frequency distributions. |
format | Online Article Text |
id | pubmed-3567740 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-3567740 2013-02-08 The Relationship Between F(ST) and the Frequency of the Most Frequent Allele Jakobsson, Mattias Edge, Michael D. Rosenberg, Noah A. Genetics Investigations Genetics Society of America 2013-02 /pmc/articles/PMC3567740/ /pubmed/23172852 http://dx.doi.org/10.1534/genetics.112.144758 Text en Copyright © 2013 by the Genetics Society of America Available freely online through the author-supported open access option. |
title | The Relationship Between F(ST) and the Frequency of the Most Frequent Allele |
topic | Investigations |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3567740/ https://www.ncbi.nlm.nih.gov/pubmed/23172852 http://dx.doi.org/10.1534/genetics.112.144758 |
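
The abstract in this record relates F(ST) for a pair of populations to M, the frequency of the allele with the highest mean frequency across the two populations. The record does not reproduce the article's formulas, so the following is only a minimal sketch under a stated assumption: it uses the common heterozygosity-based definition F(ST) = (H_T - H_S) / H_T with the two populations weighted equally. The function name `fst_and_m` and the example frequencies are hypothetical and are not taken from the article.

```python
import math
from typing import Sequence, Tuple


def fst_and_m(p1: Sequence[float], p2: Sequence[float]) -> Tuple[float, float]:
    """Return (F_ST, M) for one locus in two equally weighted populations.

    ASSUMPTION: F_ST = (H_T - H_S) / H_T, where H_S is the mean
    within-population heterozygosity and H_T is the heterozygosity of the
    pooled (mean) allele frequencies. M is the largest mean allele frequency.
    """
    if len(p1) != len(p2):
        raise ValueError("both populations must list the same alleles")
    for freqs in (p1, p2):
        if any(f < 0 for f in freqs) or not math.isclose(sum(freqs), 1.0, abs_tol=1e-9):
            raise ValueError("allele frequencies must be non-negative and sum to 1")

    # Mean frequency of each allele across the two populations.
    mean_freqs = [(a + b) / 2 for a, b in zip(p1, p2)]

    # Within-population homozygosities, averaged, give H_S.
    hom_1 = sum(a * a for a in p1)
    hom_2 = sum(b * b for b in p2)
    h_s = 1.0 - (hom_1 + hom_2) / 2

    # Homozygosity of the mean frequencies gives H_T.
    h_t = 1.0 - sum(m * m for m in mean_freqs)
    if h_t <= 0.0:
        raise ValueError("locus is monomorphic in the total population; F_ST is undefined")

    return (h_t - h_s) / h_t, max(mean_freqs)


if __name__ == "__main__":
    # Two populations fixed for different alleles: complete differentiation.
    print(fst_and_m([1.0, 0.0], [0.0, 1.0]))
    # A biallelic locus whose most frequent allele has mean frequency 0.7.
    print(fst_and_m([0.9, 0.1], [0.5, 0.5]))
```

Under this assumed definition, two populations fixed for different alleles give F(ST) = 1 with M = 0.5, while the second example gives a much smaller F(ST) at M = 0.7. That is in line with the abstract's statement that F(ST) is restricted to values well below 1 when M is high.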