Fast and accurate joint inference of coancestry parameters for populations and/or individuals
We introduce a fast, new algorithm for inferring from allele count data the F(ST) parameters describing genetic distances among a set of populations and/or unrelated diploid individuals, and a tree with branch lengths corresponding to F(ST) values. The tree can reflect historical processes of splitt...
Main Authors: | Mary-Huard, Tristan; Balding, David |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9888729/ https://www.ncbi.nlm.nih.gov/pubmed/36656906 http://dx.doi.org/10.1371/journal.pgen.1010054 |
_version_ | 1784880585585459200 |
---|---|
author | Mary-Huard, Tristan Balding, David |
author_facet | Mary-Huard, Tristan Balding, David |
author_sort | Mary-Huard, Tristan |
collection | PubMed |
description | We introduce a fast, new algorithm for inferring from allele count data the F(ST) parameters describing genetic distances among a set of populations and/or unrelated diploid individuals, and a tree with branch lengths corresponding to F(ST) values. The tree can reflect historical processes of splitting and divergence, but seeks to represent the actual genetic variance as accurately as possible with a tree structure. We generalise two major approaches to defining F(ST), via correlations and mismatch probabilities of sampled allele pairs, which measure shared and non-shared components of genetic variance. A diploid individual can be treated as a population of two gametes, which allows inference of coancestry coefficients for individuals as well as for populations, or a combination of the two. A simulation study illustrates that our fast method-of-moments estimation of F(ST) values, simultaneously for multiple populations/individuals, gains statistical efficiency over pairwise approaches when the population structure is close to tree-like. We apply our approach to genome-wide genotypes from the 26 worldwide human populations of the 1000 Genomes Project. We first analyse at the population level, then a subset of individuals and in a final analysis we pool individuals from the more homogeneous populations. This flexible analysis approach gives advantages over traditional approaches to population structure/coancestry, including visual and quantitative assessments of long-standing questions about the relative magnitudes of within- and between-population genetic differences. |
format | Online Article Text |
id | pubmed-9888729 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-98887292023-02-01 Fast and accurate joint inference of coancestry parameters for populations and/or individuals Mary-Huard, Tristan Balding, David PLoS Genet Research Article We introduce a fast, new algorithm for inferring from allele count data the F(ST) parameters describing genetic distances among a set of populations and/or unrelated diploid individuals, and a tree with branch lengths corresponding to F(ST) values. The tree can reflect historical processes of splitting and divergence, but seeks to represent the actual genetic variance as accurately as possible with a tree structure. We generalise two major approaches to defining F(ST), via correlations and mismatch probabilities of sampled allele pairs, which measure shared and non-shared components of genetic variance. A diploid individual can be treated as a population of two gametes, which allows inference of coancestry coefficients for individuals as well as for populations, or a combination of the two. A simulation study illustrates that our fast method-of-moments estimation of F(ST) values, simultaneously for multiple populations/individuals, gains statistical efficiency over pairwise approaches when the population structure is close to tree-like. We apply our approach to genome-wide genotypes from the 26 worldwide human populations of the 1000 Genomes Project. We first analyse at the population level, then a subset of individuals and in a final analysis we pool individuals from the more homogeneous populations. This flexible analysis approach gives advantages over traditional approaches to population structure/coancestry, including visual and quantitative assessments of long-standing questions about the relative magnitudes of within- and between-population genetic differences. 
Public Library of Science 2023-01-19 /pmc/articles/PMC9888729/ /pubmed/36656906 http://dx.doi.org/10.1371/journal.pgen.1010054 Text en © 2023 Mary-Huard, Balding https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Mary-Huard, Tristan Balding, David Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title | Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title_full | Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title_fullStr | Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title_full_unstemmed | Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title_short | Fast and accurate joint inference of coancestry parameters for populations and/or individuals |
title_sort | fast and accurate joint inference of coancestry parameters for populations and/or individuals |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9888729/ https://www.ncbi.nlm.nih.gov/pubmed/36656906 http://dx.doi.org/10.1371/journal.pgen.1010054 |
work_keys_str_mv | AT maryhuardtristan fastandaccuratejointinferenceofcoancestryparametersforpopulationsandorindividuals AT baldingdavid fastandaccuratejointinferenceofcoancestryparametersforpopulationsandorindividuals |
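
The description defines F(ST) partly through mismatch probabilities of sampled allele pairs, and notes that a diploid individual can be treated as a population of two gametes. As a rough illustration of that idea only, the sketch below computes a simple pairwise, mismatch-based moment estimate (a Hudson-style ratio of averages) from biallelic allele counts. It is not the joint, tree-based algorithm of the article; the function names and the `(x, n)` count layout are assumptions made for this example.

```python
from typing import Sequence, Tuple

# Hypothetical layout: one (x, n) pair per locus, where x is the number of
# copies of a chosen allele and n is the number of alleles sampled.
Counts = Tuple[int, int]


def within_mismatch(x: int, n: int) -> float:
    """Unbiased probability that two distinct alleles from one sample differ."""
    return 2.0 * x * (n - x) / (n * (n - 1))


def between_mismatch(x1: int, n1: int, x2: int, n2: int) -> float:
    """Probability that one allele drawn from each sample differ."""
    p1, p2 = x1 / n1, x2 / n2
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def pairwise_fst(sample1: Sequence[Counts], sample2: Sequence[Counts]) -> float:
    """Ratio-of-averages estimate: 1 - (within mismatch) / (between mismatch)."""
    within = 0.0
    between = 0.0
    for (x1, n1), (x2, n2) in zip(sample1, sample2):
        within += 0.5 * (within_mismatch(x1, n1) + within_mismatch(x2, n2))
        between += between_mismatch(x1, n1, x2, n2)
    if between == 0.0:
        raise ValueError("no between-sample mismatches; the estimate is undefined")
    return 1.0 - within / between


# Two populations, three loci, ten alleles sampled per population (made-up counts).
pop_a = [(8, 10), (5, 10), (2, 10)]
pop_b = [(3, 10), (5, 10), (9, 10)]
print(pairwise_fst(pop_a, pop_b))

# Two diploid individuals treated as samples of two gametes each:
# a genotype with x copies of the allele becomes the pair (x, 2).
ind_1 = [(1, 2), (2, 2)]
ind_2 = [(0, 2), (2, 2)]
print(pairwise_fst(ind_1, ind_2))
```

With `n = 2`, `within_mismatch` returns 1 for a heterozygote and 0 for a homozygote, which is how the same code covers individuals as well as populations. Sums are accumulated over loci before taking the ratio, so loci that are monomorphic in both samples contribute nothing rather than causing a division by zero.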