Estimating F(ST) and kinship for arbitrary population structures


Bibliographic Details
Main authors: Ochoa, Alejandro; Storey, John D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7846127/
https://www.ncbi.nlm.nih.gov/pubmed/33465078
http://dx.doi.org/10.1371/journal.pgen.1009241
_version_ 1783644680422948864
author Ochoa, Alejandro
Storey, John D.
collection PubMed
description F(ST) and kinship are key parameters often estimated in modern population genetics studies to quantitatively characterize structure and relatedness. Kinship matrices have also become a fundamental quantity used in genome-wide association studies and heritability estimation. The most frequently used estimators of F(ST) and kinship are method-of-moments estimators whose accuracies depend strongly on the existence of simple underlying forms of structure, such as the independent subpopulations model of non-overlapping, independently evolving subpopulations. However, modern data sets have revealed that these simple models of structure likely do not hold in many populations, including humans. In this work, we analyze the behavior of these estimators in the presence of arbitrarily complex population structures, which results in an improved estimation framework specifically designed for arbitrary population structures. After generalizing the definition of F(ST) to arbitrary population structures and establishing a framework for assessing bias and consistency of genome-wide estimators, we calculate the accuracy of existing F(ST) and kinship estimators under arbitrary population structures, characterizing biases and estimation challenges unobserved under their originally assumed models of structure. We then present our new approach, which consistently estimates kinship and F(ST) when the minimum kinship value in the dataset is estimated consistently. We illustrate our results using simulated genotypes from an admixture model, constructing a one-dimensional geographic scenario that departs nontrivially from the independent subpopulations model. Our simulations reveal the potential for severe biases in estimates of existing approaches that are overcome by our new framework. This work may significantly improve future analyses that rely on accurate kinship and F(ST) estimates.
format Online
Article
Text
id pubmed-7846127
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7846127 2021-02-04 Estimating F(ST) and kinship for arbitrary population structures Ochoa, Alejandro Storey, John D. PLoS Genet Research Article Public Library of Science 2021-01-19 /pmc/articles/PMC7846127/ /pubmed/33465078 http://dx.doi.org/10.1371/journal.pgen.1009241 Text en © 2021 Ochoa, Storey http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Estimating F(ST) and kinship for arbitrary population structures
topic Research Article
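
The description in this record says that the new approach estimates kinship and F(ST) consistently once the minimum kinship value in the dataset is estimated consistently. The record does not give the formulas, so the following is only a minimal sketch of one estimator with that property, under stated assumptions: genotypes are coded 0/1/2 at biallelic loci, the smallest off-target statistic corresponds to an unrelated pair (true kinship zero), and F(ST) is taken as a weighted mean of individual inbreeding coefficients. The function names `kinship_from_genotypes` and `fst_from_kinship` and the uniform default weights are hypothetical, chosen for illustration; they are not taken from the article or from any software package.

```python
import numpy as np


def kinship_from_genotypes(X):
    """Sketch of a kinship estimate from an (m loci x n individuals) 0/1/2 matrix.

    Assumption (not from the record): the least related pair has kinship 0,
    so the minimum of the statistic A is used as the normalising constant.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    centered = X - 1.0
    # A[j, k] = mean over loci of (x_j - 1)(x_k - 1), minus 1.
    # In expectation this is proportional to -(1 - kinship[j, k]).
    A = centered.T @ centered / m - 1.0
    a_min = A.min()
    if a_min >= 0:
        raise ValueError("minimum statistic must be negative to normalise")
    # The pair attaining the minimum is mapped to kinship 0 by construction.
    return 1.0 - A / a_min


def fst_from_kinship(kinship, weights=None):
    """Weighted mean inbreeding; the diagonal of kinship is (1 + f_j) / 2."""
    kinship = np.asarray(kinship, dtype=float)
    n = kinship.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)  # uniform weights: an assumption
    inbreeding = 2.0 * np.diag(kinship) - 1.0
    return float(np.dot(weights, inbreeding))


if __name__ == "__main__":
    # Toy data: 3 loci, 2 individuals (far too small for real inference).
    X_toy = np.array([[0, 2], [2, 0], [1, 1]])
    K = kinship_from_genotypes(X_toy)
    print(K)
    print(fst_from_kinship(K))
```

The sketch makes the dependence on the minimum explicit: every entry is scaled by `a_min`, so any error in that single value propagates to all kinship estimates and to the resulting F(ST). In real data one would presumably want a more robust estimate of the minimum than the raw smallest entry, for example an average over pairs believed to be least related.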