
Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals

The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as populatio...


Bibliographic Details
Main authors: Mughal, Mehreen R; DeGiorgio, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8733448/
https://www.ncbi.nlm.nih.gov/pubmed/34849832
http://dx.doi.org/10.1093/genetics/iyab090
author Mughal, Mehreen R
DeGiorgio, Michael
collection PubMed
description The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as population structure and introgression events. Inclusion of related or inbred individuals can bias such statistics, which may often lead to the filtering of such individuals. Here, we derive statistical properties of the F- and D-statistics, including their biases due to the inclusion of related or inbred individuals, their variances, and their corresponding mean squared errors. Moreover, for those statistics that are biased, we develop unbiased estimators and evaluate the variances of these new quantities. Comparisons of the new unbiased statistics to the originals demonstrate that our newly derived statistics often have lower error across a wide population parameter space. Furthermore, we apply these unbiased estimators using several global human populations with the inclusion of related individuals to highlight their application on an empirical dataset. Finally, we implement these unbiased estimators in the open-source software package funbiased for easy application by the scientific community.
format Online
Article
Text
id pubmed-8733448
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8733448 2022-01-07 Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals. Mughal, Mehreen R; DeGiorgio, Michael. Genetics. Investigation. Oxford University Press 2021-07-15 /pmc/articles/PMC8733448/ /pubmed/34849832 http://dx.doi.org/10.1093/genetics/iyab090 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals
topic Investigation