
A rarefaction approach for measuring population differences in rare and common variation

In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes diffe...


Bibliographic Details
Main authors: Cotter, Daniel J; Hofgard, Elyssa F; Novembre, John; Szpiech, Zachary A; Rosenberg, Noah A
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10213490/
https://www.ncbi.nlm.nih.gov/pubmed/37075098
http://dx.doi.org/10.1093/genetics/iyad070
author Cotter, Daniel J
Hofgard, Elyssa F
Novembre, John
Szpiech, Zachary A
Rosenberg, Noah A
collection PubMed
description In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating “rare” and “common” corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations.
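The sample-size correction described in the abstract can be illustrated with a minimal sketch. It assumes that rarefaction here means drawing a subsample of g allele copies without replacement from the N copies observed in a population, so that the number of copies of a given allelic type in the subsample follows a hypergeometric distribution. The function name `class_probabilities`, its parameters, and the use of a copy-count threshold `max_rare` are illustrative assumptions, not names or definitions taken from the article, and the authors' exact estimator may differ.

```python
from math import comb


def class_probabilities(n_i, N, g, max_rare):
    """Return (P(unobserved), P(rare), P(common)) for one allelic type.

    Assumptions (hypothetical, for illustration only):
      n_i      copies of the allelic type among the N sampled copies
      g        subsample size shared by all populations being compared
      max_rare largest copy count in the subsample still called "rare"
    The subsample is drawn without replacement, so the copy count in the
    subsample is hypergeometric.
    """
    if not (0 <= n_i <= N and 0 < g <= N and max_rare >= 0):
        raise ValueError("need 0 <= n_i <= N, 0 < g <= N, max_rare >= 0")

    total = comb(N, g)

    def p_exact(k):
        # Probability of exactly k copies of the allelic type in the subsample.
        return comb(n_i, k) * comb(N - n_i, g - k) / total

    p_unobserved = p_exact(0)
    p_rare = sum(p_exact(k) for k in range(1, min(max_rare, g) + 1))
    p_common = sum(p_exact(k) for k in range(max_rare + 1, g + 1))
    return p_unobserved, p_rare, p_common


# Two samples with the same 5% sample frequency but different sizes,
# both evaluated at a common subsample size g = 20.
small_sample = class_probabilities(n_i=1, N=20, g=20, max_rare=1)
large_sample = class_probabilities(n_i=5, N=100, g=20, max_rare=1)
```

Summing the "rare" probability over allelic types and loci gives an expected count of rare allelic types at subsample size g. Because every population is evaluated at the same g, these counts can be compared across populations whose original sample sizes differ.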
format Online
Article
Text
id pubmed-10213490
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10213490 2023-05-27 A rarefaction approach for measuring population differences in rare and common variation. Cotter, Daniel J; Hofgard, Elyssa F; Novembre, John; Szpiech, Zachary A; Rosenberg, Noah A. Genetics. Investigation. Oxford University Press 2023-04-19 /pmc/articles/PMC10213490/ /pubmed/37075098 http://dx.doi.org/10.1093/genetics/iyad070 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of The Genetics Society of America.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A rarefaction approach for measuring population differences in rare and common variation
topic Investigation