A rarefaction approach for measuring population differences in rare and common variation
In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes diffe...
Main Authors: | Cotter, Daniel J; Hofgard, Elyssa F; Novembre, John; Szpiech, Zachary A; Rosenberg, Noah A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10213490/ https://www.ncbi.nlm.nih.gov/pubmed/37075098 http://dx.doi.org/10.1093/genetics/iyad070 |
_version_ | 1785047632159178752 |
---|---|
author | Cotter, Daniel J Hofgard, Elyssa F Novembre, John Szpiech, Zachary A Rosenberg, Noah A |
author_facet | Cotter, Daniel J Hofgard, Elyssa F Novembre, John Szpiech, Zachary A Rosenberg, Noah A |
author_sort | Cotter, Daniel J |
collection | PubMed |
description | In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating “rare” and “common” corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations. |
format | Online Article Text |
id | pubmed-10213490 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10213490 2023-05-27 A rarefaction approach for measuring population differences in rare and common variation Cotter, Daniel J Hofgard, Elyssa F Novembre, John Szpiech, Zachary A Rosenberg, Noah A Genetics Investigation In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating “rare” and “common” corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations. Oxford University Press 2023-04-19 /pmc/articles/PMC10213490/ /pubmed/37075098 http://dx.doi.org/10.1093/genetics/iyad070 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of The Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Investigation Cotter, Daniel J Hofgard, Elyssa F Novembre, John Szpiech, Zachary A Rosenberg, Noah A A rarefaction approach for measuring population differences in rare and common variation |
title | A rarefaction approach for measuring population differences in rare and common variation |
title_full | A rarefaction approach for measuring population differences in rare and common variation |
title_fullStr | A rarefaction approach for measuring population differences in rare and common variation |
title_full_unstemmed | A rarefaction approach for measuring population differences in rare and common variation |
title_short | A rarefaction approach for measuring population differences in rare and common variation |
title_sort | rarefaction approach for measuring population differences in rare and common variation |
topic | Investigation |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10213490/ https://www.ncbi.nlm.nih.gov/pubmed/37075098 http://dx.doi.org/10.1093/genetics/iyad070 |
work_keys_str_mv | AT cotterdanielj ararefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT hofgardelyssaf ararefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT novembrejohn ararefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT szpiechzacharya ararefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT rosenbergnoaha ararefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT cotterdanielj rarefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT hofgardelyssaf rarefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT novembrejohn rarefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT szpiechzacharya rarefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation AT rosenbergnoaha rarefactionapproachformeasuringpopulationdifferencesinrareandcommonvariation |
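
The description above outlines a rarefaction-based sample-size correction without giving formulas. As a minimal sketch of the general idea, and not the authors' implementation, the Python below assumes that a subsample of fixed size is drawn without replacement from the observed allele copies in a population, so that the number of copies of a given allelic type in the subsample is hypergeometric. The function name `class_probabilities`, its parameters, and the use of an integer copy-count threshold (`rare_max_copies`) in place of a frequency threshold are all hypothetical choices made for illustration.

```python
from math import comb


def class_probabilities(n_copies, sample_size, subsample_size, rare_max_copies):
    """Probabilities that one allelic type is unobserved, rare, or common
    in a random subsample drawn without replacement.

    n_copies        -- copies of the allelic type in the full sample
    sample_size     -- total allele copies in the full sample
    subsample_size  -- size of the rarefied subsample (hypothetical 'g')
    rare_max_copies -- largest subsample count still classed as rare
                       (an assumed integer stand-in for a frequency threshold)
    """
    if not 0 <= n_copies <= sample_size:
        raise ValueError("n_copies must lie between 0 and sample_size")
    if not 0 < subsample_size <= sample_size:
        raise ValueError("subsample_size must lie between 1 and sample_size")

    total = comb(sample_size, subsample_size)

    def prob_exactly(k):
        # Hypergeometric probability of seeing exactly k copies.
        ways = comb(n_copies, k) * comb(sample_size - n_copies, subsample_size - k)
        return ways / total

    p_unobserved = prob_exactly(0)
    top = min(rare_max_copies, subsample_size)
    p_rare = sum(prob_exactly(k) for k in range(1, top + 1))
    # Whatever remains is the probability of exceeding the rare threshold.
    p_common = max(1.0 - p_unobserved - p_rare, 0.0)
    return p_unobserved, p_rare, p_common


if __name__ == "__main__":
    # Two hypothetical samples with the same 10% allele frequency but
    # different sizes, both rarefied to 20 copies with "rare" meaning
    # at most 2 copies in the subsample.
    for n_copies, sample_size in [(2, 20), (20, 200)]:
        probs = class_probabilities(n_copies, sample_size, 20, 2)
        print(sample_size, [round(p, 4) for p in probs])
```

Under these assumptions, summing the three probabilities over allelic types and loci would give expected counts of unobserved, rare, and common types at a shared subsample size, which is one way populations with different sample sizes could be compared on equal footing.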