
An empirical examination of sample size effects on population demographic estimates in birds using single nucleotide polymorphism (SNP) data

Sample size is a critical aspect of study design in population genomics research, yet few empirical studies have examined the impacts of small sample sizes. We used datasets from eight diverging bird lineages to make pairwise comparisons at different levels of taxonomic divergence (populations, subs...


Bibliographic Details
Main Authors: McLaughlin, Jessica F., Winker, Kevin
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Biodiversity
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501783/
https://www.ncbi.nlm.nih.gov/pubmed/32995092
http://dx.doi.org/10.7717/peerj.9939
collection PubMed
description Sample size is a critical aspect of study design in population genomics research, yet few empirical studies have examined the impacts of small sample sizes. We used datasets from eight diverging bird lineages to make pairwise comparisons at different levels of taxonomic divergence (populations, subspecies, and species). Our data are from loci linked to ultraconserved elements and our analyses used one single nucleotide polymorphism per locus. All individuals were genotyped at all loci, effectively doubling sample size for coalescent analyses. We estimated population demographic parameters (effective population size, migration rate, and time since divergence) in a coalescent framework using Diffusion Approximation for Demographic Inference, an allele frequency spectrum method. Using divergence-with-gene-flow models optimized with full datasets, we subsampled at sequentially smaller sample sizes from full datasets of 6–8 diploid individuals per population (with both alleles called) down to 1:1, and then we compared estimates and their changes in accuracy. Accuracy was strongly affected by sample size, with considerable differences among estimated parameters and among lineages. Effective population size parameters (ν) tended to be underestimated at low sample sizes (fewer than three diploid individuals per population, or 6:6 haplotypes in coalescent terms). Migration (m) was estimated fairly consistently until sample sizes fell below two individuals per population, and no consistent trend of over- or underestimation was found in either time since divergence (T) or theta (Θ = 4N_ref μ). Lineages that were taxonomically recognized above the population level (subspecies and species pairs; that is, deeper divergences) tended to have lower variation in scaled root mean square error of parameter estimation at smaller sample sizes than population-level divergences, and many parameters were estimated accurately down to three diploid individuals per population. Shallower divergence levels (i.e., populations) often required at least five individuals per population for reliable demographic inferences using this approach. Although divergence levels might be unknown at the outset of study design, our results provide a framework for planning appropriate sampling and for interpreting results if smaller sample sizes must be used.
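The accuracy comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the root mean square error was scaled, so the sketch assumes the RMSE of subsample estimates around the full-dataset estimate, divided by that full-dataset estimate. The function names `haplotypes` and `scaled_rmse` are hypothetical and are not taken from the study or from any demographic inference package.

```python
import math


def haplotypes(diploid_individuals):
    """Chromosomes sampled when both alleles are called in each diploid individual."""
    return 2 * diploid_individuals


def scaled_rmse(subsample_estimates, full_estimate):
    """RMSE of subsample estimates around the full-dataset estimate,
    divided by the magnitude of that estimate.

    Assumption: this scaling is illustrative only; the record does not
    state the formula the authors used.
    """
    if not subsample_estimates:
        raise ValueError("need at least one subsample estimate")
    if full_estimate == 0:
        raise ValueError("full-dataset estimate must be non-zero for scaling")
    # Mean squared deviation of each subsample estimate from the reference value.
    mse = sum((x - full_estimate) ** 2 for x in subsample_estimates) / len(subsample_estimates)
    return math.sqrt(mse) / abs(full_estimate)
```

Under this assumption, three diploid individuals per population correspond to `haplotypes(3)` sampled chromosomes per population, and a parameter whose subsample estimates scatter widely around the full-dataset value yields a larger `scaled_rmse`. Any input values would be placeholders, not results from the paper.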
id pubmed-7501783
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7501783 2020-09-28 PeerJ Biodiversity PeerJ Inc. 2020-09-16 /pmc/articles/PMC7501783/ /pubmed/32995092 http://dx.doi.org/10.7717/peerj.9939 Text en © 2020 McLaughlin and Winker https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
topic Biodiversity