Estimating effective population size using RADseq: Effects of SNP selection and sample size

Effective population size (N(e)) is a key parameter of population genetics. However, N(e) remains challenging to estimate for natural populations as several factors are likely to bias estimates. These factors include sampling design, sequencing method, and data filtering. One issue inherent to the...

Bibliographic Details
Main authors: Marandel, Florianne, Charrier, Grégory, Lamy, Jean‐Baptiste, Le Cam, Sabrina, Lorance, Pascal, Trenkel, Verena M.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7042749/
https://www.ncbi.nlm.nih.gov/pubmed/32128126
http://dx.doi.org/10.1002/ece3.6016
_version_ 1783501365133180928
author Marandel, Florianne
Charrier, Grégory
Lamy, Jean‐Baptiste
Le Cam, Sabrina
Lorance, Pascal
Trenkel, Verena M.
author_facet Marandel, Florianne
Charrier, Grégory
Lamy, Jean‐Baptiste
Le Cam, Sabrina
Lorance, Pascal
Trenkel, Verena M.
author_sort Marandel, Florianne
collection PubMed
description Effective population size (N(e)) is a key parameter of population genetics. However, N(e) remains challenging to estimate for natural populations as several factors are likely to bias estimates. These factors include sampling design, sequencing method, and data filtering. One issue inherent to the restriction site‐associated DNA sequencing (RADseq) protocol is missing data and SNP selection criteria (e.g., minimum minor allele frequency, number of SNPs). To evaluate the potential impact of SNP selection criteria on N(e) estimates (Linkage Disequilibrium method) we used RADseq data for a nonmodel species, the thornback ray. In this data set, the inbreeding coefficient F(IS) was positively correlated with the amount of missing data, implying data were missing nonrandomly. The precision of N(e) estimates decreased with the number of SNPs. Mean N(e) estimates (averaged across 50 random data sets with 2000 SNPs) ranged between 237 and 1784. Increasing the percentage of missing data from 25% to 50% increased N(e) estimates between 82% and 120%, while increasing the minor allele frequency (MAF) threshold from 0.01 to 0.1 decreased estimates between 71% and 75%. Considering these effects is important when interpreting RADseq data‐derived estimates of effective population size in empirical studies.
format Online
Article
Text
id pubmed-7042749
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7042749 2020-03-03 Estimating effective population size using RADseq: Effects of SNP selection and sample size Marandel, Florianne Charrier, Grégory Lamy, Jean‐Baptiste Le Cam, Sabrina Lorance, Pascal Trenkel, Verena M. Ecol Evol Original Research Effective population size (N(e)) is a key parameter of population genetics. However, N(e) remains challenging to estimate for natural populations as several factors are likely to bias estimates. These factors include sampling design, sequencing method, and data filtering. One issue inherent to the restriction site‐associated DNA sequencing (RADseq) protocol is missing data and SNP selection criteria (e.g., minimum minor allele frequency, number of SNPs). To evaluate the potential impact of SNP selection criteria on N(e) estimates (Linkage Disequilibrium method) we used RADseq data for a nonmodel species, the thornback ray. In this data set, the inbreeding coefficient F(IS) was positively correlated with the amount of missing data, implying data were missing nonrandomly. The precision of N(e) estimates decreased with the number of SNPs. Mean N(e) estimates (averaged across 50 random data sets with 2000 SNPs) ranged between 237 and 1784. Increasing the percentage of missing data from 25% to 50% increased N(e) estimates between 82% and 120%, while increasing the minor allele frequency (MAF) threshold from 0.01 to 0.1 decreased estimates between 71% and 75%. Considering these effects is important when interpreting RADseq data‐derived estimates of effective population size in empirical studies. John Wiley and Sons Inc. 2020-02-11 /pmc/articles/PMC7042749/ /pubmed/32128126 http://dx.doi.org/10.1002/ece3.6016 Text en © 2020 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Research
Marandel, Florianne
Charrier, Grégory
Lamy, Jean‐Baptiste
Le Cam, Sabrina
Lorance, Pascal
Trenkel, Verena M.
Estimating effective population size using RADseq: Effects of SNP selection and sample size
title Estimating effective population size using RADseq: Effects of SNP selection and sample size
title_full Estimating effective population size using RADseq: Effects of SNP selection and sample size
title_fullStr Estimating effective population size using RADseq: Effects of SNP selection and sample size
title_full_unstemmed Estimating effective population size using RADseq: Effects of SNP selection and sample size
title_short Estimating effective population size using RADseq: Effects of SNP selection and sample size
title_sort estimating effective population size using radseq: effects of snp selection and sample size
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7042749/
https://www.ncbi.nlm.nih.gov/pubmed/32128126
http://dx.doi.org/10.1002/ece3.6016