Cargando…

There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates

The application of high-density polymorphic single-nucleotide polymorphisms (SNP) markers derived from high-throughput sequencing methods has heralded plenty of biological questions about the linkages of processes operating at micro- and macroevolutionary scales. However, the effects of SNP filterin...

Descripción completa

Detalles Bibliográficos
Autores principales: Nazareno, Alison G., Knowles, L. Lacey
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8551369/
https://www.ncbi.nlm.nih.gov/pubmed/34721447
http://dx.doi.org/10.3389/fpls.2021.677009
_version_ 1784591142162006016
author Nazareno, Alison G.
Knowles, L. Lacey
author_facet Nazareno, Alison G.
Knowles, L. Lacey
author_sort Nazareno, Alison G.
collection PubMed
description The application of high-density polymorphic single-nucleotide polymorphisms (SNP) markers derived from high-throughput sequencing methods has heralded plenty of biological questions about the linkages of processes operating at micro- and macroevolutionary scales. However, the effects of SNP filtering practices on population genetic inference have received much less attention. By performing sensitivity analyses, we empirically investigated how decisions about the percentage of missing data (MD) and the minor allele frequency (MAF) set in bioinformatic processing of genomic data affect direct (i.e., parentage analysis) and indirect (i.e., fine-scale spatial genetic structure – SGS) gene flow estimates. We focus specifically on these manifestations in small plant populations, and particularly, in the rare tropical plant species Dinizia jueirana-facao, where assumptions implicit to analytical procedures for accurate estimates of gene flow may not hold. Avoiding biases in dispersal estimates are essential given this species is facing extinction risks due to habitat loss, and so we also investigate the effects of forest fragmentation on the accuracy of dispersal estimates under different filtering criteria by testing for recent decrease in the scale of gene flow. Our sensitivity analyses demonstrate that gene flow estimates are robust to different setting of MAF (0.05–0.35) and MD (0–20%). Comparing the direct and indirect estimates of dispersal, we find that contemporary estimates of gene dispersal distance (σ(r)(t) = 41.8 m) was ∼ fourfold smaller than the historical estimates, supporting the hypothesis of a temporal shift in the scale of gene flow in D. jueirana-facao, which is consistent with predictions based on recent, dramatic forest fragmentation process. While we identified settings for filtering genomic data to avoid biases in gene flow estimates, we stress that there is no ‘rule of thumb’ for bioinformatic filtering and that relying on default program settings is not advisable. Instead, we suggest that the approach implemented here be applied independently in each separate empirical study to confirm appropriate settings to obtain unbiased population genetics estimates.
format Online
Article
Text
id pubmed-8551369
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-85513692021-10-29 There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates Nazareno, Alison G. Knowles, L. Lacey Front Plant Sci Plant Science The application of high-density polymorphic single-nucleotide polymorphisms (SNP) markers derived from high-throughput sequencing methods has heralded plenty of biological questions about the linkages of processes operating at micro- and macroevolutionary scales. However, the effects of SNP filtering practices on population genetic inference have received much less attention. By performing sensitivity analyses, we empirically investigated how decisions about the percentage of missing data (MD) and the minor allele frequency (MAF) set in bioinformatic processing of genomic data affect direct (i.e., parentage analysis) and indirect (i.e., fine-scale spatial genetic structure – SGS) gene flow estimates. We focus specifically on these manifestations in small plant populations, and particularly, in the rare tropical plant species Dinizia jueirana-facao, where assumptions implicit to analytical procedures for accurate estimates of gene flow may not hold. Avoiding biases in dispersal estimates are essential given this species is facing extinction risks due to habitat loss, and so we also investigate the effects of forest fragmentation on the accuracy of dispersal estimates under different filtering criteria by testing for recent decrease in the scale of gene flow. Our sensitivity analyses demonstrate that gene flow estimates are robust to different setting of MAF (0.05–0.35) and MD (0–20%). Comparing the direct and indirect estimates of dispersal, we find that contemporary estimates of gene dispersal distance (σ(r)(t) = 41.8 m) was ∼ fourfold smaller than the historical estimates, supporting the hypothesis of a temporal shift in the scale of gene flow in D. jueirana-facao, which is consistent with predictions based on recent, dramatic forest fragmentation process. While we identified settings for filtering genomic data to avoid biases in gene flow estimates, we stress that there is no ‘rule of thumb’ for bioinformatic filtering and that relying on default program settings is not advisable. Instead, we suggest that the approach implemented here be applied independently in each separate empirical study to confirm appropriate settings to obtain unbiased population genetics estimates. Frontiers Media S.A. 2021-10-14 /pmc/articles/PMC8551369/ /pubmed/34721447 http://dx.doi.org/10.3389/fpls.2021.677009 Text en Copyright © 2021 Nazareno and Knowles. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Nazareno, Alison G.
Knowles, L. Lacey
There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title_full There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title_fullStr There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title_full_unstemmed There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title_short There Is No ‘Rule of Thumb’: Genomic Filter Settings for a Small Plant Population to Obtain Unbiased Gene Flow Estimates
title_sort there is no ‘rule of thumb’: genomic filter settings for a small plant population to obtain unbiased gene flow estimates
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8551369/
https://www.ncbi.nlm.nih.gov/pubmed/34721447
http://dx.doi.org/10.3389/fpls.2021.677009
work_keys_str_mv AT nazarenoalisong thereisnoruleofthumbgenomicfiltersettingsforasmallplantpopulationtoobtainunbiasedgeneflowestimates
AT knowlesllacey thereisnoruleofthumbgenomicfiltersettingsforasmallplantpopulationtoobtainunbiasedgeneflowestimates