SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation

Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are...

Bibliographic Details
Main Authors: van Steenderen, Clarke J. M.; Sutton, Guy F.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects: RESOURCE ARTICLES
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9306842/
https://www.ncbi.nlm.nih.gov/pubmed/35094502
http://dx.doi.org/10.1111/1755-0998.13591
author van Steenderen, Clarke J. M.
Sutton, Guy F.
collection PubMed
description Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely used methods is the Generalized Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated data sets, where model assumptions are not violated. Here, we present a user‐friendly R Shiny application, ‘SPEDE‐sampler’ (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that (1) sample size, (2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and (3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it with GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE‐sampler, allowing for the further investigation of potential cryptic species or geographical substructuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program's workflow, and the variation that can arise when applying the GMYC model to empirical data sets. The R Shiny program is available for download at https://github.com/clarkevansteenderen/spede_sampler_R.
format Online
Article
Text
id pubmed-9306842
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9306842 2022-07-28 Mol Ecol Resour RESOURCE ARTICLES John Wiley and Sons Inc. 2022-02-16 2022-07 Text en © 2022 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title SPEDE‐sampler: An R Shiny application to assess how methodological choices and taxon sampling can affect Generalized Mixed Yule Coalescent output and interpretation
topic RESOURCE ARTICLES