Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods

Genome-scale species tree inference is largely restricted to heuristic approaches that use estimated gene trees to reconstruct species-level relationships. Central to these heuristic species tree methods is the assumption that the gene trees are estimated without error. To increase the accuracy of i...

Bibliographic Details
Main Authors: Adams, Richard H., Castoe, Todd A.
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects: Biochemistry, Genetics and Molecular Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6812401/
https://www.ncbi.nlm.nih.gov/pubmed/31667118
http://dx.doi.org/10.1016/j.mex.2019.09.025
_version_ 1783462650422755328
author Adams, Richard H.
Castoe, Todd A.
author_facet Adams, Richard H.
Castoe, Todd A.
author_sort Adams, Richard H.
collection PubMed
description Genome-scale species tree inference is largely restricted to heuristic approaches that use estimated gene trees to reconstruct species-level relationships. Central to these heuristic species tree methods is the assumption that the gene trees are estimated without error. To increase the accuracy of input gene trees used to infer species trees, several techniques have recently been developed for constructing longer “supergenes” that represent sets of loci inferred to share the same genealogical history. While these supergene methods are designed to increase the amount of data for gene tree estimation by concatenating several loci into “supergenes” to increase gene tree accuracy, no formal protocols have been proposed to validate this key “supergene” concatenation step. In a recent study, we developed several supergene validation strategies for assessing the accuracy of a popular supergene method: the so-called “statistical binning” pipeline. In this article, we describe a more generalizable and model-based “supergene validation” protocol for assessing the accuracy of supergenes and supergene methods using model-based tests of phylogenetic congruency. • Supergenes are validated by adopting model-based tests of topological congruence; • These model-based procedures outperform non-model-based methods for supergene construction; • The results of this protocol can be used to assess the overall performance of a supergene method across a phylogenomic dataset.
format Online
Article
Text
id pubmed-6812401
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-68124012019-10-30 Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods Adams, Richard H. Castoe, Todd A. MethodsX Biochemistry, Genetics and Molecular Biology Genome-scale species tree inference is largely restricted to heuristic approaches that use estimated gene trees to reconstruct species-level relationships. Central to these heuristic species tree methods is the assumption that the gene trees are estimated without error. To increase the accuracy of input gene trees used to infer species trees, several techniques have recently been developed for constructing longer “supergenes” that represent sets of loci inferred to share the same genealogical history. While these supergene methods are designed to increase the amount of data for gene tree estimation by concatenating several loci into “supergenes” to increase gene tree accuracy, no formal protocols have been proposed to validate this key “supergene” concatenation step. In a recent study, we developed several supergene validation strategies for assessing the accuracy of a popular supergene method: the so-called “statistical binning” pipeline. In this article, we describe a more generalizable and model-based “supergene validation” protocol for assessing the accuracy of supergenes and supergene methods using model-based tests of phylogenetic congruency. • Supergenes are validated by adopting model-based tests of topological congruence; • These model-based procedures outperform non-model-based methods for supergene construction; • The results of this protocol can be used to assess the overall performance of a supergene method across a phylogenomic dataset. Elsevier 2019-09-24 /pmc/articles/PMC6812401/ /pubmed/31667118 http://dx.doi.org/10.1016/j.mex.2019.09.025 Text en © 2019 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Biochemistry, Genetics and Molecular Biology
Adams, Richard H.
Castoe, Todd A.
Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods
title Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods
topic Biochemistry, Genetics and Molecular Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6812401/
https://www.ncbi.nlm.nih.gov/pubmed/31667118
http://dx.doi.org/10.1016/j.mex.2019.09.025