
A simulation study comparing supertree and combined analysis methods using SMIDGen


Bibliographic Details
Main authors: Swenson, M Shel, Barbançon, François, Warnow, Tandy, Linder, C Randal
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837663/
https://www.ncbi.nlm.nih.gov/pubmed/20047664
http://dx.doi.org/10.1186/1748-7188-5-8
author Swenson, M Shel
Barbançon, François
Warnow, Tandy
Linder, C Randal
collection PubMed
description BACKGROUND: Supertree methods comprise one approach to reconstructing large molecular phylogenies given multi-marker datasets: trees are estimated on each marker and then combined into a tree (the "supertree") on the entire set of taxa. Supertrees can be constructed using various algorithmic techniques, with the most common being matrix representation with parsimony (MRP). When the data allow, the competing approach is a combined analysis (also known as a "supermatrix" or "total evidence" approach) whereby the different sequence data matrices for each of the different subsets of taxa are concatenated into a single supermatrix, and a tree is estimated on that supermatrix. RESULTS: In this paper, we describe an extensive simulation study we performed comparing two supertree methods, MRP and weighted MRP, to combined analysis methods on large model trees. A key contribution of this study is our novel simulation methodology (Super-Method Input Data Generator, or SMIDGen) that better reflects biological processes and the practices of systematists than earlier simulations. We show that combined analysis based upon maximum likelihood outperforms MRP and weighted MRP, giving especially big improvements when the largest subtree does not contain most of the taxa. CONCLUSIONS: This study demonstrates that MRP and weighted MRP produce distinctly less accurate trees than combined analyses for a given base method (maximum parsimony or maximum likelihood). Since there are situations in which combined analyses are not feasible, there is a clear need for better supertree methods. The source tree and combined datasets used in this study can be used to test other supertree and combined analysis methods.
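The combined-analysis step described in the abstract (concatenating the per-marker sequence matrices into a single supermatrix) can be illustrated with a minimal sketch. This is not the authors' SMIDGen code: the function name `concatenate_markers`, the toy taxa and sequences, and the use of `?` as the missing-data symbol are all assumptions made for illustration.

```python
from typing import Dict, List


def concatenate_markers(markers: List[Dict[str, str]], missing: str = "?") -> Dict[str, str]:
    """Concatenate per-marker alignments into one supermatrix (hypothetical helper).

    Each marker maps a taxon name to its aligned sequence. Taxa that are
    absent from a marker are padded with the missing-data symbol so that
    every row of the supermatrix has the same length.
    """
    # The supermatrix covers the union of taxa across all markers.
    taxa = sorted({taxon for marker in markers for taxon in marker})
    supermatrix = {taxon: "" for taxon in taxa}
    for marker in markers:
        lengths = {len(seq) for seq in marker.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a marker must have equal aligned length")
        width = lengths.pop()
        for taxon in taxa:
            # Append the marker's sequence, or padding if the taxon was not sampled.
            supermatrix[taxon] += marker.get(taxon, missing * width)
    return supermatrix


# Toy data (assumed): two markers sampled on overlapping subsets of four taxa.
marker_a = {"t1": "ACGT", "t2": "ACGA", "t3": "ACTT"}
marker_b = {"t2": "GG", "t3": "GC", "t4": "CC"}
supermatrix = concatenate_markers([marker_a, marker_b])
```

A tree would then be estimated on `supermatrix` with a base method such as maximum parsimony or maximum likelihood; a supertree method would instead estimate one tree per marker and combine those trees.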
format Text
id pubmed-2837663
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2837663 2010-03-13 Algorithms Mol Biol Research BioMed Central 2010-01-04 /pmc/articles/PMC2837663/ /pubmed/20047664 http://dx.doi.org/10.1186/1748-7188-5-8 Text en
Copyright ©2010 Swenson et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A simulation study comparing supertree and combined analysis methods using SMIDGen
topic Research