
Accuracy of phylogeny reconstruction methods combining overlapping gene data sets

BACKGROUND: The availability of many gene alignments with overlapping taxon sets raises the question of which strategy is the best to infer species phylogenies from multiple gene information. Methods and programs abound that use the gene alignment in different ways to reconstruct the species tree. I...


Bibliographic Details
Main authors: Kupczok, Anne, Schmidt, Heiko A, von Haeseler, Arndt
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3022592/
https://www.ncbi.nlm.nih.gov/pubmed/21134245
http://dx.doi.org/10.1186/1748-7188-5-37
_version_ 1782196526533050368
author Kupczok, Anne
Schmidt, Heiko A
von Haeseler, Arndt
author_facet Kupczok, Anne
Schmidt, Heiko A
von Haeseler, Arndt
author_sort Kupczok, Anne
collection PubMed
description BACKGROUND: The availability of many gene alignments with overlapping taxon sets raises the question of which strategy is the best to infer species phylogenies from multiple gene information. Methods and programs abound that use the gene alignment in different ways to reconstruct the species tree. In particular, different methods combine the original data at different points along the way from the underlying sequences to the final tree. Accordingly, they are classified into superalignment, supertree and medium-level approaches. Here, we present a simulation study to compare different methods from each of these three approaches. RESULTS: We observe that superalignment methods usually outperform the other approaches over a wide range of parameters including sparse data and gene-specific evolutionary parameters. In the presence of high incongruency among gene trees, however, other combination methods show better performance than the superalignment approach. Surprisingly, some supertree and medium-level methods exhibit, on average, worse results than a single gene phylogeny with complete taxon information. CONCLUSIONS: For some methods, using the reconstructed gene tree as an estimation of the species tree is superior to the combination of incomplete information. Superalignment usually performs best since it is less susceptible to stochastic error. Supertree methods can outperform superalignment in the presence of gene-tree conflict.
format Text
id pubmed-3022592
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30225922011-01-19 Accuracy of phylogeny reconstruction methods combining overlapping gene data sets Kupczok, Anne Schmidt, Heiko A von Haeseler, Arndt Algorithms Mol Biol Research BACKGROUND: The availability of many gene alignments with overlapping taxon sets raises the question of which strategy is the best to infer species phylogenies from multiple gene information. Methods and programs abound that use the gene alignment in different ways to reconstruct the species tree. In particular, different methods combine the original data at different points along the way from the underlying sequences to the final tree. Accordingly, they are classified into superalignment, supertree and medium-level approaches. Here, we present a simulation study to compare different methods from each of these three approaches. RESULTS: We observe that superalignment methods usually outperform the other approaches over a wide range of parameters including sparse data and gene-specific evolutionary parameters. In the presence of high incongruency among gene trees, however, other combination methods show better performance than the superalignment approach. Surprisingly, some supertree and medium-level methods exhibit, on average, worse results than a single gene phylogeny with complete taxon information. CONCLUSIONS: For some methods, using the reconstructed gene tree as an estimation of the species tree is superior to the combination of incomplete information. Superalignment usually performs best since it is less susceptible to stochastic error. Supertree methods can outperform superalignment in the presence of gene-tree conflict. BioMed Central 2010-12-06 /pmc/articles/PMC3022592/ /pubmed/21134245 http://dx.doi.org/10.1186/1748-7188-5-37 Text en Copyright ©2010 Kupczok et al; licensee BioMed Central Ltd. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Kupczok, Anne
Schmidt, Heiko A
von Haeseler, Arndt
Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title_full Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title_fullStr Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title_full_unstemmed Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title_short Accuracy of phylogeny reconstruction methods combining overlapping gene data sets
title_sort accuracy of phylogeny reconstruction methods combining overlapping gene data sets
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3022592/
https://www.ncbi.nlm.nih.gov/pubmed/21134245
http://dx.doi.org/10.1186/1748-7188-5-37
work_keys_str_mv AT kupczokanne accuracyofphylogenyreconstructionmethodscombiningoverlappinggenedatasets
AT schmidtheikoa accuracyofphylogenyreconstructionmethodscombiningoverlappinggenedatasets
AT vonhaeselerarndt accuracyofphylogenyreconstructionmethodscombiningoverlappinggenedatasets