
Conventional Simulation of Biological Sequences Leads to a Biased Assessment of Multi-Loci Phylogenetic Analysis

Phylogenetic analysis based on multi-loci data sets is performed by means of supermatrix (SM) or supertree (ST) approaches. Recently, methods that rely on species tree (SppT) inference by the multi-species coalescence have also been implemented to tackle this problem. Generally, the relative perform...


Bibliographic Details
Main Authors: Aguiar, Barbara O., Schrago, Carlos G.
Format: Online Article Text
Language: English
Published: Libertas Academica 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748088/
https://www.ncbi.nlm.nih.gov/pubmed/23997573
http://dx.doi.org/10.4137/EBO.S12483
_version_ 1782281033070149632
author Aguiar, Barbara O.
Schrago, Carlos G.
author_facet Aguiar, Barbara O.
Schrago, Carlos G.
author_sort Aguiar, Barbara O.
collection PubMed
description Phylogenetic analysis based on multi-loci data sets is performed by means of supermatrix (SM) or supertree (ST) approaches. Recently, methods that rely on species tree (SppT) inference by the multi-species coalescence have also been implemented to tackle this problem. Generally, the relative performance of these three major strategies has been calculated using simulation of biological sequences. However, sequence simulation may not entirely replicate the complexity of the evolutionary process. Thus, issues regarding the usefulness of in silico sequences in studying the performance of phylogenetic methods have been raised. Here, we used both classical simulation and empirical data to investigate the relative performance of ST, SM, and the SppT methods. SM analyses performed better than the ST and SppTs in simulations, but not in empirical analyses, where some ST methods significantly outperformed the others. Additionally, SM was the only method that was robust under evolutionary model violations in simulations. These results show that conventional biological sequence simulation cannot adequately resolve which method is most efficient at recovering the SppT. In such simulations, the SM approach recovers the established phylogeny in most instances, whereas the performance of the ST and SppT methods is downgraded in simpler cases. When compared, the analyses based on empirical and simulated sequences yielded largely inconsistent results, with the latter showing a bias towards an apparent superiority of SM approaches.
format Online
Article
Text
id pubmed-3748088
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-3748088 2013-08-30 Conventional Simulation of Biological Sequences Leads to a Biased Assessment of Multi-Loci Phylogenetic Analysis Aguiar, Barbara O. Schrago, Carlos G. Evol Bioinform Online Original Research Libertas Academica 2013-08-13 /pmc/articles/PMC3748088/ /pubmed/23997573 http://dx.doi.org/10.4137/EBO.S12483 Text en © 2013 the author(s), publisher and licensee Libertas Academica Ltd.
This is an open access article published under the Creative Commons CC-BY-NC 3.0 license.
spellingShingle Original Research
Aguiar, Barbara O.
Schrago, Carlos G.
Conventional Simulation of Biological Sequences Leads to a Biased Assessment of Multi-Loci Phylogenetic Analysis
title Conventional Simulation of Biological Sequences Leads to a Biased Assessment of Multi-Loci Phylogenetic Analysis
title_sort conventional simulation of biological sequences leads to a biased assessment of multi-loci phylogenetic analysis
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748088/
https://www.ncbi.nlm.nih.gov/pubmed/23997573
http://dx.doi.org/10.4137/EBO.S12483