Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects
BACKGROUND: Published molecular phylogenies are usually based on data whose quality has not been explored prior to tree inference. This leads to errors because trees obtained with conventional methods suppress conflicting evidence, and because support values may be high even if there is no distinct...
Main authors: | Wägele, Johann Wolfgang; Mayer, Christoph |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2040160/ https://www.ncbi.nlm.nih.gov/pubmed/17725833 http://dx.doi.org/10.1186/1471-2148-7-147 |
_version_ | 1782137075229786112 |
---|---|
author | Wägele, Johann Wolfgang Mayer, Christoph |
author_facet | Wägele, Johann Wolfgang Mayer, Christoph |
author_sort | Wägele, Johann Wolfgang |
collection | PubMed |
description | BACKGROUND: Published molecular phylogenies are usually based on data whose quality has not been explored prior to tree inference. This leads to errors because trees obtained with conventional methods suppress conflicting evidence, and because support values may be high even if there is no distinct phylogenetic signal. Tools that allow an a priori examination of data quality are rarely applied. RESULTS: Using data from published molecular analyses on the phylogeny of crustaceans it is shown that tree topologies and popular support values do not show existing differences in data quality. To visualize variations in signal distinctness, we use network analyses based on split decomposition and split support spectra. Both methods show the same differences in data quality and the same clade-supporting patterns. Both methods are useful to discover long-branch effects. We discern three classes of long branch effects. Class I effects consist of attraction of terminal taxa caused by symplesiomorphies, which results in a false monophyly of paraphyletic groups. Addition of carefully selected taxa can fix this effect. Class II effects are caused by drastic signal erosion. Long branches affected by this phenomenon usually slip down the tree to form false clades that in reality are polyphyletic. To recover the correct phylogeny, more conservative genes must be used. Class III effects consist of attraction due to accumulated chance similarities or convergent character states. This sort of noise can be reduced by selecting less variable portions of the data set, avoiding biases, and adding slower genes. CONCLUSION: To increase confidence in molecular phylogenies an exploratory analysis of the signal to noise ratio can be conducted with split decomposition methods. If long-branch effects are detected, it is necessary to discern between three classes of effects to find the best approach for an improvement of the raw data. |
format | Text |
id | pubmed-2040160 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2040160 2007-10-23 Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects Wägele, Johann Wolfgang Mayer, Christoph BMC Evol Biol Research Article BACKGROUND: Published molecular phylogenies are usually based on data whose quality has not been explored prior to tree inference. This leads to errors because trees obtained with conventional methods suppress conflicting evidence, and because support values may be high even if there is no distinct phylogenetic signal. Tools that allow an a priori examination of data quality are rarely applied. RESULTS: Using data from published molecular analyses on the phylogeny of crustaceans it is shown that tree topologies and popular support values do not show existing differences in data quality. To visualize variations in signal distinctness, we use network analyses based on split decomposition and split support spectra. Both methods show the same differences in data quality and the same clade-supporting patterns. Both methods are useful to discover long-branch effects. We discern three classes of long branch effects. Class I effects consist of attraction of terminal taxa caused by symplesiomorphies, which results in a false monophyly of paraphyletic groups. Addition of carefully selected taxa can fix this effect. Class II effects are caused by drastic signal erosion. Long branches affected by this phenomenon usually slip down the tree to form false clades that in reality are polyphyletic. To recover the correct phylogeny, more conservative genes must be used. Class III effects consist of attraction due to accumulated chance similarities or convergent character states. This sort of noise can be reduced by selecting less variable portions of the data set, avoiding biases, and adding slower genes. CONCLUSION: To increase confidence in molecular phylogenies an exploratory analysis of the signal to noise ratio can be conducted with split decomposition methods. If long-branch effects are detected, it is necessary to discern between three classes of effects to find the best approach for an improvement of the raw data. BioMed Central 2007-08-28 /pmc/articles/PMC2040160/ /pubmed/17725833 http://dx.doi.org/10.1186/1471-2148-7-147 Text en Copyright © 2007 Wägele and Mayer; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2040160/ https://www.ncbi.nlm.nih.gov/pubmed/17725833 http://dx.doi.org/10.1186/1471-2148-7-147 |
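
The abstract mentions split support spectra as a way to inspect signal before tree inference. As a rough illustration only, the sketch below counts, for each bipartition (split) of the taxa, how many alignment columns support it. It is a simplified stand-in and not the authors' software or procedure: the function name `split_support`, the toy alignment, and the rule that only gap-free columns with exactly two nucleotide states count as support are all assumptions made for this example.

```python
from collections import Counter


def split_support(alignment):
    """Count alignment columns supporting each non-trivial split of the taxa.

    `alignment` maps taxon name -> aligned sequence (uppercase A/C/G/T).
    Assumption for this sketch: a column supports a split only if it shows
    exactly two nucleotide states and no gaps or ambiguity codes.
    """
    taxa = sorted(alignment)
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all sequences must have the same length")

    support = Counter()
    for column in zip(*(alignment[t] for t in taxa)):
        states = set(column)
        if len(states) != 2 or not states <= set("ACGT"):
            continue  # invariant, multi-state, or gapped column: skipped
        side_a = frozenset(t for t, s in zip(taxa, column) if s == column[0])
        side_b = frozenset(taxa) - side_a
        if min(len(side_a), len(side_b)) < 2:
            continue  # trivial split: a single taxon against all others
        # A split is an unordered pair of complementary taxon sets.
        support[frozenset((side_a, side_b))] += 1
    return support


# Hypothetical toy alignment, invented for illustration.
toy_alignment = {
    "A": "AAGTCA",
    "B": "AAGTCC",
    "C": "CAGACA",
    "D": "CATAGC",
}

for split, count in split_support(toy_alignment).most_common():
    sides = " | ".join("".join(sorted(side)) for side in sorted(split, key=sorted))
    print(f"{sides}: {count} supporting column(s)")
```

In this toy case two columns support the split AB | CD and one column supports the incompatible split AC | BD. Plotting such counts ranked by size gives a crude spectrum in which conflicting splits with comparable support hint at noise rather than distinct signal.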