PhySIC_IST: cleaning source trees to infer more informative supertrees
BACKGROUND: Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescenc...
Main authors: | Scornavacca, Celine, Berry, Vincent, Lefort, Vincent, Douzery, Emmanuel JP, Ranwez, Vincent |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2576265/ https://www.ncbi.nlm.nih.gov/pubmed/18834542 http://dx.doi.org/10.1186/1471-2105-9-413 |
_version_ | 1782160380828581888 |
---|---|
author | Scornavacca, Celine Berry, Vincent Lefort, Vincent Douzery, Emmanuel JP Ranwez, Vincent |
author_facet | Scornavacca, Celine Berry, Vincent Lefort, Vincent Douzery, Emmanuel JP Ranwez, Vincent |
author_sort | Scornavacca, Celine |
collection | PubMed |
description | BACKGROUND: Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees not contradicting any source tree, i.e. discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees. RESULTS: To overcome this problem, we propose to infer non-plenary supertrees, i.e. supertrees that do not necessarily contain all the taxa present in the source trees, discarding those whose position greatly differs among source trees or for which insufficient information is provided. We detail a variant of the PhySIC veto method called PhySIC_IST that can infer non-plenary supertrees. PhySIC_IST aims at inferring supertrees that satisfy the same appealing theoretical properties as with PhySIC, while being as informative as possible under this constraint. The informativeness of a supertree is estimated using a variation of the CIC (Cladistic Information Content) criterion, that takes into account both the presence of multifurcations and the absence of some taxa. Additionally, we propose a statistical preprocessing step called STC (Source Trees Correction) to correct the source trees prior to the supertree inference. STC is a liberal step that removes the parts of each source tree that significantly conflict with other source trees. Combining STC with a veto method allows an explicit trade-off between veto and liberal approaches, tuned by a single parameter. 
Performing large-scale simulations, we observe that STC+PhySIC_IST infers much more informative supertrees than PhySIC, while preserving low type I error compared to the well-known MRP method. Two biological case studies on animals confirm that the STC preprocess successfully detects anomalies in the source trees while STC+PhySIC_IST provides well-resolved supertrees agreeing with current knowledge in systematics. CONCLUSION: The paper introduces and tests two new methodologies, PhySIC_IST and STC, that demonstrate the interest in inferring non-plenary supertrees as well as preprocessing the source trees. An implementation of the methods is available at: . |
format | Text |
id | pubmed-2576265 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-25762652008-10-31 PhySIC_IST: cleaning source trees to infer more informative supertrees Scornavacca, Celine Berry, Vincent Lefort, Vincent Douzery, Emmanuel JP Ranwez, Vincent BMC Bioinformatics Research Article BACKGROUND: Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees not contradicting any source tree, i.e. discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees. RESULTS: To overcome this problem, we propose to infer non-plenary supertrees, i.e. supertrees that do not necessarily contain all the taxa present in the source trees, discarding those whose position greatly differs among source trees or for which insufficient information is provided. We detail a variant of the PhySIC veto method called PhySIC_IST that can infer non-plenary supertrees. PhySIC_IST aims at inferring supertrees that satisfy the same appealing theoretical properties as with PhySIC, while being as informative as possible under this constraint. The informativeness of a supertree is estimated using a variation of the CIC (Cladistic Information Content) criterion, that takes into account both the presence of multifurcations and the absence of some taxa. Additionally, we propose a statistical preprocessing step called STC (Source Trees Correction) to correct the source trees prior to the supertree inference. 
STC is a liberal step that removes the parts of each source tree that significantly conflict with other source trees. Combining STC with a veto method allows an explicit trade-off between veto and liberal approaches, tuned by a single parameter. Performing large-scale simulations, we observe that STC+PhySIC_IST infers much more informative supertrees than PhySIC, while preserving low type I error compared to the well-known MRP method. Two biological case studies on animals confirm that the STC preprocess successfully detects anomalies in the source trees while STC+PhySIC_IST provides well-resolved supertrees agreeing with current knowledge in systematics. CONCLUSION: The paper introduces and tests two new methodologies, PhySIC_IST and STC, that demonstrate the interest in inferring non-plenary supertrees as well as preprocessing the source trees. An implementation of the methods is available at: . BioMed Central 2008-10-04 /pmc/articles/PMC2576265/ /pubmed/18834542 http://dx.doi.org/10.1186/1471-2105-9-413 Text en Copyright © 2008 Scornavacca et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Scornavacca, Celine Berry, Vincent Lefort, Vincent Douzery, Emmanuel JP Ranwez, Vincent PhySIC_IST: cleaning source trees to infer more informative supertrees |
title | PhySIC_IST: cleaning source trees to infer more informative supertrees |
title_full | PhySIC_IST: cleaning source trees to infer more informative supertrees |
title_fullStr | PhySIC_IST: cleaning source trees to infer more informative supertrees |
title_full_unstemmed | PhySIC_IST: cleaning source trees to infer more informative supertrees |
title_short | PhySIC_IST: cleaning source trees to infer more informative supertrees |
title_sort | physic_ist: cleaning source trees to infer more informative supertrees |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2576265/ https://www.ncbi.nlm.nih.gov/pubmed/18834542 http://dx.doi.org/10.1186/1471-2105-9-413 |
work_keys_str_mv | AT scornavaccaceline physicistcleaningsourcetreestoinfermoreinformativesupertrees AT berryvincent physicistcleaningsourcetreestoinfermoreinformativesupertrees AT lefortvincent physicistcleaningsourcetreestoinfermoreinformativesupertrees AT douzeryemmanueljp physicistcleaningsourcetreestoinfermoreinformativesupertrees AT ranwezvincent physicistcleaningsourcetreestoinfermoreinformativesupertrees |