Cargando…

EUCALYPT: efficient tree reconciliation enumerator

BACKGROUND: Phylogenetic tree reconciliation is the approach of choice for investigating the coevolution of sets of organisms such as hosts and parasites. It consists in a mapping between the parasite tree and the host tree using event-based maximum parsimony. Given a cost model for the events, many...

Descripción completa

Detalles Bibliográficos
Autores principales: Donati, Beatrice, Baudet, Christian, Sinaimeri, Blerina, Crescenzi, Pierluigi, Sagot, Marie-France
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4310143/
https://www.ncbi.nlm.nih.gov/pubmed/25648467
http://dx.doi.org/10.1186/s13015-014-0031-3
_version_ 1782354815750242304
author Donati, Beatrice
Baudet, Christian
Sinaimeri, Blerina
Crescenzi, Pierluigi
Sagot, Marie-France
author_facet Donati, Beatrice
Baudet, Christian
Sinaimeri, Blerina
Crescenzi, Pierluigi
Sagot, Marie-France
author_sort Donati, Beatrice
collection PubMed
description BACKGROUND: Phylogenetic tree reconciliation is the approach of choice for investigating the coevolution of sets of organisms such as hosts and parasites. It consists in a mapping between the parasite tree and the host tree using event-based maximum parsimony. Given a cost model for the events, many optimal reconciliations are however possible. Any further biological interpretation of them must therefore take this into account, making the capacity to enumerate all optimal solutions a crucial point. Only two algorithms currently exist that attempt such enumeration; in one case not all possible solutions are produced while in the other not all cost vectors are currently handled. The objective of this paper is two-fold. The first is to fill this gap, and the second is to test whether the number of solutions generally observed can be an issue in terms of interpretation. RESULTS: We present a polynomial-delay algorithm for enumerating all optimal reconciliations. We show that in general many solutions exist. We give an example where, for two pairs of host-parasite trees having each less than 41 leaves, the number of solutions is 5120, even when only time-feasible ones are kept. To facilitate their interpretation, those solutions are also classified in terms of how many of each event they contain. The number of different classes of solutions may thus be notably smaller than the number of solutions, yet they may remain high enough, in particular for the cases where losses have cost 0. In fact, depending on the cost vector, both numbers of solutions and of classes thereof may increase considerably. To further deal with this problem, we introduce and analyse a restricted version where host switches are allowed to happen only between species that are within some fixed distance along the host tree. This restriction allows us to reduce the number of time-feasible solutions while preserving the same optimal cost, as well as to find time-feasible solutions with a cost close to the optimal in the cases where no time-feasible solution is found. CONCLUSIONS: We present Eucalypt, a polynomial-delay algorithm for enumerating all optimal reconciliations which is freely available at http://eucalypt.gforge.inria.fr/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13015-014-0031-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4310143
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-43101432015-02-03 EUCALYPT: efficient tree reconciliation enumerator Donati, Beatrice Baudet, Christian Sinaimeri, Blerina Crescenzi, Pierluigi Sagot, Marie-France Algorithms Mol Biol Software Article BACKGROUND: Phylogenetic tree reconciliation is the approach of choice for investigating the coevolution of sets of organisms such as hosts and parasites. It consists in a mapping between the parasite tree and the host tree using event-based maximum parsimony. Given a cost model for the events, many optimal reconciliations are however possible. Any further biological interpretation of them must therefore take this into account, making the capacity to enumerate all optimal solutions a crucial point. Only two algorithms currently exist that attempt such enumeration; in one case not all possible solutions are produced while in the other not all cost vectors are currently handled. The objective of this paper is two-fold. The first is to fill this gap, and the second is to test whether the number of solutions generally observed can be an issue in terms of interpretation. RESULTS: We present a polynomial-delay algorithm for enumerating all optimal reconciliations. We show that in general many solutions exist. We give an example where, for two pairs of host-parasite trees having each less than 41 leaves, the number of solutions is 5120, even when only time-feasible ones are kept. To facilitate their interpretation, those solutions are also classified in terms of how many of each event they contain. The number of different classes of solutions may thus be notably smaller than the number of solutions, yet they may remain high enough, in particular for the cases where losses have cost 0. In fact, depending on the cost vector, both numbers of solutions and of classes thereof may increase considerably. To further deal with this problem, we introduce and analyse a restricted version where host switches are allowed to happen only between species that are within some fixed distance along the host tree. This restriction allows us to reduce the number of time-feasible solutions while preserving the same optimal cost, as well as to find time-feasible solutions with a cost close to the optimal in the cases where no time-feasible solution is found. CONCLUSIONS: We present Eucalypt, a polynomial-delay algorithm for enumerating all optimal reconciliations which is freely available at http://eucalypt.gforge.inria.fr/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13015-014-0031-3) contains supplementary material, which is available to authorized users. BioMed Central 2015-01-23 /pmc/articles/PMC4310143/ /pubmed/25648467 http://dx.doi.org/10.1186/s13015-014-0031-3 Text en © Donati et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software Article
Donati, Beatrice
Baudet, Christian
Sinaimeri, Blerina
Crescenzi, Pierluigi
Sagot, Marie-France
EUCALYPT: efficient tree reconciliation enumerator
title EUCALYPT: efficient tree reconciliation enumerator
title_full EUCALYPT: efficient tree reconciliation enumerator
title_fullStr EUCALYPT: efficient tree reconciliation enumerator
title_full_unstemmed EUCALYPT: efficient tree reconciliation enumerator
title_short EUCALYPT: efficient tree reconciliation enumerator
title_sort eucalypt: efficient tree reconciliation enumerator
topic Software Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4310143/
https://www.ncbi.nlm.nih.gov/pubmed/25648467
http://dx.doi.org/10.1186/s13015-014-0031-3
work_keys_str_mv AT donatibeatrice eucalyptefficienttreereconciliationenumerator
AT baudetchristian eucalyptefficienttreereconciliationenumerator
AT sinaimeriblerina eucalyptefficienttreereconciliationenumerator
AT crescenzipierluigi eucalyptefficienttreereconciliationenumerator
AT sagotmariefrance eucalyptefficienttreereconciliationenumerator