Cargando…

A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality

BACKGROUND: Given a gene and a species tree, reconciliation methods attempt to retrieve the macro-evolutionary events that best explain the discrepancies between the two tree topologies. The DTL parsimonious approach searches for a most parsimonious reconciliation between a gene tree and a (dated) s...

Descripción completa

Detalles Bibliográficos
Autores principales: To, Thu-Hien, Jacox, Edwin, Ranwez, Vincent, Scornavacca, Celine
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4647304/
https://www.ncbi.nlm.nih.gov/pubmed/26573665
http://dx.doi.org/10.1186/s12859-015-0803-x
_version_ 1782401071220523008
author To, Thu-Hien
Jacox, Edwin
Ranwez, Vincent
Scornavacca, Celine
author_facet To, Thu-Hien
Jacox, Edwin
Ranwez, Vincent
Scornavacca, Celine
author_sort To, Thu-Hien
collection PubMed
description BACKGROUND: Given a gene and a species tree, reconciliation methods attempt to retrieve the macro-evolutionary events that best explain the discrepancies between the two tree topologies. The DTL parsimonious approach searches for a most parsimonious reconciliation between a gene tree and a (dated) species tree, considering four possible macro-evolutionary events (speciation, duplication, transfer, and loss) with specific costs. Unfortunately, many events are erroneously predicted due to errors in the input trees, inappropriate input cost values or because of the existence of several equally parsimonious scenarios. It is thus crucial to provide a measure of the reliability for predicted events. It has been recently proposed that the reliability of an event can be estimated via its frequency in the set of most parsimonious reconciliations obtained using a variety of reasonable input cost vectors. To compute such a support, a straightforward but time-consuming approach is to generate the costs slightly departing from the original ones, independently compute the set of all most parsimonious reconciliations for each vector, and combine these sets a posteriori. Another proposed approach uses Pareto-optimality to partition cost values into regions which induce reconciliations with the same number of DTL events. The support of an event is then defined as its frequency in the set of regions. However, often, the number of regions is not large enough to provide reliable supports. RESULTS: We present here a method to compute efficiently event supports via a polynomial-sized graph, which can represent all reconciliations for several different costs. Moreover, two methods are proposed to take into account alternative input costs: either explicitly providing an input cost range or allowing a tolerance for the over cost of a reconciliation. Our methods are faster than the region based method, substantially faster than the sampling-costs approach, and have a higher event-prediction accuracy on simulated data. CONCLUSIONS: We propose a new approach to improve the accuracy of event supports for parsimonious reconciliation methods to account for uncertainty in the input costs. Furthermore, because of their speed, our methods can be used on large gene families. Our algorithms are implemented in the ecceTERA program, freely available from http://mbb.univ-montp2.fr/MBB/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0803-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4647304
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-46473042015-11-18 A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality To, Thu-Hien Jacox, Edwin Ranwez, Vincent Scornavacca, Celine BMC Bioinformatics Research Article BACKGROUND: Given a gene and a species tree, reconciliation methods attempt to retrieve the macro-evolutionary events that best explain the discrepancies between the two tree topologies. The DTL parsimonious approach searches for a most parsimonious reconciliation between a gene tree and a (dated) species tree, considering four possible macro-evolutionary events (speciation, duplication, transfer, and loss) with specific costs. Unfortunately, many events are erroneously predicted due to errors in the input trees, inappropriate input cost values or because of the existence of several equally parsimonious scenarios. It is thus crucial to provide a measure of the reliability for predicted events. It has been recently proposed that the reliability of an event can be estimated via its frequency in the set of most parsimonious reconciliations obtained using a variety of reasonable input cost vectors. To compute such a support, a straightforward but time-consuming approach is to generate the costs slightly departing from the original ones, independently compute the set of all most parsimonious reconciliations for each vector, and combine these sets a posteriori. Another proposed approach uses Pareto-optimality to partition cost values into regions which induce reconciliations with the same number of DTL events. The support of an event is then defined as its frequency in the set of regions. However, often, the number of regions is not large enough to provide reliable supports. RESULTS: We present here a method to compute efficiently event supports via a polynomial-sized graph, which can represent all reconciliations for several different costs. Moreover, two methods are proposed to take into account alternative input costs: either explicitly providing an input cost range or allowing a tolerance for the over cost of a reconciliation. Our methods are faster than the region based method, substantially faster than the sampling-costs approach, and have a higher event-prediction accuracy on simulated data. CONCLUSIONS: We propose a new approach to improve the accuracy of event supports for parsimonious reconciliation methods to account for uncertainty in the input costs. Furthermore, because of their speed, our methods can be used on large gene families. Our algorithms are implemented in the ecceTERA program, freely available from http://mbb.univ-montp2.fr/MBB/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0803-x) contains supplementary material, which is available to authorized users. BioMed Central 2015-11-14 /pmc/articles/PMC4647304/ /pubmed/26573665 http://dx.doi.org/10.1186/s12859-015-0803-x Text en © To et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
To, Thu-Hien
Jacox, Edwin
Ranwez, Vincent
Scornavacca, Celine
A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title_full A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title_fullStr A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title_full_unstemmed A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title_short A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality
title_sort fast method for calculating reliable event supports in tree reconciliations via pareto optimality
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4647304/
https://www.ncbi.nlm.nih.gov/pubmed/26573665
http://dx.doi.org/10.1186/s12859-015-0803-x
work_keys_str_mv AT tothuhien afastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT jacoxedwin afastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT ranwezvincent afastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT scornavaccaceline afastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT tothuhien fastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT jacoxedwin fastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT ranwezvincent fastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality
AT scornavaccaceline fastmethodforcalculatingreliableeventsupportsintreereconciliationsviaparetooptimality