Cargando…

Reconstructing a SuperGeneTree minimizing reconciliation

Combining a set of trees on partial datasets into a single tree is a classical method for inferring large phylogenetic trees. Ideally, the combined tree should display each input partial tree, which is only possible if input trees do not contain contradictory phylogenetic information. The simplest v...

Descripción completa

Detalles Bibliográficos
Autores principales: Lafond, Manuel, Ouangraoua, Aïda, El-Mabrouk, Nadia
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4602317/
https://www.ncbi.nlm.nih.gov/pubmed/26451911
http://dx.doi.org/10.1186/1471-2105-16-S14-S4
_version_ 1782394697554067456
author Lafond, Manuel
Ouangraoua, Aïda
El-Mabrouk, Nadia
author_facet Lafond, Manuel
Ouangraoua, Aïda
El-Mabrouk, Nadia
author_sort Lafond, Manuel
collection PubMed
description Combining a set of trees on partial datasets into a single tree is a classical method for inferring large phylogenetic trees. Ideally, the combined tree should display each input partial tree, which is only possible if input trees do not contain contradictory phylogenetic information. The simplest version of the supertree problem is thus to state whether a set of trees is compatible, and if so, construct a tree displaying them all. Classically, supertree methods have been applied to the reconstruction of species trees. Here we rather consider reconstructing a super gene tree in light of a known species tree S. We define the supergenetree problem as finding, among all supertrees displaying a set of input gene trees, one supertree minimizing a reconciliation distance with S. We first show how classical exact methods to the supertree problem can be extended to the supergenetree problem. As all these methods are highly exponential, we also exhibit a natural greedy heuristic for the duplication cost, based on minimizing the set of duplications preceding the first speciation event. We then show that both the supergenetree problem and its restriction to minimizing duplications preceding the first speciation are NP-hard to approximate within a n(1-ϵ )factor, for any 0 < ϵ < 1. Finally, we show that a restriction of this problem to uniquely labeled speciation gene trees, which is relevant to many biological applications, is also NP-hard. Therefore, we introduce new avenues in the field of supertrees, and set the theoretical basis for the exploration of various algorithmic aspects of the problems.
format Online
Article
Text
id pubmed-4602317
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-46023172015-10-14 Reconstructing a SuperGeneTree minimizing reconciliation Lafond, Manuel Ouangraoua, Aïda El-Mabrouk, Nadia BMC Bioinformatics Research Combining a set of trees on partial datasets into a single tree is a classical method for inferring large phylogenetic trees. Ideally, the combined tree should display each input partial tree, which is only possible if input trees do not contain contradictory phylogenetic information. The simplest version of the supertree problem is thus to state whether a set of trees is compatible, and if so, construct a tree displaying them all. Classically, supertree methods have been applied to the reconstruction of species trees. Here we rather consider reconstructing a super gene tree in light of a known species tree S. We define the supergenetree problem as finding, among all supertrees displaying a set of input gene trees, one supertree minimizing a reconciliation distance with S. We first show how classical exact methods to the supertree problem can be extended to the supergenetree problem. As all these methods are highly exponential, we also exhibit a natural greedy heuristic for the duplication cost, based on minimizing the set of duplications preceding the first speciation event. We then show that both the supergenetree problem and its restriction to minimizing duplications preceding the first speciation are NP-hard to approximate within a n(1-ϵ )factor, for any 0 < ϵ < 1. Finally, we show that a restriction of this problem to uniquely labeled speciation gene trees, which is relevant to many biological applications, is also NP-hard. Therefore, we introduce new avenues in the field of supertrees, and set the theoretical basis for the exploration of various algorithmic aspects of the problems. BioMed Central 2015-10-02 /pmc/articles/PMC4602317/ /pubmed/26451911 http://dx.doi.org/10.1186/1471-2105-16-S14-S4 Text en Copyright © 2015 Lafond et al. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Lafond, Manuel
Ouangraoua, Aïda
El-Mabrouk, Nadia
Reconstructing a SuperGeneTree minimizing reconciliation
title Reconstructing a SuperGeneTree minimizing reconciliation
title_full Reconstructing a SuperGeneTree minimizing reconciliation
title_fullStr Reconstructing a SuperGeneTree minimizing reconciliation
title_full_unstemmed Reconstructing a SuperGeneTree minimizing reconciliation
title_short Reconstructing a SuperGeneTree minimizing reconciliation
title_sort reconstructing a supergenetree minimizing reconciliation
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4602317/
https://www.ncbi.nlm.nih.gov/pubmed/26451911
http://dx.doi.org/10.1186/1471-2105-16-S14-S4
work_keys_str_mv AT lafondmanuel reconstructingasupergenetreeminimizingreconciliation
AT ouangraouaaida reconstructingasupergenetreeminimizingreconciliation
AT elmabrouknadia reconstructingasupergenetreeminimizingreconciliation