Cargando…

Polytomy refinement for the correction of dubious duplications in gene trees

Motivation: Large-scale methods for inferring gene trees are error-prone. Correcting gene trees for weakly supported features often results in non-binary trees, i.e. trees with polytomies, thus raising the natural question of refining such polytomies into binary trees. A feature pointing toward pote...

Descripción completa

Detalles Bibliográficos
Autores principales: Lafond, Manuel, Chauve, Cedric, Dondi, Riccardo, El-Mabrouk, Nadia
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147916/
https://www.ncbi.nlm.nih.gov/pubmed/25161242
http://dx.doi.org/10.1093/bioinformatics/btu463
Descripción
Sumario:Motivation: Large-scale methods for inferring gene trees are error-prone. Correcting gene trees for weakly supported features often results in non-binary trees, i.e. trees with polytomies, thus raising the natural question of refining such polytomies into binary trees. A feature pointing toward potential errors in gene trees are duplications that are not supported by the presence of multiple gene copies. Results: We introduce the problem of refining polytomies in a gene tree while minimizing the number of created non-apparent duplications in the resulting tree. We show that this problem can be described as a graph-theoretical optimization problem. We provide a bounded heuristic with guaranteed optimality for well-characterized instances. We apply our algorithm to a set of ray-finned fish gene trees from the Ensembl database to illustrate its ability to correct dubious duplications. Availability and implementation: The C++ source code for the algorithms and simulations described in the article are available at http://www-ens.iro.umontreal.ca/~lafonman/software.php. Contact: lafonman@iro.umontreal.ca or mabrouk@iro.umontreal.ca Supplementary information: Supplementary data are available at Bioinformatics online.