Cargando…

Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios

BACKGROUND: A long recognized problem is the inference of the supertree S that amalgamates a given set {G(j)} of trees G(j), with leaves in each G(j) being assigned homologous elements. We ground on an approach to find the tree S by minimizing the total cost of mappings α(j) of individual gene trees...

Descripción completa

Detalles Bibliográficos
Autores principales: Lyubetsky, Vassily A, Rubanov, Lev I, Rusin, Leonid Y, Gorbunov, Konstantin Yu
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2012
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3577452/
https://www.ncbi.nlm.nih.gov/pubmed/23259766
http://dx.doi.org/10.1186/1745-6150-7-48
_version_ 1782259914261921792
author Lyubetsky, Vassily A
Rubanov, Lev I
Rusin, Leonid Y
Gorbunov, Konstantin Yu
author_facet Lyubetsky, Vassily A
Rubanov, Lev I
Rusin, Leonid Y
Gorbunov, Konstantin Yu
author_sort Lyubetsky, Vassily A
collection PubMed
description BACKGROUND: A long recognized problem is the inference of the supertree S that amalgamates a given set {G(j)} of trees G(j), with leaves in each G(j) being assigned homologous elements. We ground on an approach to find the tree S by minimizing the total cost of mappings α(j) of individual gene trees G(j) into S. Traditionally, this cost is defined basically as a sum of duplications and gaps in each α(j). The classical problem is to minimize the total cost, where S runs over the set of all trees that contain an exhaustive non-redundant set of species from all input G(j). RESULTS: We suggest a reformulation of the classical NP-hard problem of building a supertree in terms of the global minimization of the same cost functional but only over species trees S that consist of clades belonging to a fixed set P (e.g., an exhaustive set of clades in all G(j)). We developed a deterministic solving algorithm with a low degree polynomial (typically cubic) time complexity with respect to the size of input data. We define an extensive set of elementary evolutionary events and suggest an original definition of mapping β of tree G into tree S. We introduce the cost functional c(G, S, f ) and define the mapping β as the global minimum of this functional with respect to the variable f, in which sense it is a generalization of classical mapping α. We suggest a reformulation of the classical NP-hard mapping (reconciliation) problem by introducing time slices into the species tree S and present a cubic time solving algorithm to compute the mapping β. We introduce two novel definitions of the evolutionary scenario based on mapping β or a random process of gene evolution along a species tree. CONCLUSIONS: Developed algorithms are mathematically proved, which justifies the following statements. The supertree building algorithm finds exactly the global minimum of the total cost if only gene duplications and losses are allowed and the given sets of gene trees satisfies a certain condition. The mapping algorithm finds exactly the minimal mapping β, the minimal total cost and the evolutionary scenario as a minimum over all possible distributions of elementary evolutionary events along the edges of tree S. The algorithms and their effective software implementations provide useful tools in many biological studies. They facilitate processing of voluminous tree data in acceptable time still largely avoiding heuristics. Performance of the tools is tested with artificial and prokaryotic tree data. REVIEWERS: This article was reviewed by Prof. Anthony Almudevar, Prof. Alexander Bolshoy (nominated by Prof. Peter Olofsson), and Prof. Marek Kimmel.
format Online
Article
Text
id pubmed-3577452
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-35774522013-02-26 Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios Lyubetsky, Vassily A Rubanov, Lev I Rusin, Leonid Y Gorbunov, Konstantin Yu Biol Direct Research BACKGROUND: A long recognized problem is the inference of the supertree S that amalgamates a given set {G(j)} of trees G(j), with leaves in each G(j) being assigned homologous elements. We ground on an approach to find the tree S by minimizing the total cost of mappings α(j) of individual gene trees G(j) into S. Traditionally, this cost is defined basically as a sum of duplications and gaps in each α(j). The classical problem is to minimize the total cost, where S runs over the set of all trees that contain an exhaustive non-redundant set of species from all input G(j). RESULTS: We suggest a reformulation of the classical NP-hard problem of building a supertree in terms of the global minimization of the same cost functional but only over species trees S that consist of clades belonging to a fixed set P (e.g., an exhaustive set of clades in all G(j)). We developed a deterministic solving algorithm with a low degree polynomial (typically cubic) time complexity with respect to the size of input data. We define an extensive set of elementary evolutionary events and suggest an original definition of mapping β of tree G into tree S. We introduce the cost functional c(G, S, f ) and define the mapping β as the global minimum of this functional with respect to the variable f, in which sense it is a generalization of classical mapping α. We suggest a reformulation of the classical NP-hard mapping (reconciliation) problem by introducing time slices into the species tree S and present a cubic time solving algorithm to compute the mapping β. We introduce two novel definitions of the evolutionary scenario based on mapping β or a random process of gene evolution along a species tree. CONCLUSIONS: Developed algorithms are mathematically proved, which justifies the following statements. The supertree building algorithm finds exactly the global minimum of the total cost if only gene duplications and losses are allowed and the given sets of gene trees satisfies a certain condition. The mapping algorithm finds exactly the minimal mapping β, the minimal total cost and the evolutionary scenario as a minimum over all possible distributions of elementary evolutionary events along the edges of tree S. The algorithms and their effective software implementations provide useful tools in many biological studies. They facilitate processing of voluminous tree data in acceptable time still largely avoiding heuristics. Performance of the tools is tested with artificial and prokaryotic tree data. REVIEWERS: This article was reviewed by Prof. Anthony Almudevar, Prof. Alexander Bolshoy (nominated by Prof. Peter Olofsson), and Prof. Marek Kimmel. BioMed Central 2012-12-22 /pmc/articles/PMC3577452/ /pubmed/23259766 http://dx.doi.org/10.1186/1745-6150-7-48 Text en Copyright ©2012 Lyubetsky et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Lyubetsky, Vassily A
Rubanov, Lev I
Rusin, Leonid Y
Gorbunov, Konstantin Yu
Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title_full Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title_fullStr Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title_full_unstemmed Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title_short Cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
title_sort cubic time algorithms of amalgamating gene trees and building evolutionary scenarios
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3577452/
https://www.ncbi.nlm.nih.gov/pubmed/23259766
http://dx.doi.org/10.1186/1745-6150-7-48
work_keys_str_mv AT lyubetskyvassilya cubictimealgorithmsofamalgamatinggenetreesandbuildingevolutionaryscenarios
AT rubanovlevi cubictimealgorithmsofamalgamatinggenetreesandbuildingevolutionaryscenarios
AT rusinleonidy cubictimealgorithmsofamalgamatinggenetreesandbuildingevolutionaryscenarios
AT gorbunovkonstantinyu cubictimealgorithmsofamalgamatinggenetreesandbuildingevolutionaryscenarios