Testing the agreement of trees with internal labels

Bibliographic Details
Main Authors: Fernández-Baca, David; Liu, Lei
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8643029/
https://www.ncbi.nlm.nih.gov/pubmed/34863219
http://dx.doi.org/10.1186/s13015-021-00201-9
Description
Summary: BACKGROUND: A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a collection [Formula: see text] of semi-labeled trees, called input trees, over partially overlapping sets of taxa. The agreement problem asks whether there exists a tree [Formula: see text], called an agreement tree, whose taxon set is the union of the taxon sets of the input trees such that the restriction of [Formula: see text] to the taxon set of [Formula: see text] is isomorphic to [Formula: see text], for each [Formula: see text]. The agreement problem is a special case of the supertree problem, the problem of synthesizing a collection of phylogenetic trees with partially overlapping taxon sets into a single supertree that represents the information in the input trees. An obstacle to building large phylogenetic supertrees is the limited amount of taxonomic overlap among the phylogenetic studies from which the input trees are obtained. Incorporating taxonomies into supertree analyses can alleviate this issue. RESULTS: We give a [Formula: see text] algorithm for the agreement problem, where n is the total number of distinct taxa in [Formula: see text], k is the number of trees in [Formula: see text], and [Formula: see text] is the maximum number of children of a node in [Formula: see text]. CONCLUSION: Our algorithm can aid in integrating taxonomies into supertree analyses. Our computational experience with the algorithm suggests that its performance in practice is much better than its worst-case bound indicates.
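
Illustrative note: the agreement condition in the summary can be made concrete with a small brute-force check. The sketch below is not the authors' algorithm, which decides whether any agreement tree exists. It only verifies one given candidate tree against the input trees. The tree encoding (a pair of an optional label and a list of children), the rule used for restriction (drop unwanted labels, then suppress unlabeled nodes left with a single child), and all function names are assumptions made for illustration.

```python
# A semi-labeled tree is encoded here as (label_or_None, [child_trees]).
# This encoding and every function name below are hypothetical.

def taxa(tree):
    """Return the set of labels appearing anywhere in the tree."""
    label, children = tree
    found = set() if label is None else {label}
    for child in children:
        found |= taxa(child)
    return found

def restrict(tree, keep):
    """Restrict a tree to the labels in `keep`.

    Assumed rule: a node survives if its label is in `keep` or if at
    least two of its children survive; an unlabeled node with exactly
    one surviving child is replaced by that child. Returns None when
    nothing survives.
    """
    label, children = tree
    kept = [r for r in (restrict(c, keep) for c in children) if r is not None]
    if label in keep:
        return (label, kept)
    if len(kept) >= 2:
        return (None, kept)
    if len(kept) == 1:
        return kept[0]
    return None

def canonical(tree):
    """Order-independent form, so equal forms mean isomorphic trees."""
    label, children = tree
    return (label or "", tuple(sorted(canonical(c) for c in children)))

def is_agreement_tree(candidate, inputs):
    """Check the agreement condition for one candidate tree."""
    all_taxa = set().union(*(taxa(t) for t in inputs))
    if taxa(candidate) != all_taxa:
        return False
    for t in inputs:
        r = restrict(candidate, taxa(t))
        if r is None or canonical(r) != canonical(t):
            return False
    return True

# A taxonomy with an internal label and an ordinary phylogenetic tree.
taxonomy = ("Mammalia", [("Homo", []), ("Mus", [])])
phylogeny = (None, [(None, [("Homo", []), ("Pan", [])]), ("Mus", [])])

good = ("Mammalia", [(None, [("Homo", []), ("Pan", [])]), ("Mus", [])])
bad = ("Mammalia", [(None, [("Mus", []), ("Pan", [])]), ("Homo", [])])

print(is_agreement_tree(good, [taxonomy, phylogeny]))
print(is_agreement_tree(bad, [taxonomy, phylogeny]))
```

In this toy example the first candidate groups Homo with Pan, as the phylogeny requires, while the second groups Mus with Pan and so fails the check. Enumerating candidates this way is impractical for large taxon sets, which is the motivation for an efficient decision algorithm.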