Extracting conflict-free information from multi-labeled trees

BACKGROUND: A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and ye...

Bibliographic Details
Main authors: Deepak, Akshay, Fernández-Baca, David, McMahon, Michelle M
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3716922/
https://www.ncbi.nlm.nih.gov/pubmed/23837994
http://dx.doi.org/10.1186/1748-7188-8-18
author Deepak, Akshay
Fernández-Baca, David
McMahon, Michelle M
collection PubMed
description BACKGROUND: A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. RESULTS: We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved. CONCLUSIONS: Our measure of conflict-free information content based on quartets is simple and topologically appealing. In the experiments, the maximally reduced form is often much smaller than the original tree, yet retains most of the taxa. The reduction algorithm is quadratic in the number of leaves and its complexity is unaffected by the multiplicity of leaf labels or the degree of the nodes.
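The definition of information content in the abstract can be illustrated with a brute-force sketch. It rests on assumptions that the record does not confirm: the MUL-tree is treated as unrooted, a quartet topology on four distinct labels counts as implied if some choice of one leaf per label induces it, unresolved choices are ignored, and a quartet counts as conflict-free when exactly one resolved topology is implied. The function names and the example tree are hypothetical, and this enumeration is not the quadratic reduction algorithm the abstract describes.

```python
from collections import deque
from itertools import combinations, product


def leaf_distances(adj, leaves):
    """Edge-count distances from every leaf to every node, via BFS."""
    table = {}
    for source in leaves:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        table[source] = dist
    return table


def induced_split(dist, w, x, y, z):
    """Pairing of four leaves induced by the tree, or None if unresolved.

    Uses the four-point condition: in a tree, the pairing with the
    strictly smallest distance sum is the induced quartet topology.
    """
    sums = {
        ((w, x), (y, z)): dist[w][x] + dist[y][z],
        ((w, y), (x, z)): dist[w][y] + dist[x][z],
        ((w, z), (x, y)): dist[w][z] + dist[x][y],
    }
    best = min(sums, key=sums.get)
    if all(value == sums[best] for value in sums.values()):
        return None  # all three sums equal: star (unresolved) topology
    return best


def conflict_free_quartets(adj, label_of):
    """Label quartets for which exactly one resolved topology is implied."""
    leaves_by_label = {}
    for leaf, label in label_of.items():
        leaves_by_label.setdefault(label, []).append(leaf)
    dist = leaf_distances(adj, label_of)
    result = set()
    for labels in combinations(sorted(leaves_by_label), 4):
        topologies = set()
        # Try every way of picking one leaf per label (assumed reading).
        for choice in product(*(leaves_by_label[lab] for lab in labels)):
            split = induced_split(dist, *choice)
            if split is not None:
                topologies.add(frozenset(
                    frozenset(label_of[leaf] for leaf in side)
                    for side in split
                ))
        if len(topologies) == 1:
            result.add(topologies.pop())
    return result


# Hypothetical caterpillar MUL-tree in which label "a" appears on two leaves.
edges = [("u", "a1"), ("u", "b1"), ("u", "v"), ("v", "c1"), ("v", "w"),
         ("w", "d1"), ("w", "t"), ("t", "e1"), ("t", "a2")]
adj = {}
for p, q in edges:
    adj.setdefault(p, []).append(q)
    adj.setdefault(q, []).append(p)
label_of = {"a1": "a", "b1": "b", "c1": "c", "d1": "d", "e1": "e", "a2": "a"}

quartets = conflict_free_quartets(adj, label_of)
```

In this example every quartet that involves the duplicated label a receives two different topologies, one per copy of a, so under the stated assumptions only bc|de remains conflict-free.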
format Online
Article
Text
id pubmed-3716922
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3716922 2013-07-23 Extracting conflict-free information from multi-labeled trees Deepak, Akshay Fernández-Baca, David McMahon, Michelle M Algorithms Mol Biol Research BioMed Central 2013-07-09 /pmc/articles/PMC3716922/ /pubmed/23837994 http://dx.doi.org/10.1186/1748-7188-8-18 Text en Copyright © 2013 Deepak et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extracting conflict-free information from multi-labeled trees
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3716922/
https://www.ncbi.nlm.nih.gov/pubmed/23837994
http://dx.doi.org/10.1186/1748-7188-8-18