Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf

BACKGROUND: Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair...

Bibliographic Details
Main authors: Cardona, Gabriel, Mir, Arnau, Rosselló, Francesc, Rotger, Lucía, Sánchez, David
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3716993/
https://www.ncbi.nlm.nih.gov/pubmed/23323711
http://dx.doi.org/10.1186/1471-2105-14-3
_version_ 1782277635732144128
author Cardona, Gabriel
Mir, Arnau
Rosselló, Francesc
Rotger, Lucía
Sánchez, David
author_facet Cardona, Gabriel
Mir, Arnau
Rosselló, Francesc
Rotger, Lucía
Sánchez, David
author_sort Cardona, Gabriel
collection PubMed
description BACKGROUND: Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa. RESULTS: For every (rooted) phylogenetic tree T, let its cophenetic vector φ(T) consist of all pairs of cophenetic values between pairs of taxa in T and all depths of taxa in T. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics d_{φ,p} by comparing these cophenetic vectors by means of L^p norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics. CONCLUSIONS: The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in O(n^2) time, where n stands for the number of taxa. The metrics d_{φ,1} and d_{φ,2} have positive skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of d_{φ,p}, for [Formula: see text], is in O(n^{(p+2)/p}), and thus for low p they are more discriminative, having a wider range of values.
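The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: a tree is given as a hypothetical `parent` dictionary mapping each node to a `(parent, arc_weight)` pair (the root maps to `None`), taxa are node names, the cophenetic value of a pair of taxa is taken to be the depth of their lowest common ancestor, and the entries of the vector are ordered by sorted taxon pairs. The function names `depths`, `ancestors`, `cophenetic_vector` and `cophenetic_distance` are invented here. The sketch favors clarity over speed and does not attain the O(n^2) bound stated in the abstract.

```python
from itertools import combinations_with_replacement


def depths(parent):
    """Depth of every node: sum of arc weights on the path from the root."""
    memo = {}

    def depth(v):
        if v not in memo:
            p = parent[v]
            memo[v] = 0.0 if p is None else depth(p[0]) + p[1]
        return memo[v]

    return {v: depth(v) for v in parent}


def ancestors(parent, v):
    """Set containing v itself and all of its ancestors up to the root."""
    out = set()
    while v is not None:
        out.add(v)
        p = parent[v]
        v = None if p is None else p[0]
    return out


def cophenetic_vector(parent, taxa):
    """Cophenetic values for all pairs i <= j of taxa (i == j gives the depth of i)."""
    d = depths(parent)
    anc = {t: ancestors(parent, t) for t in taxa}
    # With non-negative weights, the deepest common ancestor is the LCA.
    return [max(d[a] for a in anc[i] & anc[j])
            for i, j in combinations_with_replacement(sorted(taxa), 2)]


def cophenetic_distance(tree1, tree2, taxa, p=1):
    """L^p norm (p >= 1) of the difference of the two cophenetic vectors."""
    v1 = cophenetic_vector(tree1, taxa)
    v2 = cophenetic_vector(tree2, taxa)
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1.0 / p)


# Two toy trees on taxa a, b, c with unit arc weights (assumed example data).
TAXA = ["a", "b", "c"]
T1 = {"r": None, "u": ("r", 1), "a": ("u", 1), "b": ("u", 1), "c": ("r", 1)}  # ((a,b),c)
T2 = {"r": None, "v": ("r", 1), "a": ("r", 1), "b": ("v", 1), "c": ("v", 1)}  # (a,(b,c))

print(cophenetic_vector(T1, TAXA))            # [2.0, 1.0, 0.0, 2.0, 0.0, 1.0]
print(cophenetic_distance(T1, T2, TAXA, 1))   # 4.0
print(cophenetic_distance(T1, T2, TAXA, 2))   # 2.0
```

Because a node counts as its own ancestor, the same code handles nested taxa (a taxon labelling an internal node) without a special case.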
format Online
Article
Text
id pubmed-3716993
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3716993 2013-07-23 Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf Cardona, Gabriel Mir, Arnau Rosselló, Francesc Rotger, Lucía Sánchez, David BMC Bioinformatics Methodology Article BACKGROUND: Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa. RESULTS: For every (rooted) phylogenetic tree T, let its cophenetic vector φ(T) consist of all pairs of cophenetic values between pairs of taxa in T and all depths of taxa in T. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics d_{φ,p} by comparing these cophenetic vectors by means of L^p norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics. CONCLUSIONS: The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in O(n^2) time, where n stands for the number of taxa. The metrics d_{φ,1} and d_{φ,2} have positive skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of d_{φ,p}, for [Formula: see text], is in O(n^{(p+2)/p}), and thus for low p they are more discriminative, having a wider range of values. BioMed Central 2013-01-16 /pmc/articles/PMC3716993/ /pubmed/23323711 http://dx.doi.org/10.1186/1471-2105-14-3 Text en Copyright © 2013 Cardona et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Cardona, Gabriel
Mir, Arnau
Rosselló, Francesc
Rotger, Lucía
Sánchez, David
Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title_full Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title_fullStr Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title_full_unstemmed Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title_short Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf
title_sort cophenetic metrics for phylogenetic trees, after sokal and rohlf
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3716993/
https://www.ncbi.nlm.nih.gov/pubmed/23323711
http://dx.doi.org/10.1186/1471-2105-14-3
work_keys_str_mv AT cardonagabriel copheneticmetricsforphylogenetictreesaftersokalandrohlf
AT mirarnau copheneticmetricsforphylogenetictreesaftersokalandrohlf
AT rossellofrancesc copheneticmetricsforphylogenetictreesaftersokalandrohlf
AT rotgerlucia copheneticmetricsforphylogenetictreesaftersokalandrohlf
AT sanchezdavid copheneticmetricsforphylogenetictreesaftersokalandrohlf