Cargando…
A generalized Robinson-Foulds distance for labeled trees
BACKGROUND: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial as...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7677779/ https://www.ncbi.nlm.nih.gov/pubmed/33208096 http://dx.doi.org/10.1186/s12864-020-07011-0 |
_version_ | 1783612046915403776 |
---|---|
author | Briand, Samuel Dessimoz, Christophe El-Mabrouk, Nadia Lafond, Manuel Lobinska, Gabriela |
author_facet | Briand, Samuel Dessimoz, Christophe El-Mabrouk, Nadia Lafond, Manuel Lobinska, Gabriela |
author_sort | Briand, Samuel |
collection | PubMed |
description | BACKGROUND: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.g. speciation, duplication, transfer, etc). RESULTS: We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting “good” edges, i.e. edges shared between the two trees. CONCLUSIONS: We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions.Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf. |
format | Online Article Text |
id | pubmed-7677779 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-76777792020-11-20 A generalized Robinson-Foulds distance for labeled trees Briand, Samuel Dessimoz, Christophe El-Mabrouk, Nadia Lafond, Manuel Lobinska, Gabriela BMC Genomics Research BACKGROUND: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.g. speciation, duplication, transfer, etc). RESULTS: We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting “good” edges, i.e. edges shared between the two trees. CONCLUSIONS: We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions.Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf. BioMed Central 2020-11-18 /pmc/articles/PMC7677779/ /pubmed/33208096 http://dx.doi.org/10.1186/s12864-020-07011-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Briand, Samuel Dessimoz, Christophe El-Mabrouk, Nadia Lafond, Manuel Lobinska, Gabriela A generalized Robinson-Foulds distance for labeled trees |
title | A generalized Robinson-Foulds distance for labeled trees |
title_full | A generalized Robinson-Foulds distance for labeled trees |
title_fullStr | A generalized Robinson-Foulds distance for labeled trees |
title_full_unstemmed | A generalized Robinson-Foulds distance for labeled trees |
title_short | A generalized Robinson-Foulds distance for labeled trees |
title_sort | generalized robinson-foulds distance for labeled trees |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7677779/ https://www.ncbi.nlm.nih.gov/pubmed/33208096 http://dx.doi.org/10.1186/s12864-020-07011-0 |
work_keys_str_mv | AT briandsamuel ageneralizedrobinsonfouldsdistanceforlabeledtrees AT dessimozchristophe ageneralizedrobinsonfouldsdistanceforlabeledtrees AT elmabrouknadia ageneralizedrobinsonfouldsdistanceforlabeledtrees AT lafondmanuel ageneralizedrobinsonfouldsdistanceforlabeledtrees AT lobinskagabriela ageneralizedrobinsonfouldsdistanceforlabeledtrees AT briandsamuel generalizedrobinsonfouldsdistanceforlabeledtrees AT dessimozchristophe generalizedrobinsonfouldsdistanceforlabeledtrees AT elmabrouknadia generalizedrobinsonfouldsdistanceforlabeledtrees AT lafondmanuel generalizedrobinsonfouldsdistanceforlabeledtrees AT lobinskagabriela generalizedrobinsonfouldsdistanceforlabeledtrees |