Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
SIMPLE SUMMARY: We show how the conventional (Euclidean) deep learning methods developed for phylogenetics can benefit from using hyperbolic geometry. The results point to lowered distance distortion and better accuracy in updating trees but not necessarily for phylogenetic placement. ABSTRACT: Phyl...
Main authors: | Jiang, Yueyu; Tabaghi, Puoya; Mirarab, Siavash |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9495508/ https://www.ncbi.nlm.nih.gov/pubmed/36138735 http://dx.doi.org/10.3390/biology11091256 |
_version_ | 1784794035044483072 |
---|---|
author | Jiang, Yueyu Tabaghi, Puoya Mirarab, Siavash |
author_facet | Jiang, Yueyu Tabaghi, Puoya Mirarab, Siavash |
author_sort | Jiang, Yueyu |
collection | PubMed |
description | SIMPLE SUMMARY: We show how the conventional (Euclidean) deep learning methods developed for phylogenetics can benefit from using hyperbolic geometry. The results point to lowered distance distortion and better accuracy in updating trees but not necessarily for phylogenetic placement. ABSTRACT: Phylogenetic placement, used widely in ecological analyses, seeks to add a new species to an existing tree. A deep learning approach was previously proposed to estimate the distance between query and backbone species by building a map from gene sequences to a high-dimensional space that preserves species tree distances. They then use a distance-based placement method to place the queries on that species tree. In this paper, we examine the appropriate geometry for faithfully representing tree distances while embedding gene sequences. Theory predicts that hyperbolic spaces should provide a drastic reduction in distance distortion compared to the conventional Euclidean space. Nevertheless, hyperbolic embedding imposes its own unique challenges related to arithmetic operations, exponentially-growing functions, and limited bit precision, and we address these challenges. Our results confirm that hyperbolic embeddings have substantially lower distance errors than Euclidean space. However, these better-estimated distances do not always lead to better phylogenetic placement. We then show that the deep learning framework can be used not just to place on a backbone tree but to update it to obtain a fully resolved tree. With our hyperbolic embedding framework, species trees can be updated remarkably accurately with only a handful of genes. |
format | Online Article Text |
id | pubmed-9495508 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-94955082022-09-23 Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates Jiang, Yueyu Tabaghi, Puoya Mirarab, Siavash Biology (Basel) Article SIMPLE SUMMARY: We show how the conventional (Euclidean) deep learning methods developed for phylogenetics can benefit from using hyperbolic geometry. The results point to lowered distance distortion and better accuracy in updating trees but not necessarily for phylogenetic placement. ABSTRACT: Phylogenetic placement, used widely in ecological analyses, seeks to add a new species to an existing tree. A deep learning approach was previously proposed to estimate the distance between query and backbone species by building a map from gene sequences to a high-dimensional space that preserves species tree distances. They then use a distance-based placement method to place the queries on that species tree. In this paper, we examine the appropriate geometry for faithfully representing tree distances while embedding gene sequences. Theory predicts that hyperbolic spaces should provide a drastic reduction in distance distortion compared to the conventional Euclidean space. Nevertheless, hyperbolic embedding imposes its own unique challenges related to arithmetic operations, exponentially-growing functions, and limited bit precision, and we address these challenges. Our results confirm that hyperbolic embeddings have substantially lower distance errors than Euclidean space. However, these better-estimated distances do not always lead to better phylogenetic placement. We then show that the deep learning framework can be used not just to place on a backbone tree but to update it to obtain a fully resolved tree. With our hyperbolic embedding framework, species trees can be updated remarkably accurately with only a handful of genes. MDPI 2022-08-24 /pmc/articles/PMC9495508/ /pubmed/36138735 http://dx.doi.org/10.3390/biology11091256 Text en © 2022 by the authors. 
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Jiang, Yueyu Tabaghi, Puoya Mirarab, Siavash Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title | Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title_full | Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title_fullStr | Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title_full_unstemmed | Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title_short | Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates |
title_sort | learning hyperbolic embedding for phylogenetic tree placement and updates |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9495508/ https://www.ncbi.nlm.nih.gov/pubmed/36138735 http://dx.doi.org/10.3390/biology11091256 |
work_keys_str_mv | AT jiangyueyu learninghyperbolicembeddingforphylogenetictreeplacementandupdates AT tabaghipuoya learninghyperbolicembeddingforphylogenetictreeplacementandupdates AT mirarabsiavash learninghyperbolicembeddingforphylogenetictreeplacementandupdates |