Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates

SIMPLE SUMMARY: We show how the conventional (Euclidean) deep learning methods developed for phylogenetics can benefit from using hyperbolic geometry. The results point to lowered distance distortion and better accuracy in updating trees but not necessarily for phylogenetic placement. ABSTRACT: Phyl...

Bibliographic Details
Main Authors: Jiang, Yueyu; Tabaghi, Puoya; Mirarab, Siavash
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9495508/
https://www.ncbi.nlm.nih.gov/pubmed/36138735
http://dx.doi.org/10.3390/biology11091256
_version_ 1784794035044483072
author Jiang, Yueyu
Tabaghi, Puoya
Mirarab, Siavash
author_facet Jiang, Yueyu
Tabaghi, Puoya
Mirarab, Siavash
author_sort Jiang, Yueyu
collection PubMed
description SIMPLE SUMMARY: We show how the conventional (Euclidean) deep learning methods developed for phylogenetics can benefit from using hyperbolic geometry. The results point to lower distance distortion and better accuracy in updating trees, but not necessarily in phylogenetic placement. ABSTRACT: Phylogenetic placement, used widely in ecological analyses, seeks to add a new species to an existing tree. A deep learning approach was previously proposed to estimate the distance between query and backbone species by building a map from gene sequences to a high-dimensional space that preserves species tree distances. That approach then uses a distance-based placement method to place the queries on the species tree. In this paper, we examine the appropriate geometry for faithfully representing tree distances while embedding gene sequences. Theory predicts that hyperbolic spaces should provide a drastic reduction in distance distortion compared to the conventional Euclidean space. Nevertheless, hyperbolic embedding imposes its own unique challenges related to arithmetic operations, exponentially growing functions, and limited bit precision, and we address these challenges. Our results confirm that hyperbolic embeddings have substantially lower distance errors than Euclidean embeddings. However, these better-estimated distances do not always lead to better phylogenetic placement. We then show that the deep learning framework can be used not just to place on a backbone tree but to update it to obtain a fully resolved tree. With our hyperbolic embedding framework, species trees can be updated remarkably accurately with only a handful of genes.
format Online
Article
Text
id pubmed-9495508
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9495508 2022-09-23 Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates Jiang, Yueyu Tabaghi, Puoya Mirarab, Siavash Biology (Basel) Article MDPI 2022-08-24 /pmc/articles/PMC9495508/ /pubmed/36138735 http://dx.doi.org/10.3390/biology11091256 Text en © 2022 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Jiang, Yueyu
Tabaghi, Puoya
Mirarab, Siavash
Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title_full Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title_fullStr Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title_full_unstemmed Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title_short Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
title_sort learning hyperbolic embedding for phylogenetic tree placement and updates
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9495508/
https://www.ncbi.nlm.nih.gov/pubmed/36138735
http://dx.doi.org/10.3390/biology11091256
work_keys_str_mv AT jiangyueyu learninghyperbolicembeddingforphylogenetictreeplacementandupdates
AT tabaghipuoya learninghyperbolicembeddingforphylogenetictreeplacementandupdates
AT mirarabsiavash learninghyperbolicembeddingforphylogenetictreeplacementandupdates