Geodesics to characterize the phylogenetic landscape

Phylogenetic trees are fundamental for understanding evolutionary history. However, finding maximum likelihood trees is challenging due to the complexity of the likelihood landscape and the size of tree space. Based on the Billera-Holmes-Vogtmann (BHV) distance between trees, we describe a method to...

Bibliographic Details
Main authors: Khodaei, Marzieh; Owen, Megan; Beerli, Peter
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289362/
https://www.ncbi.nlm.nih.gov/pubmed/37352194
http://dx.doi.org/10.1371/journal.pone.0287350
collection PubMed
description Phylogenetic trees are fundamental for understanding evolutionary history. However, finding maximum likelihood trees is challenging due to the complexity of the likelihood landscape and the size of treespace. Based on the Billera-Holmes-Vogtmann (BHV) distance between trees, we describe a method to generate intermediate trees, called pathtrees, on the shortest path between two trees. These pathtrees give a structured way to generate and visualize part of treespace. They allow investigating intermediate regions between trees of interest, exploring locally optimal trees in topological clusters of treespace, and potentially finding trees of high likelihood unexplored by tree search algorithms. We compared our approach against other tree search tools (PAUP*, RAxML, and RevBayes) in terms of the highest-likelihood trees and the number of new topologies found, and validated the accuracy of the generated treespace. We assessed our method using two datasets. The first consists of 23 primate species (CytB, 1141 bp), leading to well-resolved relationships. The second is a dataset of 182 milksnakes (CytB, 1117 bp), containing many similar sequences and complex relationships among individuals. Our method visualizes the treespace using log likelihood as a fitness function. It finds trees as optimal as those found by heuristic methods and presents the likelihood landscape at different scales. It found relevant trees that were not found with MCMC methods. The validation measures indicated that our method performed well in mapping treespace into lower dimensions. Our method complements heuristic search analyses, and the visualization allows the inspection of likelihood terraces and the exploration of treespace areas not visited by heuristic searches.
format Online
Article
Text
id pubmed-10289362
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10289362 2023-06-24 Geodesics to characterize the phylogenetic landscape. Khodaei, Marzieh; Owen, Megan; Beerli, Peter. PLoS One. Research Article.
Public Library of Science 2023-06-23 /pmc/articles/PMC10289362/ /pubmed/37352194 http://dx.doi.org/10.1371/journal.pone.0287350 Text en © 2023 Khodaei et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Geodesics to characterize the phylogenetic landscape
topic Research Article
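
Illustrative note on the pathtrees mentioned in the description: the sketch below is not the authors' implementation. It covers only the simplest case, two trees with the same topology, where the BHV geodesic is a straight line in branch-length space and intermediate trees are linear interpolations of the branch lengths. The dict-based tree representation and the function names are hypothetical, chosen for this example only. Trees with different topologies need a full geodesic algorithm, which this sketch does not attempt.

```python
import math

# Hypothetical representation: a tree is a dict mapping each split
# (a frozenset of the taxa on one side of an edge) to its branch length.


def bhv_distance_same_topology(t1, t2):
    """Euclidean distance between branch-length vectors.

    Equals the BHV distance only when both trees have the same splits,
    taken over whichever edges are included in the dicts.
    """
    if t1.keys() != t2.keys():
        raise ValueError("sketch only handles trees with identical topology")
    return math.sqrt(sum((t1[s] - t2[s]) ** 2 for s in t1))


def pathtrees_same_topology(t1, t2, n):
    """Return n trees evenly spaced strictly between t1 and t2."""
    if t1.keys() != t2.keys():
        raise ValueError("sketch only handles trees with identical topology")
    trees = []
    for i in range(1, n + 1):
        lam = i / (n + 1)  # position along the path, 0 < lam < 1
        trees.append({s: (1 - lam) * t1[s] + lam * t2[s] for s in t1})
    return trees


# Two made-up trees on taxa A..E sharing the interior splits AB|CDE and ABC|DE.
tree_a = {frozenset({"A", "B"}): 1.0, frozenset({"A", "B", "C"}): 2.0}
tree_b = {frozenset({"A", "B"}): 4.0, frozenset({"A", "B", "C"}): 6.0}

print(bhv_distance_same_topology(tree_a, tree_b))
for tree in pathtrees_same_topology(tree_a, tree_b, 3):
    print(sorted(tree.values()))
```

Each intermediate tree could then be scored with a likelihood function to trace the landscape along the path; that scoring step is outside this sketch.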