Phylogenetic diversity statistics for all clades in a phylogeny

The classic quantitative measure of phylogenetic diversity (PD) has been used to address problems in conservation biology, microbial ecology, and evolutionary biology. PD is the minimum total length of the branches in a phylogeny required to cover a specified set of taxa on the phylogeny. A general...

Bibliographic Details
Main Authors: Grover, Siddhant; Markin, Alexey; Anderson, Tavis K; Eulenstein, Oliver
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311342/
https://www.ncbi.nlm.nih.gov/pubmed/37387175
http://dx.doi.org/10.1093/bioinformatics/btad263
author Grover, Siddhant
Markin, Alexey
Anderson, Tavis K
Eulenstein, Oliver
collection PubMed
description The classic quantitative measure of phylogenetic diversity (PD) has been used to address problems in conservation biology, microbial ecology, and evolutionary biology. PD is the minimum total length of the branches in a phylogeny required to cover a specified set of taxa on the phylogeny. A general goal in the application of PD has been identifying a set of taxa of size k that maximize PD on a given phylogeny; this has been mirrored in active research to develop efficient algorithms for the problem. Other descriptive statistics, such as the minimum PD, average PD, and standard deviation of PD, can provide invaluable insight into the distribution of PD across a phylogeny (relative to a fixed value of k). However, there has been limited or no research on computing these statistics, especially when required for each clade in a phylogeny, enabling direct comparisons of PD between clades. We introduce efficient algorithms for computing PD and the associated descriptive statistics for a given phylogeny and each of its clades. In simulation studies, we demonstrate the ability of our algorithms to analyze large-scale phylogenies with applications in ecology and evolutionary biology. The software is available at https://github.com/flu-crew/PD_stats.
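The definition of PD and the per-k statistics named in the description can be illustrated with a brute-force sketch. This is not the authors' efficient algorithm and does not use the PD_stats software; it simply enumerates every set of k taxa on a tiny tree. The toy tree, the names `TREE`, `LEAVES`, `leaves_below`, `pd`, and `pd_stats`, the `include_root` switch (unrooted versus rooted PD), and the use of the population standard deviation are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
TREE = {
    "A": ("x", 1.0),
    "B": ("x", 2.0),
    "x": ("root", 3.0),
    "C": ("root", 4.0),
}
LEAVES = ["A", "B", "C"]


def leaves_below(tree, leaves):
    """Map each non-root node to the set of leaves in its subtree."""
    below = {node: set() for node in tree}
    for leaf in leaves:
        node = leaf
        while node in tree:  # walk upward until the root is reached
            below[node].add(leaf)
            node = tree[node][0]
    return below


def pd(tree, below, taxa, include_root=False):
    """Total length of the branches needed to cover `taxa`.

    Unrooted variant (default): a branch counts only if it separates two
    chosen taxa. Rooted variant: every branch above a chosen taxon counts.
    """
    taxa = set(taxa)
    total = 0.0
    for node, (_parent, length) in tree.items():
        inside = taxa & below[node]    # chosen taxa under this branch
        outside = taxa - below[node]   # chosen taxa elsewhere in the tree
        if inside and (include_root or outside):
            total += length
    return total


def pd_stats(tree, leaves, k, include_root=False):
    """Brute-force min, max, mean, and population std of PD over all k-subsets."""
    below = leaves_below(tree, leaves)
    values = [pd(tree, below, subset, include_root)
              for subset in combinations(leaves, k)]
    return {"min": min(values), "max": max(values),
            "mean": mean(values), "std": pstdev(values)}


if __name__ == "__main__":
    print(pd_stats(TREE, LEAVES, k=2))
```

The enumeration visits every k-subset, so its cost grows combinatorially with the number of taxa; it is only meant to make the definitions concrete on small inputs. For the unrooted variant, passing only the leaves of one clade (for example `["A", "B"]`) restricts the statistics to that clade.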
format Online
Article
Text
id pubmed-10311342
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10311342 2023-07-01 Phylogenetic diversity statistics for all clades in a phylogeny. Grover, Siddhant; Markin, Alexey; Anderson, Tavis K; Eulenstein, Oliver. Bioinformatics. Evolutionary, Comparative and Population Genomics. Oxford University Press 2023-06-30 /pmc/articles/PMC10311342/ /pubmed/37387175 http://dx.doi.org/10.1093/bioinformatics/btad263 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Phylogenetic diversity statistics for all clades in a phylogeny
topic Evolutionary, Comparative and Population Genomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311342/
https://www.ncbi.nlm.nih.gov/pubmed/37387175
http://dx.doi.org/10.1093/bioinformatics/btad263