Network science inspires novel tree shape statistics
The shape of phylogenetic trees can be used to gain evolutionary insights. A tree’s shape specifies the connectivity of a tree, while its branch lengths reflect either the time or genetic distance between branching events; well-known measures of tree shape include the Colless and Sackin imbalance, w...
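As an illustration of the two conventional imbalance measures named above, here is a minimal sketch that computes the unnormalized Sackin and Colless indices for a rooted binary tree. The nested-tuple tree representation and the function name `imbalance` are assumptions made for this example; they are not taken from the article or its R package.

```python
# Hypothetical representation (an assumption for this sketch):
# a leaf is a string label, an internal node is a pair (left, right).

def imbalance(tree):
    """Return (n_leaves, sackin, colless) for a rooted binary tree.

    Sackin:  sum over leaves of their depth (edges to the root).
    Colless: sum over internal nodes of |leaves_left - leaves_right|.
    Both are returned unnormalized.
    """
    if isinstance(tree, str):
        return 1, 0, 0
    left, right = tree
    n_left, sackin_left, colless_left = imbalance(left)
    n_right, sackin_right, colless_right = imbalance(right)
    n = n_left + n_right
    # Every leaf below this node sits one edge deeper than in its subtree.
    sackin = sackin_left + sackin_right + n
    colless = colless_left + colless_right + abs(n_left - n_right)
    return n, sackin, colless


caterpillar = ((("a", "b"), "c"), "d")   # maximally asymmetric, 4 leaves
balanced = (("a", "b"), ("c", "d"))      # fully symmetric, 4 leaves

print(imbalance(caterpillar))
print(imbalance(balanced))
```

The recursion is kept for readability; for very deep trees an explicit stack would avoid Python's recursion limit.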
Main authors: | Chindelevitch, Leonid; Hayati, Maryam; Poon, Art F. Y.; Colijn, Caroline |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8699983/ https://www.ncbi.nlm.nih.gov/pubmed/34941890 http://dx.doi.org/10.1371/journal.pone.0259877 |
_version_ | 1784620646123175936 |
---|---|
author | Chindelevitch, Leonid Hayati, Maryam Poon, Art F. Y. Colijn, Caroline |
author_facet | Chindelevitch, Leonid Hayati, Maryam Poon, Art F. Y. Colijn, Caroline |
author_sort | Chindelevitch, Leonid |
collection | PubMed |
description | The shape of phylogenetic trees can be used to gain evolutionary insights. A tree’s shape specifies the connectivity of a tree, while its branch lengths reflect either the time or genetic distance between branching events; well-known measures of tree shape include the Colless and Sackin imbalance, which describe the asymmetry of a tree. In other contexts, network science has become an important paradigm for describing structural features of networks and using them to understand complex systems, ranging from protein interactions to social systems. Network science is thus a potential source of many novel ways to characterize tree shape, as trees are also networks. Here, we tailor tools from network science, including diameter, average path length, and betweenness, closeness, and eigenvector centrality, to summarize phylogenetic tree shapes. We thereby propose tree shape summaries that are complementary to both asymmetry and the frequencies of small configurations. These new statistics can be computed in linear time and scale well to describe the shapes of large trees. We apply these statistics, alongside some conventional tree statistics, to phylogenetic trees from three very different viruses (HIV, dengue fever and measles), from the same virus in different epidemiological scenarios (influenza A and HIV) and from simulation models known to produce trees with different shapes. Using mutual information and supervised learning algorithms, we find that the statistics adapted from network science perform as well as or better than conventional statistics. We describe their distributions and prove some basic results about their extreme values in a tree. We conclude that network science-based tree shape summaries are a promising addition to the toolkit of tree shape features. All our shape summaries, as well as functions to select the most discriminating ones for two sets of trees, are freely available as an R package at http://github.com/Leonardini/treeCentrality. |
format | Online Article Text |
id | pubmed-8699983 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8699983 2021-12-24 Network science inspires novel tree shape statistics Chindelevitch, Leonid Hayati, Maryam Poon, Art F. Y. Colijn, Caroline PLoS One Research Article The shape of phylogenetic trees can be used to gain evolutionary insights. A tree’s shape specifies the connectivity of a tree, while its branch lengths reflect either the time or genetic distance between branching events; well-known measures of tree shape include the Colless and Sackin imbalance, which describe the asymmetry of a tree. In other contexts, network science has become an important paradigm for describing structural features of networks and using them to understand complex systems, ranging from protein interactions to social systems. Network science is thus a potential source of many novel ways to characterize tree shape, as trees are also networks. Here, we tailor tools from network science, including diameter, average path length, and betweenness, closeness, and eigenvector centrality, to summarize phylogenetic tree shapes. We thereby propose tree shape summaries that are complementary to both asymmetry and the frequencies of small configurations. These new statistics can be computed in linear time and scale well to describe the shapes of large trees. We apply these statistics, alongside some conventional tree statistics, to phylogenetic trees from three very different viruses (HIV, dengue fever and measles), from the same virus in different epidemiological scenarios (influenza A and HIV) and from simulation models known to produce trees with different shapes. Using mutual information and supervised learning algorithms, we find that the statistics adapted from network science perform as well as or better than conventional statistics. We describe their distributions and prove some basic results about their extreme values in a tree. We conclude that network science-based tree shape summaries are a promising addition to the toolkit of tree shape features. All our shape summaries, as well as functions to select the most discriminating ones for two sets of trees, are freely available as an R package at http://github.com/Leonardini/treeCentrality. Public Library of Science 2021-12-23 /pmc/articles/PMC8699983/ /pubmed/34941890 http://dx.doi.org/10.1371/journal.pone.0259877 Text en © 2021 Chindelevitch et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Chindelevitch, Leonid Hayati, Maryam Poon, Art F. Y. Colijn, Caroline Network science inspires novel tree shape statistics |
title | Network science inspires novel tree shape statistics |
title_full | Network science inspires novel tree shape statistics |
title_fullStr | Network science inspires novel tree shape statistics |
title_full_unstemmed | Network science inspires novel tree shape statistics |
title_short | Network science inspires novel tree shape statistics |
title_sort | network science inspires novel tree shape statistics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8699983/ https://www.ncbi.nlm.nih.gov/pubmed/34941890 http://dx.doi.org/10.1371/journal.pone.0259877 |
work_keys_str_mv | AT chindelevitchleonid networkscienceinspiresnoveltreeshapestatistics AT hayatimaryam networkscienceinspiresnoveltreeshapestatistics AT poonartfy networkscienceinspiresnoveltreeshapestatistics AT colijncaroline networkscienceinspiresnoveltreeshapestatistics |
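
The description in this record states that the network-science statistics, including diameter and average path length, can be computed in linear time. The following is a minimal sketch of how that is possible for these two quantities on a tree. It assumes an unweighted tree (every branch has length 1) stored as an adjacency dictionary, and it averages over all pairs of nodes; the article's exact definitions (for example, whether branch lengths are used) may differ, and the function names here are hypothetical rather than taken from the treeCentrality package.

```python
from collections import deque


def bfs(adj, source):
    """Breadth-first search; returns (dist, parent, visit_order)."""
    dist = {source: 0}
    parent = {source: None}
    order = [source]
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                order.append(v)
                queue.append(v)
    return dist, parent, order


def diameter(adj):
    """Longest path in a tree, via two BFS passes (linear time)."""
    start = next(iter(adj))
    dist, _, _ = bfs(adj, start)
    farthest = max(dist, key=dist.get)
    dist_from_farthest, _, _ = bfs(adj, farthest)
    return max(dist_from_farthest.values())


def average_path_length(adj):
    """Mean distance over all node pairs; assumes at least two nodes.

    Each edge lies on size * (n - size) paths, where size is the
    number of nodes on one side of the edge, so one pass suffices.
    """
    n = len(adj)
    root = next(iter(adj))
    _, parent, order = bfs(adj, root)
    size = {u: 1 for u in adj}
    total = 0
    for u in reversed(order):          # children before their parents
        p = parent[u]
        if p is not None:
            total += size[u] * (n - size[u])
            size[p] += size[u]
    return total / (n * (n - 1) / 2)


# Balanced four-leaf tree: root r, internal nodes x and y, leaves a-d.
balanced_adj = {
    "r": ["x", "y"],
    "x": ["r", "a", "b"],
    "y": ["r", "c", "d"],
    "a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"],
}

print(diameter(balanced_adj))
print(average_path_length(balanced_adj))
```

Counting how many paths cross each edge avoids enumerating all pairs, which is what keeps the average path length linear rather than quadratic in the number of nodes.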