
PhyBin: binning trees by topology

A major goal of many evolutionary analyses is to determine the true evolutionary history of an organism. Molecular methods that rely on the phylogenetic signal generated by a handful of loci can be used to approximate the evolution of the entire organism but fall short of providing a global...


Bibliographic Details
Main authors: Newton, Ryan R., Newton, Irene L.G.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2013
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3807594/
https://www.ncbi.nlm.nih.gov/pubmed/24167782
http://dx.doi.org/10.7717/peerj.187
author Newton, Ryan R.
Newton, Irene L.G.
collection PubMed
description A major goal of many evolutionary analyses is to determine the true evolutionary history of an organism. Molecular methods that rely on the phylogenetic signal generated by a handful of loci can be used to approximate the evolution of the entire organism but fall short of providing a global, genome-wide perspective on evolutionary processes. Indeed, individual genes in a genome may have different evolutionary histories. Therefore, it is informative to analyze the number and kind of phylogenetic topologies found within an orthologous set of genes across a genome. Here we present PhyBin: a flexible program for clustering gene trees based on topological structure. PhyBin can generate bins of topologies corresponding to exactly identical trees or can use Robinson-Foulds distance matrices to generate clusters of similar trees, using a user-defined threshold. Additionally, PhyBin allows the user to adjust for potential noise in the dataset (as may be produced when comparing very closely related organisms) by pre-processing trees to collapse very short branches or nodes that do not meet a defined bootstrap threshold. As a test case, we generated individual trees based on an orthologous gene set from 10 Wolbachia species across four different supergroups (A–D) and used PhyBin to categorize the complete set of topologies produced from this dataset. Using this approach, we show that although a single topology generally dominated the analysis, confirming the separation of the supergroups, many genes supported alternative evolutionary histories. Because PhyBin’s output provides the user with lists of gene trees in each topological cluster, it can be used to explore potential reasons for discrepancies between phylogenies, including homoplasies, long-branch attraction, or horizontal gene transfer events.
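The two binning modes named in the description (exactly identical topologies, and clusters within a Robinson-Foulds distance threshold) can be illustrated with a minimal Python sketch. This is not PhyBin's implementation or interface: the nested-tuple tree encoding, the function names (`splits`, `rf_distance`, `bin_identical`, `cluster_by_rf`), and the choice of single-linkage merging are all assumptions made for illustration. Branch-length and bootstrap collapsing are not shown.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf names of a tree encoded as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the tree, treated as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor leaf, so the same split always gets the same representation.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return frozenset(found)


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in only one tree."""
    return len(splits(tree_a) ^ splits(tree_b))


def bin_identical(trees):
    """Group gene names whose trees have exactly the same split set."""
    bins = {}
    for name, tree in trees.items():
        bins.setdefault(splits(tree), []).append(name)
    return sorted(bins.values(), key=len, reverse=True)


def cluster_by_rf(trees, threshold):
    """Single-linkage clustering (an assumption): merge any two trees
    whose Robinson-Foulds distance is at most `threshold`."""
    names = list(trees)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if rf_distance(trees[a], trees[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical gene trees over five taxa, for illustration only.
genes = {
    "g1": ((("A", "B"), "C"), ("D", "E")),
    "g2": (("A", "B"), ("C", ("D", "E"))),  # same unrooted topology as g1
    "g3": ((("A", "C"), "B"), ("D", "E")),
    "g4": ((("A", "D"), "B"), ("C", "E")),
}
print(bin_identical(genes))
print(cluster_by_rf(genes, threshold=2))
```

In this toy input, g1 and g2 differ only in where the root is drawn, so they fall into the same bin; raising the threshold to 2 also pulls g3 into that cluster, while g4 stays separate. Each returned list of gene names corresponds to the per-cluster gene lists that the description says can be inspected for discordant histories.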
format Online
Article
Text
id pubmed-3807594
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-3807594 2013-10-28 PhyBin: binning trees by topology Newton, Ryan R. Newton, Irene L.G. PeerJ Bioinformatics PeerJ Inc. 
2013-10-22 /pmc/articles/PMC3807594/ /pubmed/24167782 http://dx.doi.org/10.7717/peerj.187 Text en © 2013 Newton and Newton http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title PhyBin: binning trees by topology
topic Bioinformatics