
Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis

Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. However, it may be unclear to the research community that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in bo...


Bibliographic Details
Main authors: Teixeira, Andreia Sofia; Monteiro, Pedro T.; Carriço, João A; Ramirez, Mário; Francisco, Alexandre P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4370493/
https://www.ncbi.nlm.nih.gov/pubmed/25799056
http://dx.doi.org/10.1371/journal.pone.0119315
_version_ 1782362882187460608
author Teixeira, Andreia Sofia
Monteiro, Pedro T.
Carriço, João A
Ramirez, Mário
Francisco, Alexandre P.
author_facet Teixeira, Andreia Sofia
Monteiro, Pedro T.
Carriço, João A
Ramirez, Mário
Francisco, Alexandre P.
author_sort Teixeira, Andreia Sofia
collection PubMed
description Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. However, it may be unclear to the research community that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff’s matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing (MLST) data and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results, since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models.
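As an illustration of the metric described in the abstract, the following is a minimal sketch of how the fraction of MSTs containing each edge could be computed exactly with Kirchhoff's matrix tree theorem. It is not the PHYLOViZ module's code. The function names (`det`, `tree_count`, `find`, `mst_edge_support`) and the input layout (a list of `(u, v, weight)` tuples over nodes `0..n-1`, forming a connected graph) are assumptions made for this example. The sketch processes edges one weight level at a time, contracts the components already joined by lighter edges, and applies the theorem to each connected group of equal-weight edges.

```python
from fractions import Fraction
from itertools import groupby


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def tree_count(k, pairs):
    """Kirchhoff: spanning trees of a multigraph on nodes 0..k-1 (0 if disconnected)."""
    lap = [[0] * k for _ in range(k)]
    for i, j in pairs:
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    # Any cofactor of the Laplacian works; drop the last row and column.
    return det([row[:-1] for row in lap[:-1]])


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_edge_support(n, edges):
    """Return (number of MSTs, {edge index: fraction of MSTs containing that edge})."""
    parent = list(range(n))
    total = Fraction(1)
    support = {}
    indexed = sorted(enumerate(edges), key=lambda item: item[1][2])
    for _, group in groupby(indexed, key=lambda item: item[1][2]):
        # Map each equal-weight edge onto the components built from lighter edges.
        level = [(idx, find(parent, u), find(parent, v)) for idx, (u, v, _) in group]
        merged = parent[:]
        for _, a, b in level:
            merged[find(merged, a)] = find(merged, b)
        comps = {}
        for idx, a, b in level:
            if a == b:
                support[idx] = Fraction(0)  # endpoints already joined by lighter edges
            else:
                comps.setdefault(find(merged, a), []).append((idx, a, b))
        for comp in comps.values():
            nodes = sorted({x for _, a, b in comp for x in (a, b)})
            pos = {x: i for i, x in enumerate(nodes)}
            pairs = [(pos[a], pos[b]) for _, a, b in comp]
            count = tree_count(len(nodes), pairs)
            total *= count
            for k, (idx, _, _) in enumerate(comp):
                # Trees that avoid this edge; the rest must contain it.
                without = tree_count(len(nodes), pairs[:k] + pairs[k + 1:])
                support[idx] = 1 - without / count
        parent = merged
    return int(total), support
```

For example, `mst_edge_support(3, [(0, 1, 1), (1, 2, 1), (0, 2, 1)])` treats a triangle with equal weights, where every spanning tree is an MST. Exact rational arithmetic is used here so that the fractions are not affected by floating-point rounding; for large data sets a numerical determinant would likely be needed instead.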
format Online
Article
Text
id pubmed-4370493
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-43704932015-04-04 Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis Teixeira, Andreia Sofia Monteiro, Pedro T. Carriço, João A Ramirez, Mário Francisco, Alexandre P. PLoS One Research Article Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. However, it may be unclear to the research community that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff’s matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing (MLST) data and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results, since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models.
Public Library of Science 2015-03-23 /pmc/articles/PMC4370493/ /pubmed/25799056 http://dx.doi.org/10.1371/journal.pone.0119315 Text en © 2015 Teixeira et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Teixeira, Andreia Sofia
Monteiro, Pedro T.
Carriço, João A
Ramirez, Mário
Francisco, Alexandre P.
Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title_full Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title_fullStr Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title_full_unstemmed Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title_short Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis
title_sort not seeing the forest for the trees: size of the minimum spanning trees (msts) forest and branch significance in mst-based phylogenetic analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4370493/
https://www.ncbi.nlm.nih.gov/pubmed/25799056
http://dx.doi.org/10.1371/journal.pone.0119315
work_keys_str_mv AT teixeiraandreiasofia notseeingtheforestforthetreessizeoftheminimumspanningtreesmstsforestandbranchsignificanceinmstbasedphylogeneticanalysis
AT monteiropedrot notseeingtheforestforthetreessizeoftheminimumspanningtreesmstsforestandbranchsignificanceinmstbasedphylogeneticanalysis
AT carricojoaoa notseeingtheforestforthetreessizeoftheminimumspanningtreesmstsforestandbranchsignificanceinmstbasedphylogeneticanalysis
AT ramirezmario notseeingtheforestforthetreessizeoftheminimumspanningtreesmstsforestandbranchsignificanceinmstbasedphylogeneticanalysis
AT franciscoalexandrep notseeingtheforestforthetreessizeoftheminimumspanningtreesmstsforestandbranchsignificanceinmstbasedphylogeneticanalysis