Weighted bootstrapping: a correction method for assessing the robustness of phylogenetic trees

BACKGROUND: Non-parametric bootstrapping is a widely-used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-pa...

Bibliographic Details
Main authors: Makarenkov, Vladimir, Boc, Alix, Xie, Jingxin, Peres-Neto, Pedro, Lapointe, François-Joseph, Legendre, Pierre
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2939571/
https://www.ncbi.nlm.nih.gov/pubmed/20716358
http://dx.doi.org/10.1186/1471-2148-10-250
author Makarenkov, Vladimir
Boc, Alix
Xie, Jingxin
Peres-Neto, Pedro
Lapointe, François-Joseph
Legendre, Pierre
collection PubMed
description BACKGROUND: Non-parametric bootstrapping is a widely used statistical procedure for assessing confidence in model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-parametric bootstrapping does not weight the trees inferred from resampled (i.e., pseudo-replicated) sequences. Hence, the quality of these trees is not taken into account when computing the bootstrap scores associated with the clades of the original phylogeny. As a consequence, trees with different bootstrap support, or trees providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the least-squares (LS), maximum-likelihood (ML) or parsimony score), contribute equally to the bootstrap support of the original phylogeny.
RESULTS: In this article, we discuss the idea of applying weighted bootstrapping to phylogenetic reconstruction by weighting each phylogeny inferred from resampled sequences. Tree weights can be based either on the LS tree estimate or on the average secondary bootstrap score (SBS) associated with each resampled tree. Secondary bootstrapping consists of estimating the bootstrap scores of the trees inferred from resampled data. The LS-based and SBS-based bootstrapping procedures were designed to take into account the quality of each "pseudo-replicated" phylogeny in the final tree estimation. A simulation study was carried out to evaluate the performance of five weighting strategies: LS-based and SBS-based bootstrapping, LS-based and SBS-based bootstrapping with data normalization, and traditional unweighted bootstrapping.
CONCLUSIONS: The simulations conducted with two real data sets and the five weighting strategies suggest that SBS-based bootstrapping with data normalization usually yields larger bootstrap scores and higher robustness than the four other strategies, including traditional bootstrapping. The high robustness of the normalized SBS could be particularly useful in situations where the observed sequences have been affected by noise or have undergone massive insertion or deletion events. The results provided by the four other strategies were very similar regardless of the noise level, which also demonstrates the stability of the traditional bootstrapping method.
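The general idea of weighting replicate trees when scoring a clade can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not give the LS-based or SBS-based weight formulas or the normalization step, so the weights below are arbitrary placeholders, and the function name `weighted_support` and the representation of a tree as a set of clades are assumptions made only for illustration. With all weights equal, the score reduces to the traditional bootstrap proportion.

```python
from typing import FrozenSet, Iterable, Sequence

# A clade is modelled as the set of taxon names it contains (an assumption).
Clade = FrozenSet[str]


def weighted_support(
    clade: Clade,
    replicate_trees: Sequence[Iterable[Clade]],
    weights: Sequence[float],
) -> float:
    """Return the weighted fraction of replicate trees that contain `clade`."""
    if len(replicate_trees) != len(weights):
        raise ValueError("need exactly one weight per replicate tree")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Add up the weights of the replicate trees in which the clade appears.
    hit = sum(w for tree, w in zip(replicate_trees, weights) if clade in set(tree))
    return hit / total


# Three hypothetical replicate trees, each given as its set of clades.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
replicates = [{ab, cd}, {ab}, {ac}]

# Equal weights: the traditional, unweighted bootstrap proportion (2 of 3 trees).
unweighted = weighted_support(ab, replicates, [1.0, 1.0, 1.0])

# Placeholder quality weights: the tree lacking the clade has a low weight,
# so it counts less against the clade.
weighted = weighted_support(ab, replicates, [0.9, 0.8, 0.3])
```

In this toy case the clade {A, B} is supported by two of three replicates, and down-weighting the third, lower-quality replicate raises its score. How the weights should actually be derived from LS fit or secondary bootstrap scores is described in the article itself, not in this record.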
format Text
id pubmed-2939571
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2939571 2010-09-16 Weighted bootstrapping: a correction method for assessing the robustness of phylogenetic trees Makarenkov, Vladimir Boc, Alix Xie, Jingxin Peres-Neto, Pedro Lapointe, François-Joseph Legendre, Pierre BMC Evol Biol Methodology Article BioMed Central 2010-08-17 /pmc/articles/PMC2939571/ /pubmed/20716358 http://dx.doi.org/10.1186/1471-2148-10-250 Text en Copyright ©2010 Makarenkov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Weighted bootstrapping: a correction method for assessing the robustness of phylogenetic trees
topic Methodology Article