
Bootstrapping phylogenies inferred from rearrangement data

BACKGROUND: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have...


Bibliographic Details
Main authors: Lin, Yu, Rajan, Vaibhav, Moret, Bernard ME
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3487984/
https://www.ncbi.nlm.nih.gov/pubmed/22931958
http://dx.doi.org/10.1186/1748-7188-7-21
_version_ 1782248559410675712
author Lin, Yu
Rajan, Vaibhav
Moret, Bernard ME
collection PubMed
description BACKGROUND: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well-established probabilistic models.
RESULTS: We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches.
CONCLUSIONS: Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver-operating characteristics are nearly identical, indicating that it provides similar levels of sensitivity and specificity. Thus our assessment method makes it possible to conduct phylogenetic analyses on whole genomes with the same degree of confidence as for analyses on aligned sequences. Extensions to search-based inference methods such as maximum parsimony and maximum likelihood are possible, but remain to be thoroughly tested.
format Online
Article
Text
id pubmed-3487984
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3487984 2012-11-08 Bootstrapping phylogenies inferred from rearrangement data Lin, Yu Rajan, Vaibhav Moret, Bernard ME Algorithms Mol Biol Research BioMed Central 2012-08-29 /pmc/articles/PMC3487984/ /pubmed/22931958 http://dx.doi.org/10.1186/1748-7188-7-21 Text en Copyright ©2012 Lin et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bootstrapping phylogenies inferred from rearrangement data
topic Research