Accurate Reconstruction of Insertion-Deletion Histories by Statistical Phylogenetics

The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for fi...
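
For orientation, the classical finite-alphabet computation that the abstract says is being generalized can be sketched as follows. This is a minimal illustration of Felsenstein's pruning algorithm for a single alignment column, not the indel-aware algorithm of the paper. Several things here are assumptions made for the sake of a short example: the Jukes-Cantor DNA model stands in for Dayhoff-style probability matrices, and the tree encoding, the function names (`jc69_matrix`, `partial_likelihoods`, `column_likelihood`), and the species names and branch lengths are all hypothetical.

```python
import math

ALPHABET = "ACGT"


def jc69_matrix(t):
    """Jukes-Cantor substitution probabilities for branch length t
    (expected substitutions per site). Assumed model, chosen for brevity."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partial_likelihoods(node, column):
    """Return L[x] = P(observed leaves below node | node is in state x).

    A node is either a leaf name (str) or a tuple of (child, branch_length)
    pairs. `column` maps each leaf name to its observed character.
    """
    if isinstance(node, str):
        observed = column[node]
        return [1.0 if ALPHABET[x] == observed else 0.0 for x in range(4)]
    result = [1.0] * 4
    for child, t in node:
        p = jc69_matrix(t)
        child_l = partial_likelihoods(child, column)
        for x in range(4):
            # Sum over the child's state, weighted by the substitution
            # probability along the connecting branch.
            result[x] *= sum(p[x][y] * child_l[y] for y in range(4))
    return result


def column_likelihood(tree, column):
    """Likelihood of one alignment column under a uniform root distribution."""
    root = partial_likelihoods(tree, column)
    return sum(0.25 * lx for lx in root)


# Hypothetical three-species tree: ((human, chimp), mouse).
tree = (
    ((("human", 0.1), ("chimp", 0.1)), 0.2),
    ("mouse", 0.5),
)
likelihood = column_likelihood(tree, {"human": "A", "chimp": "A", "mouse": "G"})
```

The recursion visits each node once and does a fixed amount of work per branch, so the cost is linear in the number of leaves for a fixed alphabet. It treats the column as given, which is exactly the restriction that an indel-aware method has to lift.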

Bibliographic Details
Main Authors: Westesson, Oscar; Lunter, Gerton; Paten, Benedict; Holmes, Ian
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3335033/
https://www.ncbi.nlm.nih.gov/pubmed/22536326
http://dx.doi.org/10.1371/journal.pone.0034572
author Westesson, Oscar
Lunter, Gerton
Paten, Benedict
Holmes, Ian
collection PubMed
description The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.
format Online
Article
Text
id pubmed-3335033
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3335033 2012-04-25 Accurate Reconstruction of Insertion-Deletion Histories by Statistical Phylogenetics Westesson, Oscar; Lunter, Gerton; Paten, Benedict; Holmes, Ian PLoS One Research Article Public Library of Science 2012-04-20 /pmc/articles/PMC3335033/ /pubmed/22536326 http://dx.doi.org/10.1371/journal.pone.0034572 Text en Westesson et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Accurate Reconstruction of Insertion-Deletion Histories by Statistical Phylogenetics
topic Research Article