Bayesian Selection of Nucleotide Substitution Models and Their Site Assignments

Probabilistic inference of a phylogenetic tree from molecular sequence data is predicated on a substitution model describing the relative rates of change between character states along the tree for each site in the multiple sequence alignment. Commonly, one assumes that the substitution model is hom...

Bibliographic Details
Main authors: Wu, Chieh-Hsi, Suchard, Marc A., Drummond, Alexei J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3563969/
https://www.ncbi.nlm.nih.gov/pubmed/23233462
http://dx.doi.org/10.1093/molbev/mss258
_version_ 1782258251895668736
author Wu, Chieh-Hsi
Suchard, Marc A.
Drummond, Alexei J.
collection PubMed
description Probabilistic inference of a phylogenetic tree from molecular sequence data is predicated on a substitution model describing the relative rates of change between character states along the tree for each site in the multiple sequence alignment. Commonly, one assumes that the substitution model is homogeneous across sites within large partitions of the alignment, assigns these partitions a priori, and then fixes their underlying substitution model to the best-fitting model from a hierarchy of named models. Here, we introduce an automatic model selection and model averaging approach within a Bayesian framework that simultaneously estimates the number of partitions, the assignment of sites to partitions, the substitution model for each partition, and the uncertainty in these selections. This new approach is implemented as an add-on to the BEAST 2 software platform. We find that this approach dramatically improves the fit of the nucleotide substitution model compared with existing approaches, and we show, using a number of example data sets, that as many as nine partitions are required to explain the heterogeneity in nucleotide substitution process across sites in a single gene analysis. In some instances, this improved modeling of the substitution process can have a measurable effect on downstream inference, including the estimated phylogeny, relative divergence times, and effective population size histories.
format Online
Article
Text
id pubmed-3563969
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3563969 2013-02-05 Bayesian Selection of Nucleotide Substitution Models and Their Site Assignments Wu, Chieh-Hsi Suchard, Marc A. Drummond, Alexei J. Mol Biol Evol Methods Oxford University Press 2013-03 2012-12-11 /pmc/articles/PMC3563969/ /pubmed/23233462 http://dx.doi.org/10.1093/molbev/mss258 Text en © The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bayesian Selection of Nucleotide Substitution Models and Their Site Assignments
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3563969/
https://www.ncbi.nlm.nih.gov/pubmed/23233462
http://dx.doi.org/10.1093/molbev/mss258