Bayesian Inference of Species Trees from Multilocus Data

Until recently, it has been common practice for a phylogenetic analysis to use a single gene sequence from a single individual organism as a proxy for an entire species. With technological advances, it is now becoming more common to collect data sets containing multiple gene loci and multiple indivi...

Bibliographic Details
Main authors: Heled, Joseph; Drummond, Alexei J.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects: Research Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2822290/
https://www.ncbi.nlm.nih.gov/pubmed/19906793
http://dx.doi.org/10.1093/molbev/msp274
_version_ 1782177515473731584
author Heled, Joseph
Drummond, Alexei J.
author_facet Heled, Joseph
Drummond, Alexei J.
author_sort Heled, Joseph
collection PubMed
description Until recently, it has been common practice for a phylogenetic analysis to use a single gene sequence from a single individual organism as a proxy for an entire species. With technological advances, it is now becoming more common to collect data sets containing multiple gene loci and multiple individuals per species. These data sets often reveal the need to directly model intraspecies polymorphism and incomplete lineage sorting in phylogenetic estimation procedures. For a single species, coalescent theory is widely used in contemporary population genetics to model intraspecific gene trees. Here, we present a Bayesian Markov chain Monte Carlo method for the multispecies coalescent. Our method coestimates multiple gene trees embedded in a shared species tree along with the effective population size of both extant and ancestral species. The inference is made possible by multilocus data from multiple individuals per species. Using a multiindividual data set and a series of simulations of rapid species radiations, we demonstrate the efficacy of our new method. These simulations give some insight into the behavior of the method as a function of sampled individuals, sampled loci, and sequence length. Finally, we compare our new method to both an existing method (BEST 2.2) with similar goals and the supermatrix (concatenation) method. We demonstrate that both BEST and our method have much better estimation accuracy for species tree topology than concatenation, and our method outperforms BEST in divergence time and population size estimation.
format Text
id pubmed-2822290
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2822290 2010-02-17 Bayesian Inference of Species Trees from Multilocus Data Heled, Joseph Drummond, Alexei J. Mol Biol Evol Research Articles
Oxford University Press 2010-03 2009-11-11 /pmc/articles/PMC2822290/ /pubmed/19906793 http://dx.doi.org/10.1093/molbev/msp274 Text en © 2009 The Authors This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Articles
Heled, Joseph
Drummond, Alexei J.
Bayesian Inference of Species Trees from Multilocus Data
title Bayesian Inference of Species Trees from Multilocus Data
title_full Bayesian Inference of Species Trees from Multilocus Data
title_fullStr Bayesian Inference of Species Trees from Multilocus Data
title_full_unstemmed Bayesian Inference of Species Trees from Multilocus Data
title_short Bayesian Inference of Species Trees from Multilocus Data
title_sort bayesian inference of species trees from multilocus data
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2822290/
https://www.ncbi.nlm.nih.gov/pubmed/19906793
http://dx.doi.org/10.1093/molbev/msp274
work_keys_str_mv AT heledjoseph bayesianinferenceofspeciestreesfrommultilocusdata
AT drummondalexeij bayesianinferenceofspeciestreesfrommultilocusdata