The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales
Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often model...
Main authors: | Jia, Fangzhi; Lo, Nathan; Ho, Simon Y. W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4010409/ https://www.ncbi.nlm.nih.gov/pubmed/24798481 http://dx.doi.org/10.1371/journal.pone.0095722 |
_version_ | 1782479847495303168 |
---|---|
author | Jia, Fangzhi Lo, Nathan Ho, Simon Y. W. |
author_facet | Jia, Fangzhi Lo, Nathan Ho, Simon Y. W. |
author_sort | Jia, Fangzhi |
collection | PubMed |
description | Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while substitution rates at the remaining sites are gamma-distributed. For data sampled at the intraspecific level, however, biological assumptions involved in the invariable-sites model are commonly violated. We examined the use of these models in analyses of five intraspecific data sets. We show that using 6–10 rate categories for the discrete gamma distribution of rates among sites is sufficient to provide a good approximation of the marginal likelihood. Increasing the number of gamma rate categories did not have a substantial effect on estimates of the substitution rate or coalescence time, unless rates varied strongly among sites in a non-gamma-distributed manner. The assumption of a proportion of invariable sites provided a better approximation of the asymptotic marginal likelihood when the number of gamma categories was small, but had minimal impact on estimates of rates and coalescence times. However, the estimated proportion of invariable sites was highly susceptible to changes in the number of gamma rate categories. The concurrent use of gamma and invariable-site models for intraspecific data is not biologically meaningful and has been challenged on statistical grounds; here we have found that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates and timescales from intraspecific data. |
format | Online Article Text |
id | pubmed-4010409 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-40104092014-05-09 The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales Jia, Fangzhi Lo, Nathan Ho, Simon Y. W. PLoS One Research Article Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while substitution rates at the remaining sites are gamma-distributed. For data sampled at the intraspecific level, however, biological assumptions involved in the invariable-sites model are commonly violated. We examined the use of these models in analyses of five intraspecific data sets. We show that using 6–10 rate categories for the discrete gamma distribution of rates among sites is sufficient to provide a good approximation of the marginal likelihood. Increasing the number of gamma rate categories did not have a substantial effect on estimates of the substitution rate or coalescence time, unless rates varied strongly among sites in a non-gamma-distributed manner. The assumption of a proportion of invariable sites provided a better approximation of the asymptotic marginal likelihood when the number of gamma categories was small, but had minimal impact on estimates of rates and coalescence times. However, the estimated proportion of invariable sites was highly susceptible to changes in the number of gamma rate categories. The concurrent use of gamma and invariable-site models for intraspecific data is not biologically meaningful and has been challenged on statistical grounds; here we have found that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates and timescales from intraspecific data. Public Library of Science 2014-05-05 /pmc/articles/PMC4010409/ /pubmed/24798481 http://dx.doi.org/10.1371/journal.pone.0095722 Text en © 2014 Jia et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Jia, Fangzhi Lo, Nathan Ho, Simon Y. W. The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title | The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title_full | The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title_fullStr | The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title_full_unstemmed | The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title_short | The Impact of Modelling Rate Heterogeneity among Sites on Phylogenetic Estimates of Intraspecific Evolutionary Rates and Timescales |
title_sort | impact of modelling rate heterogeneity among sites on phylogenetic estimates of intraspecific evolutionary rates and timescales |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4010409/ https://www.ncbi.nlm.nih.gov/pubmed/24798481 http://dx.doi.org/10.1371/journal.pone.0095722 |
work_keys_str_mv | AT jiafangzhi theimpactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales AT lonathan theimpactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales AT hosimonyw theimpactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales AT jiafangzhi impactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales AT lonathan impactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales AT hosimonyw impactofmodellingrateheterogeneityamongsitesonphylogeneticestimatesofintraspecificevolutionaryratesandtimescales |
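
The description above refers to a discrete gamma distribution of rates among sites and to the gamma-invariable mixture model. The following is only an illustrative sketch of how such rate categories are commonly constructed, not the authors' code or the implementation of any particular phylogenetics package. It assumes equal-probability categories, each represented by its mean rate, and a gamma distribution with shape `alpha` and mean 1. The function names (`reg_lower_gamma`, `gamma_quantile`, `discrete_gamma_rates`, `site_rate_mixture`) are hypothetical, and rescaling the variable-site rates by 1/(1 - p_inv) so that the overall mean rate stays at 1 is an assumed convention.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    # Sum x^n / (a (a+1) ... (a+n)) until the terms become negligible.
    while term > total * 1e-16 and n < 100000:
        term *= x / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(-x + a * math.log(x) - math.lgamma(a)))


def gamma_quantile(a, p):
    """Quantile of Gamma(shape=a, scale=1), found by bisection on P(a, x)."""
    if p <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(a, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean rate of each of k equal-probability categories (overall mean 1)."""
    # Category boundaries on the Gamma(alpha, 1) scale; the last is infinity.
    cuts = [gamma_quantile(alpha, i / k) for i in range(k)]
    # E[Y; Y < y] = alpha * P(alpha + 1, y) for Y ~ Gamma(alpha, 1); after
    # rescaling to mean 1, each category mean is k times a difference of
    # P(alpha + 1, .) values at consecutive boundaries.
    cdf = [reg_lower_gamma(alpha + 1.0, y) for y in cuts] + [1.0]
    return [k * (cdf[i + 1] - cdf[i]) for i in range(k)]


def site_rate_mixture(alpha, k, p_inv):
    """(weight, rate) pairs for a gamma-invariable mixture with mean rate 1."""
    # Assumed convention: variable-site rates are scaled by 1 / (1 - p_inv).
    scale = 1.0 / (1.0 - p_inv)
    pairs = [(p_inv, 0.0)]
    for r in discrete_gamma_rates(alpha, k):
        pairs.append(((1.0 - p_inv) / k, r * scale))
    return pairs


if __name__ == "__main__":
    for k in (4, 6, 10):
        print(k, [round(r, 4) for r in discrete_gamma_rates(0.5, k)])
    print(site_rate_mixture(0.5, 4, 0.2))
```

In a likelihood calculation under this kind of model, the per-site likelihood would be the weighted average of the likelihoods computed at each category rate. Raising `k` refines the approximation to the continuous gamma distribution, which is the quantity varied in the study described above.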