Assessing the State of Substitution Models Describing Noncoding RNA Evolution
Phylogenetic inference is widely used to investigate the relationships between homologous sequences. RNA molecules have played a key role in these studies because they are present throughout life and tend to evolve slowly. Phylogenetic inference has been shown to be dependent on the substitution mod...
Main authors: | Allen, James E., Whelan, Simon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2014
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914692/ https://www.ncbi.nlm.nih.gov/pubmed/24391153 http://dx.doi.org/10.1093/gbe/evt206 |
_version_ | 1782302448480681984 |
---|---|
author | Allen, James E. Whelan, Simon |
author_facet | Allen, James E. Whelan, Simon |
author_sort | Allen, James E. |
collection | PubMed |
description | Phylogenetic inference is widely used to investigate the relationships between homologous sequences. RNA molecules have played a key role in these studies because they are present throughout life and tend to evolve slowly. Phylogenetic inference has been shown to be dependent on the substitution model used. A wide range of models have been developed to describe RNA evolution, either with 16 states describing all possible canonical base pairs or with 7 states where the 10 mismatched nucleotides are reduced to a single state. Formal model selection has become a standard practice for choosing an inferential model and works well for comparing models of a specific type, such as comparisons within nucleotide models or within amino acid models. Model selection cannot function across different sized state spaces because the likelihoods are conditioned on different data. Here, we introduce statistical state-space projection methods that allow the direct comparison of likelihoods between nucleotide models and 7-state and 16-state RNA models. To demonstrate the general applicability of our new methods, we extract 287 RNA families from genomic alignments and perform model selection. We find that in 281/287 families, RNA models are selected in preference to nucleotide models, with simple 7-state RNA models selected for more conserved families with shorter stems and more complex 16-state RNA models selected for more divergent families with longer stems. Other factors, such as the function of the RNA molecule or the GC-content, have limited impact on model selection. Our models and model selection methods are freely available in the open-source PHASE 3.0 software. |
format | Online Article Text |
id | pubmed-3914692 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3914692 2014-02-06 Assessing the State of Substitution Models Describing Noncoding RNA Evolution Allen, James E. Whelan, Simon Genome Biol Evol Research Article Oxford University Press 2014-01-03 /pmc/articles/PMC3914692/ /pubmed/24391153 http://dx.doi.org/10.1093/gbe/evt206 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Allen, James E. Whelan, Simon Assessing the State of Substitution Models Describing Noncoding RNA Evolution |
title | Assessing the State of Substitution Models Describing Noncoding RNA Evolution |
title_sort | assessing the state of substitution models describing noncoding rna evolution |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914692/ https://www.ncbi.nlm.nih.gov/pubmed/24391153 http://dx.doi.org/10.1093/gbe/evt206 |
work_keys_str_mv | AT allenjamese assessingthestateofsubstitutionmodelsdescribingnoncodingrnaevolution AT whelansimon assessingthestateofsubstitutionmodelsdescribingnoncodingrnaevolution |
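
The abstract above contrasts a 16-state space (every ordered pair of the four RNA nucleotides) with a 7-state space in which the 10 mismatched pairs are reduced to a single state. The following is a minimal sketch of that reduction only, assuming the six paired states are the Watson-Crick pairs plus the G-U wobble pairs. The function names and the `MM` label are hypothetical choices for illustration; this is not the PHASE 3.0 interface and it does not implement the likelihood projection methods the record describes.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Assumption: the six "paired" states are the Watson-Crick pairs plus the
# G-U wobble pairs, each in both orientations.
CANONICAL_PAIRS = ("AU", "UA", "CG", "GC", "GU", "UG")

# Hypothetical label for the single state that absorbs all mismatches.
MISMATCH = "MM"


def all_pair_states():
    """Return the 16 ordered nucleotide pairs (the 16-state space)."""
    return ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]


def to_seven_state(pair):
    """Map one ordered pair onto the 7-state space.

    Canonical pairs keep their identity; every other pair becomes MISMATCH.
    DNA-style input is accepted by treating T as U.
    """
    pair = pair.upper().replace("T", "U")
    if len(pair) != 2 or any(base not in NUCLEOTIDES for base in pair):
        raise ValueError(f"not a nucleotide pair: {pair!r}")
    return pair if pair in CANONICAL_PAIRS else MISMATCH


if __name__ == "__main__":
    states16 = all_pair_states()
    states7 = sorted({to_seven_state(s) for s in states16})
    mismatches = [s for s in states16 if to_seven_state(s) == MISMATCH]
    print(len(states16), "pair states reduce to", len(states7), "states")
    print(len(mismatches), "mismatched pairs share the state", MISMATCH)
```

Because the two encodings group the same alignment columns into different numbers of states, their raw likelihoods are conditioned on different data, which is the comparison problem the abstract says the projection methods address.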