Assessing the State of Substitution Models Describing Noncoding RNA Evolution

Phylogenetic inference is widely used to investigate the relationships between homologous sequences. RNA molecules have played a key role in these studies because they are present throughout life and tend to evolve slowly. Phylogenetic inference has been shown to be dependent on the substitution mod...

Bibliographic Details
Main Authors: Allen, James E., Whelan, Simon
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914692/
https://www.ncbi.nlm.nih.gov/pubmed/24391153
http://dx.doi.org/10.1093/gbe/evt206
author Allen, James E.
Whelan, Simon
collection PubMed
description Phylogenetic inference is widely used to investigate the relationships between homologous sequences. RNA molecules have played a key role in these studies because they are present throughout life and tend to evolve slowly. Phylogenetic inference has been shown to be dependent on the substitution model used. A wide range of models have been developed to describe RNA evolution, either with 16 states describing all possible canonical base pairs or with 7 states where the 10 mismatched nucleotides are reduced to a single state. Formal model selection has become a standard practice for choosing an inferential model and works well for comparing models of a specific type, such as comparisons within nucleotide models or within amino acid models. Model selection cannot function across different sized state spaces because the likelihoods are conditioned on different data. Here, we introduce statistical state-space projection methods that allow the direct comparison of likelihoods between nucleotide models and 7-state and 16-state RNA models. To demonstrate the general applicability of our new methods, we extract 287 RNA families from genomic alignments and perform model selection. We find that in 281/287 families, RNA models are selected in preference to nucleotide models, with simple 7-state RNA models selected for more conserved families with shorter stems and more complex 16-state RNA models selected for more divergent families with longer stems. Other factors, such as the function of the RNA molecule or the GC-content, have limited impact on model selection. Our models and model selection methods are freely available in the open-source PHASE 3.0 software.
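As a reading aid for the description above, the two RNA state spaces it mentions can be sketched in Python. This is only an illustration, not code from PHASE 3.0: the names `sixteen_states`, `to_seven_state`, and the `MM` label are hypothetical, and the sketch assumes the six paired states are the Watson-Crick pairs plus the G-U wobble pairs, which this record does not spell out.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Assumption (not stated in the record): the six states a 7-state model keeps
# distinct are the Watson-Crick pairs and the G-U wobble pairs.
PAIRED = {"AU", "UA", "GC", "CG", "GU", "UG"}


def sixteen_states():
    """Return all 16 ordered nucleotide pairs (the 16-state space)."""
    return ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]


def to_seven_state(pair):
    """Map an ordered pair to the 7-state space: a paired state or 'MM'."""
    pair = pair.upper().replace("T", "U")  # accept DNA-style input as well
    if len(pair) != 2 or any(n not in NUCLEOTIDES for n in pair):
        raise ValueError(f"not a nucleotide pair: {pair!r}")
    return pair if pair in PAIRED else "MM"


states16 = sixteen_states()
states7 = sorted({to_seven_state(s) for s in states16})
mismatches = [s for s in states16 if to_seven_state(s) == "MM"]

# 16 pair states, 7 collapsed states, 10 mismatch pairs merged into 'MM'
print(len(states16), len(states7), len(mismatches))
```

The counts printed match the description: 16 pair states, of which 10 mismatches collapse into a single state to give 7. The statistical projection that makes likelihoods comparable across these state spaces is the paper's contribution and is not reproduced here.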
format Online
Article
Text
id pubmed-3914692
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3914692 2014-02-06 Assessing the State of Substitution Models Describing Noncoding RNA Evolution Allen, James E. Whelan, Simon Genome Biol Evol Research Article Oxford University Press 2014-01-03 /pmc/articles/PMC3914692/ /pubmed/24391153 http://dx.doi.org/10.1093/gbe/evt206 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Assessing the State of Substitution Models Describing Noncoding RNA Evolution
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914692/
https://www.ncbi.nlm.nih.gov/pubmed/24391153
http://dx.doi.org/10.1093/gbe/evt206