
Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa

BACKGROUND: Failure to account for covariation patterns in helical regions of ribosomal RNA (rRNA) genes has the potential to misdirect the estimation of the phylogenetic signal of the data. Furthermore, the extremes of length variation among taxa, combined with regional substitution rate variation...


Bibliographic Details
Main authors: Letsch, Harald O; Kjer, Karl M
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3123606/
https://www.ncbi.nlm.nih.gov/pubmed/21619604
http://dx.doi.org/10.1186/1471-2148-11-146
author Letsch, Harald O
Kjer, Karl M
author_facet Letsch, Harald O
Kjer, Karl M
author_sort Letsch, Harald O
collection PubMed
description BACKGROUND: Failure to account for covariation patterns in helical regions of ribosomal RNA (rRNA) genes has the potential to misdirect the estimation of the phylogenetic signal of the data. Furthermore, the extremes of length variation among taxa, combined with regional substitution rate variation, can mislead the alignment of rRNA sequences and thus distort subsequent tree reconstructions. However, recent developments in phylogenetic methodology now allow a comprehensive integration of secondary structures in alignment and tree reconstruction analyses based on rRNA sequences, which has been shown to correct some of these problems. Here, we explore the potentials of RNA substitution models and the interactions of specific model setups with the inherent pattern of covariation in rRNA stems and substitution rate variation among loop regions. RESULTS: We found an explicit impact of RNA substitution models on tree reconstruction analyses. The application of specific RNA models in tree reconstructions is hampered by interaction between the appropriate modelling of covarying sites in stem regions and excessive homoplasy in some loop regions. RNA models often failed to recover reasonable trees when single-stranded regions were excessively homoplastic, because these regions contribute a greater proportion of the data when covarying sites are essentially downweighted. In this context, the RNA6A model outperformed all other models, including the more parametrized RNA7 and RNA16 models. CONCLUSIONS: Our results depict a trade-off between increased accuracy in estimation of interdependencies in helical regions and the risk of magnifying positions lacking phylogenetic signal. We can therefore conclude that caution is warranted when applying rRNA covariation models, and suggest that loop regions be independently screened for phylogenetic signal and eliminated when they are indistinguishable from random noise. In addition to covariation and homoplasy, other factors, like non-stationarity of substitution rates and base compositional heterogeneity, can disrupt the signal of ribosomal RNA data. All these factors dictate sophisticated estimation of evolutionary pattern in rRNA data, just as other molecular data require similarly complicated (but different) corrections.
format Online
Article
Text
id pubmed-3123606
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3123606 2011-06-26 Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa Letsch, Harald O Kjer, Karl M BMC Evol Biol Research Article BioMed Central 2011-05-27 /pmc/articles/PMC3123606/ /pubmed/21619604 http://dx.doi.org/10.1186/1471-2148-11-146 Text en Copyright ©2011 Letsch and Kjer; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Letsch, Harald O
Kjer, Karl M
Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa
title Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3123606/
https://www.ncbi.nlm.nih.gov/pubmed/21619604
http://dx.doi.org/10.1186/1471-2148-11-146