
Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures

Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes because these methods do not...

Bibliographic Details
Main Authors: Sabarinathan, Radhakrishnan, Anthon, Christian, Gorodkin, Jan, Seemann, Stefan E.
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6315940/
https://www.ncbi.nlm.nih.gov/pubmed/30518121
http://dx.doi.org/10.3390/genes9120604
_version_ 1783384412620062720
author Sabarinathan, Radhakrishnan
Anthon, Christian
Gorodkin, Jan
Seemann, Stefan E.
collection PubMed
description Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. A precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering in the search for common motifs. Such efforts have so far focused on single sequences; here we present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, that finds the boundaries based on probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences, independently of the chosen method. The actual performance of the two methods differs between single hairpin structures and branched structures. For the RNA families with branched structures, including transfer RNA (tRNA) and small nucleolar RNAs (snoRNAs), RNAbound improves the boundary predictions using multiple sequence alignments to median differences of −6 and −11.5 nucleotides (nts) for the left and right boundary, respectively (window size of 200 nts).
format Online
Article
Text
id pubmed-6315940
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6315940 2019-01-09 Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures Sabarinathan, Radhakrishnan Anthon, Christian Gorodkin, Jan Seemann, Stefan E. Genes (Basel) Article Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. A precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering in the search for common motifs. Such efforts have so far focused on single sequences; here we present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, that finds the boundaries based on probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences, independently of the chosen method. The actual performance of the two methods differs between single hairpin structures and branched structures.
For the RNA families with branched structures, including transfer RNA (tRNA) and small nucleolar RNAs (snoRNAs), RNAbound improves the boundary predictions using multiple sequence alignments to median differences of −6 and −11.5 nucleotides (nts) for left and right boundary, respectively (window size of 200 nts). MDPI 2018-12-04 /pmc/articles/PMC6315940/ /pubmed/30518121 http://dx.doi.org/10.3390/genes9120604 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6315940/
https://www.ncbi.nlm.nih.gov/pubmed/30518121
http://dx.doi.org/10.3390/genes9120604