Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines
An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our ex...
Main authors: | Xu, Xing; Ji, Yongmei; Stormo, Gary D. |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2009 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2659441/ https://www.ncbi.nlm.nih.gov/pubmed/19343219 http://dx.doi.org/10.1371/journal.pcbi.1000338 |
_version_ | 1782165677072711680 |
---|---|
author | Xu, Xing Ji, Yongmei Stormo, Gary D. |
author_facet | Xu, Xing Ji, Yongmei Stormo, Gary D. |
author_sort | Xu, Xing |
collection | PubMed |
description | An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM. |
format | Text |
id | pubmed-2659441 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2659441 2009-04-03 Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines Xu, Xing Ji, Yongmei Stormo, Gary D. PLoS Comput Biol Research Article An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM. Public Library of Science 2009-04-03 /pmc/articles/PMC2659441/ /pubmed/19343219 http://dx.doi.org/10.1371/journal.pcbi.1000338 Text en Xu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Xu, Xing Ji, Yongmei Stormo, Gary D. Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines |
title | Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines |
title_sort | discovering cis-regulatory rnas in shewanella genomes by support vector machines |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2659441/ https://www.ncbi.nlm.nih.gov/pubmed/19343219 http://dx.doi.org/10.1371/journal.pcbi.1000338 |