
Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines


Bibliographic Details
Main Authors: Xu, Xing; Ji, Yongmei; Stormo, Gary D.
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2659441/
https://www.ncbi.nlm.nih.gov/pubmed/19343219
http://dx.doi.org/10.1371/journal.pcbi.1000338
author Xu, Xing
Ji, Yongmei
Stormo, Gary D.
collection PubMed
description An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. 
From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM.
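The abstract's comparison ("higher sensitivity ... while maintaining the same false positive rate") can be illustrated with a minimal sketch. This is not RSSVM code and does not use any data from the paper: the function names and the toy scores are hypothetical, chosen only to show how sensitivity is compared once each classifier's threshold is fixed by the same false positive rate on random (negative) motifs.

```python
def threshold_at_fpr(negative_scores, max_fpr):
    """Return a score threshold such that the fraction of negatives
    scoring strictly above it is at most max_fpr."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # negatives permitted above the threshold
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


def sensitivity_at_fpr(positive_scores, negative_scores, max_fpr):
    """Fraction of true motifs scoring above the threshold fixed by max_fpr."""
    threshold = threshold_at_fpr(negative_scores, max_fpr)
    hits = sum(1 for score in positive_scores if score > threshold)
    return hits / len(positive_scores)


# Hypothetical toy scores (NOT results from the paper).
random_motif_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
known_motif_scores_a = [0.9, 0.75, 0.65, 0.5]  # hypothetical classifier A
known_motif_scores_b = [0.9, 0.55, 0.5, 0.3]   # hypothetical classifier B

sens_a = sensitivity_at_fpr(known_motif_scores_a, random_motif_scores, 0.25)
sens_b = sensitivity_at_fpr(known_motif_scores_b, random_motif_scores, 0.25)
print(f"classifier A sensitivity at FPR 0.25: {sens_a:.2f}")
print(f"classifier B sensitivity at FPR 0.25: {sens_b:.2f}")
```

Fixing the false positive rate first and then comparing sensitivity keeps the comparison fair: a classifier cannot look more sensitive simply by accepting more random motifs.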
format Text
id pubmed-2659441
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2659441 2009-04-03 Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines Xu, Xing Ji, Yongmei Stormo, Gary D. PLoS Comput Biol Research Article Public Library of Science 2009-04-03 /pmc/articles/PMC2659441/ /pubmed/19343219 http://dx.doi.org/10.1371/journal.pcbi.1000338 Text en Xu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2659441/
https://www.ncbi.nlm.nih.gov/pubmed/19343219
http://dx.doi.org/10.1371/journal.pcbi.1000338