
De novo computational prediction of non-coding RNA genes in prokaryotic genomes

Motivation: The computational identification of non-coding RNA (ncRNA) genes represents one of the most important and challenging problems in computational biology. Existing methods for ncRNA gene prediction rely mostly on homology information, thus limiting their applications to ncRNA genes with kn...


Bibliographic Details
Main authors: Tran, Thao T., Zhou, Fengfeng, Marshburn, Sarah, Stead, Mark, Kushner, Sidney R., Xu, Ying
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2773258/
https://www.ncbi.nlm.nih.gov/pubmed/19744996
http://dx.doi.org/10.1093/bioinformatics/btp537
author Tran, Thao T.
Zhou, Fengfeng
Marshburn, Sarah
Stead, Mark
Kushner, Sidney R.
Xu, Ying
collection PubMed
description Motivation: The computational identification of non-coding RNA (ncRNA) genes represents one of the most important and challenging problems in computational biology. Existing methods for ncRNA gene prediction rely mostly on homology information, thus limiting their applications to ncRNA genes with known homologues. Results: We present a novel de novo prediction algorithm for ncRNA genes using features derived from the sequences and structures of known ncRNA genes in comparison to decoys. Using these features, we have trained a neural network-based classifier and have applied it to Escherichia coli and Sulfolobus solfataricus for genome-wide prediction of ncRNAs. Our method has an average prediction sensitivity and specificity of 68% and 70%, respectively, for identifying windows with potential for ncRNA genes in E.coli. By combining windows of different sizes and using positional filtering strategies, we predicted 601 candidate ncRNAs and recovered 41% of known ncRNAs in E.coli. We experimentally investigated six novel candidates using Northern blot analysis and found expression of three candidates: one represents a potential new ncRNA, one is associated with stable mRNA decay intermediates and one is a case of either a potential riboswitch or transcription attenuator involved in the regulation of cell division. In general, our approach enables the identification of both cis- and trans-acting ncRNAs in partially or completely sequenced microbial genomes without requiring homology or structural conservation. Availability: The source code and results are available at http://csbl.bmb.uga.edu/publications/materials/tran/. Contact: xyn@bmb.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online.
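The abstract mentions scanning the genome in windows of different sizes, combining positive windows into candidates, and reporting sensitivity and specificity. The sketch below illustrates those three ideas in general terms only. It is not the authors' implementation: the function names, the half-open coordinate convention, the step size, and the merge rule are all assumptions made for illustration, and the standard definitions of sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP)) are assumed, which may differ from the definitions used in the paper.

```python
from typing import Iterable, List, Sequence, Tuple

Window = Tuple[int, int]  # half-open interval [start, end), an assumed convention


def sliding_windows(genome_length: int, size: int, step: int) -> List[Window]:
    """Return all windows of a fixed size that fit entirely inside the genome."""
    return [(s, s + size) for s in range(0, genome_length - size + 1, step)]


def merge_windows(windows: Iterable[Window]) -> List[Window]:
    """Merge overlapping or touching windows into candidate regions.

    Hypothetical combination rule; the paper's positional filtering is not
    described in the abstract and is not reproduced here.
    """
    merged: List[Window] = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Extend the previous region instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def sensitivity_specificity(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from per-window boolean labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

In such a setup, windows from several sizes would be classified independently and the positive ones passed together to `merge_windows`, so that a region flagged at more than one scale is reported once.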
format Text
id pubmed-2773258
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2773258 2009-11-05 Bioinformatics Original Papers
Oxford University Press 2009-11-15 2009-09-10 /pmc/articles/PMC2773258/ /pubmed/19744996 http://dx.doi.org/10.1093/bioinformatics/btp537 Text en http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title De novo computational prediction of non-coding RNA genes in prokaryotic genomes
topic Original Papers