Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change

BACKGROUND: Non-coding RNAs (ncRNAs) have a multitude of roles in the cell, many of which remain to be discovered. However, it is difficult to detect novel ncRNAs in biochemical screens. To advance biological knowledge, computational methods that can accurately detect ncRNAs in sequenced genomes are...

Bibliographic Details
Main authors: Uzilov, Andrew V; Keegan, Joshua M; Mathews, David H
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570369/
https://www.ncbi.nlm.nih.gov/pubmed/16566836
http://dx.doi.org/10.1186/1471-2105-7-173
_version_ 1782130260486127616
author Uzilov, Andrew V
Keegan, Joshua M
Mathews, David H
author_facet Uzilov, Andrew V
Keegan, Joshua M
Mathews, David H
author_sort Uzilov, Andrew V
collection PubMed
description BACKGROUND: Non-coding RNAs (ncRNAs) have a multitude of roles in the cell, many of which remain to be discovered. However, it is difficult to detect novel ncRNAs in biochemical screens. To advance biological knowledge, computational methods that can accurately detect ncRNAs in sequenced genomes are therefore desirable. The increasing number of genomic sequences provides a rich dataset for computational comparative sequence analysis and detection of novel ncRNAs.
RESULTS: Here, Dynalign, a program for predicting secondary structures common to two RNA sequences on the basis of minimizing folding free energy change, is used as a computational ncRNA detection tool. The Dynalign-computed optimal total free energy change, which scores the structural alignment and the free energy change of folding into a common structure for two RNA sequences, is shown to be an effective measure for distinguishing ncRNA from randomized sequences. To classify an input sequence pair as ncRNA, its total free energy change can either be compared with the total free energy changes of a set of control sequence pairs, or be used in combination with sequence length and nucleotide frequencies as input to a classification support vector machine. The latter method is much faster, but slightly less sensitive at a given specificity. Additionally, the classification support vector machine method is shown to be sensitive and specific on genomic ncRNA screens of two different Escherichia coli and Salmonella typhi genome alignments, in which many ncRNAs are known. The Dynalign computational experiments are also compared with two other ncRNA detection programs, RNAz and QRNA.
CONCLUSION: The Dynalign-based support vector machine method is more sensitive for known ncRNAs in the test genomic screens than RNAz and QRNA. Additionally, both Dynalign-based methods are more sensitive than RNAz and QRNA at low sequence pair identities. Dynalign is comparable to or more accurate than RNAz and QRNA in genomic screens, especially for low-identity regions. Dynalign provides a method for discovering ncRNAs in sequenced genomes that other methods may not identify. Significant improvements in Dynalign runtime have also been achieved.
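The two classification routes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the total free energy changes have already been computed elsewhere (for example by Dynalign) and are passed in as plain numbers. Using a z-score for the comparison against control pairs, the cutoff of -2.0, the feature ordering, and all function names are assumptions made for illustration only.

```python
import statistics
from collections import Counter


def z_score(native_dg, control_dgs):
    """Standard score of a native total free energy change against controls.

    Assumption: the comparison with control pairs is done as a z-score;
    the abstract only says the values are "compared".
    """
    if len(control_dgs) < 2:
        raise ValueError("need at least two control values")
    mean = statistics.mean(control_dgs)
    sd = statistics.stdev(control_dgs)  # sample standard deviation
    if sd == 0:
        raise ValueError("control values have zero variance")
    return (native_dg - mean) / sd


def looks_like_ncrna(native_dg, control_dgs, z_cutoff=-2.0):
    """True when the native pair folds much more stably than its controls.

    The cutoff is a hypothetical value, not one taken from the paper.
    """
    return z_score(native_dg, control_dgs) <= z_cutoff


def nucleotide_frequencies(seq):
    """Fractions of A, C, G and U in a sequence (T is read as U)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq)
    return [counts[base] / len(seq) for base in "ACGU"]


def svm_features(total_dg, seq1, seq2):
    """Feature vector for the second route: free energy change, lengths,
    and nucleotide frequencies. The exact layout is an assumption."""
    return (
        [total_dg, len(seq1), len(seq2)]
        + nucleotide_frequencies(seq1)
        + nucleotide_frequencies(seq2)
    )
```

A vector built by svm_features could be passed to any support vector machine library. The first route needs one structure prediction per control pair, while the second needs only the native pair, which is consistent with the abstract's remark that the support vector machine method is much faster.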
format Text
id pubmed-1570369
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1570369 2006-09-26 Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change Uzilov, Andrew V Keegan, Joshua M Mathews, David H BMC Bioinformatics Research Article BioMed Central 2006-03-27 /pmc/articles/PMC1570369/ /pubmed/16566836 http://dx.doi.org/10.1186/1471-2105-7-173 Text en Copyright © 2006 Uzilov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Uzilov, Andrew V
Keegan, Joshua M
Mathews, David H
Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title_full Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title_fullStr Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title_full_unstemmed Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title_short Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change
title_sort detection of non-coding rnas on the basis of predicted secondary structure formation free energy change
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570369/
https://www.ncbi.nlm.nih.gov/pubmed/16566836
http://dx.doi.org/10.1186/1471-2105-7-173
work_keys_str_mv AT uzilovandrewv detectionofnoncodingrnasonthebasisofpredictedsecondarystructureformationfreeenergychange
AT keeganjoshuam detectionofnoncodingrnasonthebasisofpredictedsecondarystructureformationfreeenergychange
AT mathewsdavidh detectionofnoncodingrnasonthebasisofpredictedsecondarystructureformationfreeenergychange