Noncoding RNA gene detection using comparative sequence analysis

BACKGROUND: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusiv...

Bibliographic Details
Main Authors: Rivas, Elena, Eddy, Sean R
Format: Text
Language: English
Published: BioMed Central 2001
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC64605/
https://www.ncbi.nlm.nih.gov/pubmed/11801179
http://dx.doi.org/10.1186/1471-2105-2-8
_version_ 1782120135526449152
author Rivas, Elena
Eddy, Sean R
author_facet Rivas, Elena
Eddy, Sean R
author_sort Rivas, Elena
collection PubMed
description BACKGROUND: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusive. RESULTS: We describe a comparative sequence analysis algorithm for detecting novel structural RNA genes. The key idea is to test the pattern of substitutions observed in a pairwise alignment of two homologous sequences. A conserved coding region tends to show a pattern of synonymous substitutions, whereas a conserved structural RNA tends to show a pattern of compensatory mutations consistent with some base-paired secondary structure. We formalize this intuition using three probabilistic "pair-grammars": a pair stochastic context free grammar modeling alignments constrained by structural RNA evolution, a pair hidden Markov model modeling alignments constrained by coding sequence evolution, and a pair hidden Markov model modeling a null hypothesis of position-independent evolution. Given an input pairwise sequence alignment (e.g. from a BLASTN comparison of two related genomes) we classify the alignment into the coding, RNA, or null class according to the posterior probability of each class. CONCLUSIONS: We have implemented this approach as a program, QRNA, which we consider to be a prototype structural noncoding RNA genefinder. Tests suggest that this approach detects noncoding RNA genes with a fair degree of reliability.
format Text
id pubmed-64605
institution National Center for Biotechnology Information
language English
publishDate 2001
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-64605 2002-01-23 Noncoding RNA gene detection using comparative sequence analysis Rivas, Elena Eddy, Sean R BMC Bioinformatics Methodology Article BioMed Central 2001-10-10 /pmc/articles/PMC64605/ /pubmed/11801179 http://dx.doi.org/10.1186/1471-2105-2-8 Text en Copyright © 2001 Rivas and Eddy; licensee BioMed Central Ltd.
This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Methodology Article
Rivas, Elena
Eddy, Sean R
Noncoding RNA gene detection using comparative sequence analysis
title Noncoding RNA gene detection using comparative sequence analysis
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC64605/
https://www.ncbi.nlm.nih.gov/pubmed/11801179
http://dx.doi.org/10.1186/1471-2105-2-8
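
The classification step described in the abstract (assigning an alignment to the coding, RNA, or null class by posterior probability) can be illustrated with a minimal sketch. This is not QRNA's implementation: the three log-likelihoods are assumed to have been computed elsewhere by the pair-SCFG and the two pair-HMMs, the function name `classify_alignment`, the class labels, the uniform prior, and the example scores are all hypothetical.

```python
import math


def classify_alignment(log_likelihoods, priors=None):
    """Pick the class with the highest posterior probability.

    log_likelihoods maps a class label to log P(alignment | model).
    priors maps a class label to P(model); a uniform prior is assumed
    when none is given (an assumption of this sketch, not of the paper).
    """
    classes = list(log_likelihoods)
    if priors is None:
        priors = {c: 1.0 / len(classes) for c in classes}

    # Joint log score per class: log P(alignment | model) + log P(model).
    joint = {c: log_likelihoods[c] + math.log(priors[c]) for c in classes}

    # Normalise with log-sum-exp so very negative scores do not underflow.
    top = max(joint.values())
    log_norm = top + math.log(sum(math.exp(v - top) for v in joint.values()))

    posteriors = {c: math.exp(joint[c] - log_norm) for c in classes}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical log-likelihoods for one pairwise alignment.
scores = {"rna": -120.0, "coding": -125.0, "null": -130.0}
best, post = classify_alignment(scores)
print(best, {c: round(p, 4) for c, p in post.items()})
```

Working in log space keeps the comparison stable for long alignments, where the raw likelihoods would be too small to represent directly.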