Noncoding RNA gene detection using comparative sequence analysis
BACKGROUND: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusiv...
Main Authors: | Rivas, Elena; Eddy, Sean R |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2001 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC64605/ https://www.ncbi.nlm.nih.gov/pubmed/11801179 http://dx.doi.org/10.1186/1471-2105-2-8 |
_version_ | 1782120135526449152 |
---|---|
author | Rivas, Elena Eddy, Sean R |
author_facet | Rivas, Elena Eddy, Sean R |
author_sort | Rivas, Elena |
collection | PubMed |
description | BACKGROUND: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusive. RESULTS: We describe a comparative sequence analysis algorithm for detecting novel structural RNA genes. The key idea is to test the pattern of substitutions observed in a pairwise alignment of two homologous sequences. A conserved coding region tends to show a pattern of synonymous substitutions, whereas a conserved structural RNA tends to show a pattern of compensatory mutations consistent with some base-paired secondary structure. We formalize this intuition using three probabilistic "pair-grammars": a pair stochastic context free grammar modeling alignments constrained by structural RNA evolution, a pair hidden Markov model modeling alignments constrained by coding sequence evolution, and a pair hidden Markov model modeling a null hypothesis of position-independent evolution. Given an input pairwise sequence alignment (e.g. from a BLASTN comparison of two related genomes) we classify the alignment into the coding, RNA, or null class according to the posterior probability of each class. CONCLUSIONS: We have implemented this approach as a program, QRNA, which we consider to be a prototype structural noncoding RNA genefinder. Tests suggest that this approach detects noncoding RNA genes with a fair degree of reliability. |
format | Text |
id | pubmed-64605 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2001 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-64605 2002-01-23 Noncoding RNA gene detection using comparative sequence analysis Rivas, Elena Eddy, Sean R BMC Bioinformatics Methodology Article BACKGROUND: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusive. RESULTS: We describe a comparative sequence analysis algorithm for detecting novel structural RNA genes. The key idea is to test the pattern of substitutions observed in a pairwise alignment of two homologous sequences. A conserved coding region tends to show a pattern of synonymous substitutions, whereas a conserved structural RNA tends to show a pattern of compensatory mutations consistent with some base-paired secondary structure. We formalize this intuition using three probabilistic "pair-grammars": a pair stochastic context free grammar modeling alignments constrained by structural RNA evolution, a pair hidden Markov model modeling alignments constrained by coding sequence evolution, and a pair hidden Markov model modeling a null hypothesis of position-independent evolution. Given an input pairwise sequence alignment (e.g. from a BLASTN comparison of two related genomes) we classify the alignment into the coding, RNA, or null class according to the posterior probability of each class. CONCLUSIONS: We have implemented this approach as a program, QRNA, which we consider to be a prototype structural noncoding RNA genefinder. Tests suggest that this approach detects noncoding RNA genes with a fair degree of reliability. BioMed Central 2001-10-10 /pmc/articles/PMC64605/ /pubmed/11801179 http://dx.doi.org/10.1186/1471-2105-2-8 Text en Copyright © 2001 Rivas and Eddy; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. |
spellingShingle | Methodology Article Rivas, Elena Eddy, Sean R Noncoding RNA gene detection using comparative sequence analysis |
title | Noncoding RNA gene detection using comparative sequence analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC64605/ https://www.ncbi.nlm.nih.gov/pubmed/11801179 http://dx.doi.org/10.1186/1471-2105-2-8 |
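
The classification step summarized in the description (assigning an alignment to the coding, RNA, or null class according to the posterior probability of each class) can be illustrated with a minimal sketch. This is not QRNA's code: the function names, the class labels, the uniform prior, and the example log-likelihood values are all assumptions made for illustration, and the three log-likelihoods are taken as given rather than computed by the pair SCFG and pair HMMs the abstract describes.

```python
import math


def class_posteriors(log_likelihoods, priors=None):
    """Turn per-model log-likelihoods log P(alignment | model) into
    posteriors P(model | alignment) using Bayes' rule.

    `priors` is a hypothetical dict of prior probabilities per class;
    a uniform prior is assumed when it is omitted.
    """
    classes = list(log_likelihoods)
    if priors is None:
        priors = {c: 1.0 / len(classes) for c in classes}

    # log P(alignment | model) + log P(model) for each class
    log_joint = {c: log_likelihoods[c] + math.log(priors[c]) for c in classes}

    # Log-sum-exp: subtract the maximum before exponentiating so that
    # very negative log-likelihoods do not underflow to zero.
    peak = max(log_joint.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in log_joint.values()))

    return {c: math.exp(v - log_norm) for c, v in log_joint.items()}


def classify_alignment(log_likelihoods, priors=None):
    """Return the class with the highest posterior, plus all posteriors."""
    posteriors = class_posteriors(log_likelihoods, priors)
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Made-up log-likelihoods for one alignment under the three models
# (structural RNA, coding, null); not values from the paper.
example = {"RNA": -120.0, "coding": -125.0, "null": -123.0}
label, posteriors = classify_alignment(example)
print(label, posteriors)
```

Working in log space matters here because likelihoods of whole alignments are products of many small probabilities; normalizing with log-sum-exp keeps the comparison between the three models numerically stable.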