Predictive modeling of plant messenger RNA polyadenylation sites
BACKGROUND: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) track protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and tra...
Main authors: | Ji, Guoli; Zheng, Jianti; Shen, Yingjia; Wu, Xiaohui; Jiang, Ronghan; Lin, Yun; Loke, Johnny C; Davis, Kimberly M; Reese, Greg J; Li, Qingshun Quinn |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1805453/ https://www.ncbi.nlm.nih.gov/pubmed/17286857 http://dx.doi.org/10.1186/1471-2105-8-43 |
_version_ | 1782132481004142592 |
---|---|
author | Ji, Guoli Zheng, Jianti Shen, Yingjia Wu, Xiaohui Jiang, Ronghan Lin, Yun Loke, Johnny C Davis, Kimberly M Reese, Greg J Li, Qingshun Quinn |
collection | PubMed |
description | BACKGROUND: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) track protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tri-part sequence patterns around the cleavage site that serves as the future poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem. RESULTS: Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets, and at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes of poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was demonstrated by predicting poly(A) sites within long genomic sequences. CONCLUSION: Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. 
This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites. |
format | Text |
id | pubmed-1805453 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1805453 2007-03-13 BMC Bioinformatics Research Article BioMed Central 2007-02-07 /pmc/articles/PMC1805453/ /pubmed/17286857 http://dx.doi.org/10.1186/1471-2105-8-43 Text en Copyright © 2007 Ji et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Predictive modeling of plant messenger RNA polyadenylation sites |
topic | Research Article |