Predictive modeling of plant messenger RNA polyadenylation sites

Bibliographic Details
Main Authors: Ji, Guoli; Zheng, Jianti; Shen, Yingjia; Wu, Xiaohui; Jiang, Ronghan; Lin, Yun; Loke, Johnny C; Davis, Kimberly M; Reese, Greg J; Li, Qingshun Quinn
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1805453/
https://www.ncbi.nlm.nih.gov/pubmed/17286857
http://dx.doi.org/10.1186/1471-2105-8-43
author Ji, Guoli
Zheng, Jianti
Shen, Yingjia
Wu, Xiaohui
Jiang, Ronghan
Lin, Yun
Loke, Johnny C
Davis, Kimberly M
Reese, Greg J
Li, Qingshun Quinn
author_sort Ji, Guoli
collection PubMed
description BACKGROUND: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) tract protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tri-part sequence patterns around the cleavage site that serves as the future poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem. RESULTS: Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model-based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets, and at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes of poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was demonstrated by predicting poly(A) sites within long genomic sequences. CONCLUSION: Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites.
format Text
id pubmed-1805453
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1805453 2007-03-13 Predictive modeling of plant messenger RNA polyadenylation sites Ji, Guoli; Zheng, Jianti; Shen, Yingjia; Wu, Xiaohui; Jiang, Ronghan; Lin, Yun; Loke, Johnny C; Davis, Kimberly M; Reese, Greg J; Li, Qingshun Quinn BMC Bioinformatics Research Article BioMed Central 2007-02-07 /pmc/articles/PMC1805453/ /pubmed/17286857 http://dx.doi.org/10.1186/1471-2105-8-43 Text en Copyright © 2007 Ji et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predictive modeling of plant messenger RNA polyadenylation sites
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1805453/
https://www.ncbi.nlm.nih.gov/pubmed/17286857
http://dx.doi.org/10.1186/1471-2105-8-43