
Plant promoter prediction with confidence estimation

Accurate prediction of promoters is fundamental to understanding gene expression patterns, where confidence estimation is one of the main requirements. Using recently developed transductive confidence machine (TCM) techniques, we developed a new program TSSP-TCM for the prediction of plant promoters...


Bibliographic Details
Main Authors: Shahmuradov, I. A., Solovyev, V. V., Gammerman, A. J.
Format: Text
Language: English
Published: Oxford University Press 2005
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC549412/
https://www.ncbi.nlm.nih.gov/pubmed/15722481
http://dx.doi.org/10.1093/nar/gki247
collection PubMed
description Accurate prediction of promoters is fundamental to understanding gene expression patterns, and confidence estimation is one of the main requirements. Using recently developed transductive confidence machine (TCM) techniques, we developed a new program, TSSP-TCM, for the prediction of plant promoters that also provides a confidence estimate for each prediction. The program was trained on 132 and 104 sequences and tested on 40 and 25 sequences (containing TATA and TATA-less promoters, respectively) with known transcription start sites (TSSs). As negative training samples for TCM learning, we used coding and intron sequences of plant genes annotated in GenBank. In the test set of TATA promoters, the program correctly predicted the TSS for 35 out of 40 (87.5%) genes, with a median deviation of several base pairs from the true site location. For the 25 TATA-less promoters, TSSs were predicted for 21 out of 25 (84%) genes, including 14 cases of 5 bp distance between annotated and predicted TSSs. Using the TSSP-TCM program, we annotated promoters in the whole Arabidopsis genome. The predicted promoters were in good agreement with the start positions of known Arabidopsis mRNAs. Thus, the TCM technique has produced a plant-oriented promoter prediction tool of high accuracy. The TSSP-TCM program and annotated promoters are available at .
format Text
id pubmed-549412
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-549412 2005-02-24 Plant promoter prediction with confidence estimation Shahmuradov, I. A. Solovyev, V. V. Gammerman, A. J. Nucleic Acids Res Article Oxford University Press 2005 2005-02-18 /pmc/articles/PMC549412/ /pubmed/15722481 http://dx.doi.org/10.1093/nar/gki247 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
topic Article