A Composite Method Based on Formal Grammar and DNA Structural Features in Detecting Human Polymerase II Promoter Region
An important step in understanding gene regulation is to identify the promoter regions where transcription factor binding takes place. Predicting a promoter region de novo has long been a theoretical goal for many researchers. A number of in silico methods exist to predict the...
Main Authors: | Datta, Sutapa; Mukhopadhyay, Subhasis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3577817/ https://www.ncbi.nlm.nih.gov/pubmed/23437045 http://dx.doi.org/10.1371/journal.pone.0054843 |
_version_ | 1782259979579817984 |
---|---|
author | Datta, Sutapa Mukhopadhyay, Subhasis |
author_sort | Datta, Sutapa |
collection | PubMed |
description | An important step in understanding gene regulation is to identify the promoter regions where transcription factor binding takes place. Predicting a promoter region de novo has long been a theoretical goal for many researchers. A number of in silico methods exist to predict promoter regions de novo, but most still suffer from various shortcomings, a major one being the selection of appropriate features that distinguish promoter regions from non-promoters. In this communication, we propose a new composite method that predicts promoter sequences based on the interrelationship between structural profiles of DNA and primary sequence elements of the promoter regions. We show that a Context-Free Grammar (CFG) can formalize the relationships between different primary sequence features and, by utilizing the CFG, demonstrate that an efficient parser can be constructed to extract these relationships from DNA sequences and distinguish true promoter sequences from non-promoter sequences. Along with the CFG, we extract structural features of the promoter region to improve the efficiency of our prediction system. Extensive experiments on different datasets reveal that our method is effective in predicting promoter sequences on a genome-wide scale and performs satisfactorily compared to other promoter prediction techniques. |
format | Online Article Text |
id | pubmed-3577817 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3577817 2013-02-22 A Composite Method Based on Formal Grammar and DNA Structural Features in Detecting Human Polymerase II Promoter Region Datta, Sutapa Mukhopadhyay, Subhasis PLoS One Research Article Public Library of Science 2013-02-20 /pmc/articles/PMC3577817/ /pubmed/23437045 http://dx.doi.org/10.1371/journal.pone.0054843 Text en © 2013 Datta, Mukhopadhyay http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | A Composite Method Based on Formal Grammar and DNA Structural Features in Detecting Human Polymerase II Promoter Region |
title_sort | composite method based on formal grammar and dna structural features in detecting human polymerase ii promoter region |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3577817/ https://www.ncbi.nlm.nih.gov/pubmed/23437045 http://dx.doi.org/10.1371/journal.pone.0054843 |
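
The abstract in this record describes formalizing promoter sequence elements with a context-free grammar and parsing DNA against it. As a rough illustration only, the Python sketch below recognizes a hypothetical toy grammar: a TATA-like element, a spacer of one or more bases, and an Inr-like element, using simplified consensus patterns chosen for this example. The grammar, the symbol names, and the `find_promoter_like` helper are assumptions made here; they are not the grammar or parser from the article, and the DNA structural features are not modeled.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption for illustration, not the paper's).
# Keys are nonterminals; any symbol that is not a key is a terminal base.
GRAMMAR = {
    "Promoter": [["TATA", "Spacer", "Inr"]],
    "TATA": [["T", "A", "T", "A", "W", "A", "W"]],    # simplified TATAWAW
    "Inr": [["Y", "Y", "A", "N", "W", "Y", "Y"]],     # simplified YYANWYY
    "Spacer": [["N", "Spacer"], ["N"]],               # one or more bases
    "W": [["A"], ["T"]],                              # weak base
    "Y": [["C"], ["T"]],                              # pyrimidine
    "N": [["A"], ["C"], ["G"], ["T"]],                # any base
}


def make_recognizer(seq):
    """Build a memoized top-down recognizer for GRAMMAR over seq."""

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        """Return every j such that symbol derives seq[i:j]."""
        if symbol not in GRAMMAR:  # terminal: must match the base at i
            if i < len(seq) and seq[i] == symbol:
                return frozenset({i + 1})
            return frozenset()
        result = set()
        for production in GRAMMAR[symbol]:
            positions = {i}
            for sym in production:
                # Advance every live position through the next symbol.
                positions = {j for p in positions for j in ends(sym, p)}
                if not positions:
                    break
            result |= positions
        return frozenset(result)

    return ends


def find_promoter_like(seq):
    """Return (start, end) spans of seq that the toy grammar derives."""
    seq = seq.upper()
    ends = make_recognizer(seq)
    return [(i, j) for i in range(len(seq)) for j in sorted(ends("Promoter", i))]


if __name__ == "__main__":
    # Made-up sequence: flank + TATA-like + spacer + Inr-like + flank.
    example = "GGC" + "TATAAAA" + "GCGCGCGCGC" + "CCAGTCC" + "GG"
    print(find_promoter_like(example))
```

The right-recursive `Spacer` rule makes the recursion depth grow with sequence length, so this sketch suits short windows only; a chart parser such as CYK or Earley would be a more realistic choice for genome-scale input.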