A comparison study on feature selection of DNA structural properties for promoter prediction

BACKGROUND: Promoter prediction is an integral step in understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, considering...

Bibliographic Details
Main authors: Gan, Yanglan; Guan, Jihong; Zhou, Shuigeng
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280155/
https://www.ncbi.nlm.nih.gov/pubmed/22226192
http://dx.doi.org/10.1186/1471-2105-13-4
author Gan, Yanglan
Guan, Jihong
Zhou, Shuigeng
collection PubMed
description BACKGROUND: Promoter prediction is an integral step in understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, because of high dimensionality and the risk of overfitting, it is infeasible to use all available features for promoter prediction. It is therefore necessary to choose appropriate features for the prediction task. RESULTS: This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. Firstly, to examine whether promoters possess special structures, we carry out a systematic comparison of the profiles of thirteen structural features on promoter and non-promoter sequences. Secondly, we investigate the correlations between these structural features and promoter sequences. Thirdly, both filter and wrapper methods are used to select appropriate feature subsets from the thirteen kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. CONCLUSIONS: Experimental results show that the structural features are differentially correlated with promoters. Specifically, DNA-bending stiffness, DNA denaturation and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction.
format Online
Article
Text
id pubmed-3280155
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3280155 2012-02-16 A comparison study on feature selection of DNA structural properties for promoter prediction. Gan, Yanglan; Guan, Jihong; Zhou, Shuigeng. BMC Bioinformatics. Research Article. BioMed Central 2012-01-07. /pmc/articles/PMC3280155/ /pubmed/22226192 http://dx.doi.org/10.1186/1471-2105-13-4 Text en
Copyright ©2012 Gan et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A comparison study on feature selection of DNA structural properties for promoter prediction
topic Research Article