Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information
DNA-binding proteins are fundamentally important in understanding cellular processes. Thus, the identification of DNA-binding proteins has particularly important practical applications in various fields, such as drug design. We have proposed a novel approach for predicting DNA-binding prot...
Main authors: | Ma, Xin; Wu, Jiansheng; Xue, Xiaoyun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3787635/ https://www.ncbi.nlm.nih.gov/pubmed/24151525 http://dx.doi.org/10.1155/2013/524502 |
_version_ | 1782286209468334080 |
---|---|
author | Ma, Xin Wu, Jiansheng Xue, Xiaoyun |
author_facet | Ma, Xin Wu, Jiansheng Xue, Xiaoyun |
author_sort | Ma, Xin |
collection | PubMed |
description | DNA-binding proteins are fundamentally important in understanding cellular processes. Thus, the identification of DNA-binding proteins has particularly important practical applications in various fields, such as drug design. We have proposed a novel approach for predicting DNA-binding proteins using only sequence information. The prediction model developed in this study is constructed with the support vector machine-sequential minimal optimization (SVM-SMO) algorithm in conjunction with a hybrid feature. The hybrid feature incorporates an evolutionary information feature, a physicochemical property feature, and two novel attributes. These two attributes use DNA-binding residues and nonbinding residues in a query protein to obtain DNA-binding propensity and nonbinding propensity. The results demonstrate that our SVM-SMO model achieves a Matthews correlation coefficient (MCC) of 0.67 and an overall accuracy of 89.6%, with 88.4% sensitivity and 90.8% specificity. Performance comparisons on various features indicate that the two novel attributes contribute to the performance improvement. In addition, our SVM-SMO model outperforms state-of-the-art methods on an independent test dataset. |
format | Online Article Text |
id | pubmed-3787635 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-37876352013-10-22 Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information Ma, Xin Wu, Jiansheng Xue, Xiaoyun Comput Math Methods Med Research Article DNA-binding proteins are fundamentally important in understanding cellular processes. Thus, the identification of DNA-binding proteins has particularly important practical applications in various fields, such as drug design. We have proposed a novel approach for predicting DNA-binding proteins using only sequence information. The prediction model developed in this study is constructed with the support vector machine-sequential minimal optimization (SVM-SMO) algorithm in conjunction with a hybrid feature. The hybrid feature incorporates an evolutionary information feature, a physicochemical property feature, and two novel attributes. These two attributes use DNA-binding residues and nonbinding residues in a query protein to obtain DNA-binding propensity and nonbinding propensity. The results demonstrate that our SVM-SMO model achieves a Matthews correlation coefficient (MCC) of 0.67 and an overall accuracy of 89.6%, with 88.4% sensitivity and 90.8% specificity. Performance comparisons on various features indicate that the two novel attributes contribute to the performance improvement. In addition, our SVM-SMO model outperforms state-of-the-art methods on an independent test dataset. Hindawi Publishing Corporation 2013 2013-09-16 /pmc/articles/PMC3787635/ /pubmed/24151525 http://dx.doi.org/10.1155/2013/524502 Text en Copyright © 2013 Xin Ma et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Ma, Xin Wu, Jiansheng Xue, Xiaoyun Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title | Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title_full | Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title_fullStr | Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title_full_unstemmed | Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title_short | Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information |
title_sort | identification of dna-binding proteins using support vector machine with sequence information |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3787635/ https://www.ncbi.nlm.nih.gov/pubmed/24151525 http://dx.doi.org/10.1155/2013/524502 |
work_keys_str_mv | AT maxin identificationofdnabindingproteinsusingsupportvectormachinewithsequenceinformation AT wujiansheng identificationofdnabindingproteinsusingsupportvectormachinewithsequenceinformation AT xuexiaoyun identificationofdnabindingproteinsusingsupportvectormachinewithsequenceinformation |
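
The description above reports four evaluation metrics: Matthews correlation coefficient (MCC), overall accuracy, sensitivity, and specificity. As a reading aid, the sketch below shows how these four quantities are conventionally computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the article and do not reproduce its results.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding proteins predicted as binding / as nonbinding
    tn/fp: nonbinding proteins predicted as nonbinding / as binding
    """
    sensitivity = tp / (tp + fn)              # fraction of positives recovered
    specificity = tn / (tn + fp)              # fraction of negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the article's data).
example = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, so it remains informative when the binding and nonbinding classes differ in size, which is one reason it is often reported alongside accuracy.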