Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm

BACKGROUND: DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is highly desirable. RESULTS: In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we fo...

Bibliographic Details
Main Authors: Zhang, Jian, Gao, Bo, Chai, Haiting, Ma, Zhiqiang, Yang, Guifu
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5002159/
https://www.ncbi.nlm.nih.gov/pubmed/27565741
http://dx.doi.org/10.1186/s12859-016-1201-8
_version_ 1782450527527763968
author Zhang, Jian
Gao, Bo
Chai, Haiting
Ma, Zhiqiang
Yang, Guifu
author_facet Zhang, Jian
Gao, Bo
Chai, Haiting
Ma, Zhiqiang
Yang, Guifu
author_sort Zhang, Jian
collection PubMed
description BACKGROUND: DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is highly desirable. RESULTS: In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we focused on the challenge of improving DBP prediction accuracy using information solely from the sequence. Secondly, we used multiple informative features to encode the protein. These features included the evolutionary conservation profile, secondary structure motifs, and physicochemical properties. Thirdly, we introduced a novel improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features as well as to select optimal parameters for the classifier. On two benchmark datasets, our predictor outperformed many state-of-the-art predictors, which demonstrated the effectiveness of our method. The promising prediction performance on a newly compiled independent testing dataset from PDB and a large-scale dataset from UniProt demonstrated the good generalization ability of our method. In addition, the BFA developed in this research may have great potential for practical applications in optimization, especially in feature selection problems. CONCLUSIONS: A highly accurate method was proposed for the identification of DBPs. A user-friendly web server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1201-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5002159
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5002159 2016-09-06 Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm Zhang, Jian Gao, Bo Chai, Haiting Ma, Zhiqiang Yang, Guifu BMC Bioinformatics Research Article BACKGROUND: DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is highly desirable. RESULTS: In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we focused on the challenge of improving DBP prediction accuracy using information solely from the sequence. Secondly, we used multiple informative features to encode the protein. These features included the evolutionary conservation profile, secondary structure motifs, and physicochemical properties. Thirdly, we introduced a novel improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features as well as to select optimal parameters for the classifier. On two benchmark datasets, our predictor outperformed many state-of-the-art predictors, which demonstrated the effectiveness of our method. The promising prediction performance on a newly compiled independent testing dataset from PDB and a large-scale dataset from UniProt demonstrated the good generalization ability of our method. In addition, the BFA developed in this research may have great potential for practical applications in optimization, especially in feature selection problems. CONCLUSIONS: A highly accurate method was proposed for the identification of DBPs. A user-friendly web server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1201-8) contains supplementary material, which is available to authorized users.
BioMed Central 2016-08-26 /pmc/articles/PMC5002159/ /pubmed/27565741 http://dx.doi.org/10.1186/s12859-016-1201-8 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Zhang, Jian
Gao, Bo
Chai, Haiting
Ma, Zhiqiang
Yang, Guifu
Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm
title Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm
title_sort identification of dna-binding proteins using multi-features fusion and binary firefly optimization algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5002159/
https://www.ncbi.nlm.nih.gov/pubmed/27565741
http://dx.doi.org/10.1186/s12859-016-1201-8
work_keys_str_mv AT zhangjian identificationofdnabindingproteinsusingmultifeaturesfusionandbinaryfireflyoptimizationalgorithm
AT gaobo identificationofdnabindingproteinsusingmultifeaturesfusionandbinaryfireflyoptimizationalgorithm
AT chaihaiting identificationofdnabindingproteinsusingmultifeaturesfusionandbinaryfireflyoptimizationalgorithm
AT mazhiqiang identificationofdnabindingproteinsusingmultifeaturesfusionandbinaryfireflyoptimizationalgorithm
AT yangguifu identificationofdnabindingproteinsusingmultifeaturesfusionandbinaryfireflyoptimizationalgorithm