
PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine

BACKGROUND: Identifying the specific residues involved in protein-DNA interactions is of considerable importance for better understanding the binding mechanism of protein-DNA complexes. Although many computational DNA-binding residue prediction approaches have been developed, there is still significant...


Bibliographic Details
Main authors: Deng, Lei; Pan, Juan; Xu, Xiaojie; Yang, Wenyi; Liu, Chuyao; Liu, Hui
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311926/
https://www.ncbi.nlm.nih.gov/pubmed/30598073
http://dx.doi.org/10.1186/s12859-018-2527-1
_version_ 1783383703007789056
author Deng, Lei
Pan, Juan
Xu, Xiaojie
Yang, Wenyi
Liu, Chuyao
Liu, Hui
author_facet Deng, Lei
Pan, Juan
Xu, Xiaojie
Yang, Wenyi
Liu, Chuyao
Liu, Hui
author_sort Deng, Lei
collection PubMed
description BACKGROUND: Identifying the specific residues involved in protein-DNA interactions is of considerable importance for better understanding the binding mechanism of protein-DNA complexes. Although many computational DNA-binding residue prediction approaches have been developed, there is still significant room for improvement in overall performance and availability. RESULTS: Here, we present an efficient approach termed PDRLGB that uses a light gradient boosting machine (LightGBM) to predict binding residues in protein-DNA complexes. First, we extract a wide variety of sequence and structure features, 913 in total, with a sliding window of 11. Then, we apply the random forest algorithm to sort the features in descending order of importance and obtain the optimal subset of features using incremental feature selection. Based on the selected feature set, we use LightGBM to build the prediction model for DNA-binding residues. PDRLGB shows better overall predictive accuracy and requires relatively less training time than other widely used machine learning (ML) methods such as random forest (RF), AdaBoost and support vector machine (SVM). We further compare PDRLGB with various existing approaches on independent test datasets and show improved results over the existing state-of-the-art approaches. CONCLUSIONS: PDRLGB is an efficient approach for predicting the specific residues involved in protein-DNA interactions.
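The two preprocessing ideas named in the description, a sliding window of 11 around each residue and incremental feature selection over an importance-ranked feature list, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the zero-padding at chain ends, and the caller-supplied `score` function are all assumptions made for illustration. In the described pipeline the ranking would come from a random forest and the scores from a trained classifier, neither of which is reproduced here.

```python
from typing import Callable, List, Sequence

WINDOW = 11  # window size stated in the description


def window_features(per_residue: Sequence[Sequence[float]],
                    window: int = WINDOW) -> List[List[float]]:
    """Concatenate the feature vectors of each residue and its neighbours.

    Positions that fall outside the chain are zero-padded. This padding is
    an assumption: the description does not say how chain ends are handled.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a residue")
    half = window // 2
    n_feat = len(per_residue[0]) if per_residue else 0
    pad = [0.0] * n_feat
    rows: List[List[float]] = []
    for i in range(len(per_residue)):
        row: List[float] = []
        for j in range(i - half, i + half + 1):
            inside = 0 <= j < len(per_residue)
            row.extend(per_residue[j] if inside else pad)
        rows.append(row)
    return rows


def incremental_feature_selection(ranked: Sequence[int],
                                  score: Callable[[List[int]], float]) -> List[int]:
    """Return the best-scoring prefix of `ranked` (most important first).

    `score` is a hypothetical callback that evaluates a feature subset,
    for example by cross-validating a classifier restricted to it.
    """
    best_subset: List[int] = []
    best_score = float("-inf")
    current: List[int] = []
    for feat in ranked:
        current.append(feat)           # add the next most important feature
        s = score(current)
        if s > best_score:             # keep the smallest prefix on ties
            best_score, best_subset = s, list(current)
    return best_subset
```

With a window of 3 and one value per residue, `window_features([[1.0], [2.0], [3.0]], window=3)` returns `[[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 0.0]]`. The selection routine evaluates one nested subset per ranked feature, so its cost is dominated by however expensive `score` is.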
format Online
Article
Text
id pubmed-6311926
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-63119262019-01-07 PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine Deng, Lei Pan, Juan Xu, Xiaojie Yang, Wenyi Liu, Chuyao Liu, Hui BMC Bioinformatics Research BACKGROUND: Identifying the specific residues involved in protein-DNA interactions is of considerable importance for better understanding the binding mechanism of protein-DNA complexes. Although many computational DNA-binding residue prediction approaches have been developed, there is still significant room for improvement in overall performance and availability. RESULTS: Here, we present an efficient approach termed PDRLGB that uses a light gradient boosting machine (LightGBM) to predict binding residues in protein-DNA complexes. First, we extract a wide variety of sequence and structure features, 913 in total, with a sliding window of 11. Then, we apply the random forest algorithm to sort the features in descending order of importance and obtain the optimal subset of features using incremental feature selection. Based on the selected feature set, we use LightGBM to build the prediction model for DNA-binding residues. PDRLGB shows better overall predictive accuracy and requires relatively less training time than other widely used machine learning (ML) methods such as random forest (RF), AdaBoost and support vector machine (SVM). We further compare PDRLGB with various existing approaches on independent test datasets and show improved results over the existing state-of-the-art approaches. CONCLUSIONS: PDRLGB is an efficient approach for predicting the specific residues involved in protein-DNA interactions.
BioMed Central 2018-12-31 /pmc/articles/PMC6311926/ /pubmed/30598073 http://dx.doi.org/10.1186/s12859-018-2527-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Deng, Lei
Pan, Juan
Xu, Xiaojie
Yang, Wenyi
Liu, Chuyao
Liu, Hui
PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine
title PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311926/
https://www.ncbi.nlm.nih.gov/pubmed/30598073
http://dx.doi.org/10.1186/s12859-018-2527-1