PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine
BACKGROUND: Identifying the specific residues involved in protein-DNA interactions is of considerable importance for better understanding the binding mechanism of protein-DNA complexes. Although many computational DNA-binding residue prediction approaches have been developed, there is still significant...
Main authors: | Deng, Lei; Pan, Juan; Xu, Xiaojie; Yang, Wenyi; Liu, Chuyao; Liu, Hui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311926/ https://www.ncbi.nlm.nih.gov/pubmed/30598073 http://dx.doi.org/10.1186/s12859-018-2527-1 |
_version_ | 1783383703007789056 |
---|---|
author | Deng, Lei Pan, Juan Xu, Xiaojie Yang, Wenyi Liu, Chuyao Liu, Hui |
author_sort | Deng, Lei |
collection | PubMed |
description | BACKGROUND: Identifying the specific residues involved in protein-DNA interactions is of considerable importance for better understanding the binding mechanism of protein-DNA complexes. Although many computational DNA-binding residue prediction approaches have been developed, there is still significant room for improvement in overall performance and availability. RESULTS: Here, we present an efficient approach termed PDRLGB that uses a light gradient boosting machine (LightGBM) to predict binding residues in protein-DNA complexes. First, we extract 913 sequence and structure features with a sliding window of 11. Then, we apply the random forest algorithm to sort the features in descending order of importance and obtain the optimal feature subset using incremental feature selection. Based on the selected feature set, we use a light gradient boosting machine to build the prediction model for DNA-binding residues. PDRLGB shows better overall predictive accuracy and shorter training time than other widely used machine learning (ML) methods such as random forest (RF), AdaBoost and support vector machine (SVM). We further compare PDRLGB with various existing approaches on independent test datasets and show improved results over existing state-of-the-art approaches. CONCLUSIONS: PDRLGB is an efficient approach for predicting the specific residues involved in protein-DNA interactions. |
format | Online Article Text |
id | pubmed-6311926 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6311926 2019-01-07 PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine Deng, Lei Pan, Juan Xu, Xiaojie Yang, Wenyi Liu, Chuyao Liu, Hui BMC Bioinformatics Research 
BioMed Central 2018-12-31 /pmc/articles/PMC6311926/ /pubmed/30598073 http://dx.doi.org/10.1186/s12859-018-2527-1 Text en © The Author(s) 2018. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | PDRLGB: precise DNA-binding residue prediction using a light gradient boosting machine |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311926/ https://www.ncbi.nlm.nih.gov/pubmed/30598073 http://dx.doi.org/10.1186/s12859-018-2527-1 |
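
The description in this record mentions two generic steps that are easy to illustrate: a sliding window of 11 residues, and incremental feature selection over a list of features ranked by importance. The following is a minimal sketch of those two ideas only, not the authors' implementation. The padding symbol `X`, the function names, the feature names `pssm`, `asa`, `hydrophobicity` and `charge`, and the toy scoring function are all hypothetical; in practice the ranking would come from a random forest and the score from a cross-validated classifier such as LightGBM, neither of which is reproduced here.

```python
from typing import Callable, List, Sequence

PAD = "X"  # hypothetical padding symbol for positions beyond the chain ends


def sliding_windows(sequence: str, size: int = 11) -> List[str]:
    """Return one window of `size` residues centred on each position."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + size] for i in range(len(sequence))]


def incremental_feature_selection(
    ranked: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> List[str]:
    """Score each top-k prefix of a ranked feature list and keep the best.

    `ranked` must already be sorted by descending importance; `score` is a
    caller-supplied evaluation of a feature subset (assumed, not specified
    by the record).
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        current = score(ranked[:k])
        if current > best_score:  # strict comparison: ties keep the smaller subset
            best_k, best_score = k, current
    return list(ranked[:best_k])


def toy_score(features: Sequence[str]) -> float:
    """Hypothetical stand-in for model evaluation: reward two 'useful'
    features and apply a small penalty per feature used."""
    useful = {"pssm", "asa"}
    return len(useful & set(features)) - 0.1 * len(features)


if __name__ == "__main__":
    print(sliding_windows("MKV", size=5))
    ranked_features = ["pssm", "asa", "hydrophobicity", "charge"]
    print(incremental_feature_selection(ranked_features, toy_score))
```

Passing the scoring function in as a callable keeps the selection loop independent of any particular classifier, so the same loop could wrap whichever model and metric are actually used.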