Prediction of Protein–ATP Binding Residues Based on Ensemble of Deep Convolutional Neural Networks and LightGBM Algorithm

Accurately identifying protein–ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms like support vector machine (SVM) and random forest to predict protein–ATP binding residues; however, as new machine-learni...

Bibliographic Details
Main Authors: Song, Jiazhi, Liu, Guixia, Jiang, Jingqing, Zhang, Ping, Liang, Yanchun
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832895/
https://www.ncbi.nlm.nih.gov/pubmed/33477866
http://dx.doi.org/10.3390/ijms22020939
author Song, Jiazhi
Liu, Guixia
Jiang, Jingqing
Zhang, Ping
Liang, Yanchun
collection PubMed
description Accurately identifying protein–ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms such as support vector machines (SVM) and random forests to predict protein–ATP binding residues; however, as new machine-learning techniques are developed, prediction performance could be further improved. In this paper, an ensemble predictor that combines deep convolutional neural networks and LightGBM through an ensemble learning algorithm is proposed. Three subclassifiers have been developed: a multi-incepResNet-based predictor, a multi-Xception-based predictor, and a LightGBM predictor. The final prediction is a combination of the outputs of the three subclassifiers with an optimized weight distribution. We examined the performance of the proposed predictor on two datasets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. The predictor achieved area under the curve (AUC) values of 0.925 and 0.902 and Matthews correlation coefficient (MCC) values of 0.639 and 0.642, respectively, both better than those of other state-of-the-art prediction methods.
format Online
Article
Text
id pubmed-7832895
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7832895 2021-01-26 Int J Mol Sci Article MDPI 2021-01-19 /pmc/articles/PMC7832895/ /pubmed/33477866 http://dx.doi.org/10.3390/ijms22020939 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832895/
https://www.ncbi.nlm.nih.gov/pubmed/33477866
http://dx.doi.org/10.3390/ijms22020939
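
The abstract says the final prediction combines the outputs of three subclassifiers with an optimized weight distribution, and reports performance as MCC and AUC. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes each subclassifier emits one binding probability per residue, that the weights are non-negative and sum to 1, and that they are chosen by a coarse grid search maximizing MCC at a fixed 0.5 threshold. The function names (`mcc`, `combine`, `search_weights`), the step size, the threshold, and the synthetic data are all hypothetical.

```python
import itertools

import numpy as np


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary (0/1) labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = float(np.sum(y_true & y_pred))
    tn = float(np.sum(~y_true & ~y_pred))
    fp = float(np.sum(~y_true & y_pred))
    fn = float(np.sum(y_true & ~y_pred))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def combine(probs, weights):
    """Weighted average of probabilities; probs has shape (n_models, n_residues)."""
    weights = np.asarray(weights, dtype=float)
    return weights @ np.asarray(probs, dtype=float) / weights.sum()


def search_weights(probs, y_true, step=0.1, threshold=0.5):
    """Grid-search weights on the simplex (assumed objective: MCC at `threshold`)."""
    n_models = len(probs)
    ticks = int(round(1 / step))
    best_w, best_score = None, -np.inf
    # Enumerate the first n_models - 1 weights; the last one is the remainder.
    for combo in itertools.product(range(ticks + 1), repeat=n_models - 1):
        if sum(combo) > ticks:
            continue
        w = np.array(list(combo) + [ticks - sum(combo)]) / ticks
        score = mcc(y_true, combine(probs, w) >= threshold)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Synthetic stand-in data: about 10% positive residues, three noisy scorers.
rng = np.random.default_rng(0)
y = (rng.random(500) < 0.1).astype(int)
probs = np.stack(
    [np.clip(0.6 * y + rng.normal(0.2, s, y.size), 0, 1) for s in (0.15, 0.2, 0.3)]
)
w, score = search_weights(probs, y)
print("weights:", w, "MCC:", round(score, 3))
```

Because the grid includes the corner points that put all weight on a single model, the ensemble's MCC on the data used for the search is never lower than that of the best individual subclassifier. In practice the weights would be tuned on held-out data rather than on the evaluation set.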