XGB-DrugPred: computational prediction of druggable proteins using eXtreme gradient boosting and optimized features set

Accurate identification of drug targets in the human body is of great significance for designing novel drugs. Compared with traditional experimental methods, prediction of drug targets via machine learning algorithms has attracted the attention of many researchers because it is fast and accurate. In th...

Bibliographic Details
Main authors: Sikander, Rahu; Ghulam, Ali; Ali, Farman
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8976041/
https://www.ncbi.nlm.nih.gov/pubmed/35365726
http://dx.doi.org/10.1038/s41598-022-09484-3
author Sikander, Rahu
Ghulam, Ali
Ali, Farman
collection PubMed
description Accurate identification of drug targets in the human body is of great significance for designing novel drugs. Compared with traditional experimental methods, prediction of drug targets via machine learning algorithms has attracted the attention of many researchers because it is fast and accurate. In this study, we propose a machine learning-based method, namely XGB-DrugPred, for accurate prediction of druggable proteins. Features are extracted from primary protein sequences by group dipeptide composition, reduced amino acid alphabet, and a novel encoder, pseudo amino acid composition segmentation. To select the best feature set, eXtreme Gradient Boosting-recursive feature elimination is implemented. The best feature set is provided to eXtreme Gradient Boosting (XGB), Random Forest, and Extremely Randomized Tree classifiers for model training and prediction. The performance of these classifiers is evaluated by tenfold cross-validation. The empirical results show that the XGB-based predictor achieves the best results compared with the other classifiers and with existing methods in the literature.
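As an illustration of the first encoder named in the description, the following is a minimal Python sketch of group dipeptide composition. The record does not state which residue grouping the authors use, so the five physicochemical classes below (aliphatic, aromatic, positive, negative, uncharged) are an assumption, and the function name `group_dipeptide_composition` is hypothetical. This is not the authors' implementation.

```python
from itertools import product

# ASSUMPTION: a five-class physicochemical grouping of the 20 standard
# amino acids. The record does not specify the grouping used in the paper.
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}

# Map each residue letter to the name of its group.
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}

# All 25 ordered group pairs, in a fixed order that defines the feature layout.
GROUP_PAIRS = list(product(GROUPS, repeat=2))


def group_dipeptide_composition(sequence):
    """Return a 25-dimensional vector of group-pair frequencies.

    Residues outside the 20 standard letters are dropped before pairing,
    so their neighbours are treated as adjacent (a simplification).
    """
    residues = [RESIDUE_TO_GROUP[aa] for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]
    counts = dict.fromkeys(GROUP_PAIRS, 0)
    # Count each pair of consecutive residues by the groups they belong to.
    for left, right in zip(residues, residues[1:]):
        counts[(left, right)] += 1
    # Normalise by the number of dipeptides; guard against very short input.
    total = max(len(residues) - 1, 1)
    return [counts[pair] / total for pair in GROUP_PAIRS]


features = group_dipeptide_composition("GAKD")
```

A vector like this could be concatenated with the other encodings before feature selection and classification. Those later steps (XGB-based recursive feature elimination and tenfold cross-validation) are not sketched here.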
format Online
Article
Text
id pubmed-8976041
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8976041 2022-04-05 XGB-DrugPred: computational prediction of druggable proteins using eXtreme gradient boosting and optimized features set Sikander, Rahu Ghulam, Ali Ali, Farman Sci Rep Article Nature Publishing Group UK 2022-04-01 /pmc/articles/PMC8976041/ /pubmed/35365726 http://dx.doi.org/10.1038/s41598-022-09484-3 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title XGB-DrugPred: computational prediction of druggable proteins using eXtreme gradient boosting and optimized features set
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8976041/
https://www.ncbi.nlm.nih.gov/pubmed/35365726
http://dx.doi.org/10.1038/s41598-022-09484-3