Bioactive Molecule Prediction Using Extreme Gradient Boosting

Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today’s drug discovery process. In this paper, extreme gradient boosting (Xgboost), which i...

Bibliographic Details
Main Authors: Babajide Mustapha, Ismail, Saeed, Faisal
Format: Online Article Text
Language: English
Published: MDPI 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6273295/
https://www.ncbi.nlm.nih.gov/pubmed/27483216
http://dx.doi.org/10.3390/molecules21080983
author Babajide Mustapha, Ismail
Saeed, Faisal
collection PubMed
description Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today’s drug discovery process. In this paper, extreme gradient boosting (Xgboost), which is an ensemble of Classification and Regression Trees (CART) and a variant of the Gradient Boosting Machine, was investigated for the prediction of biological activity based on a quantitative description of the compound’s molecular structure. Seven datasets that are well known in the literature were used, and experimental results show that Xgboost can outperform machine learning algorithms such as Random Forest (RF), Linear Support Vector Machine (LSVM), Radial Basis Function Neural Network (RBFN) and Naïve Bayes (NB) for the prediction of biological activities. In addition to its ability to detect minority activity classes in highly imbalanced datasets, it showed remarkable performance on both high and low diversity datasets.
format Online
Article
Text
id pubmed-6273295
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6273295 2018-12-28 Bioactive Molecule Prediction Using Extreme Gradient Boosting; Babajide Mustapha, Ismail; Saeed, Faisal; Molecules; Article; MDPI; 2016-07-28; /pmc/articles/PMC6273295/; /pubmed/27483216; http://dx.doi.org/10.3390/molecules21080983; Text; en; © 2016 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
title Bioactive Molecule Prediction Using Extreme Gradient Boosting
topic Article