
ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches

The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous so...


Bibliographic Details
Main Authors: Sharma, Ashok K., Srivastava, Gopal N., Roy, Ankita, Sharma, Vineet K.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2017
Subjects: Pharmacology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5714866/
https://www.ncbi.nlm.nih.gov/pubmed/29249969
http://dx.doi.org/10.3389/fphar.2017.00880
_version_ 1783283640838389760
author Sharma, Ashok K.
Srivastava, Gopal N.
Roy, Ankita
Sharma, Vineet K.
collection PubMed
description The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews correlation coefficient (0.84). The performances of all the three models were comparable (Matthews correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R² = 0.84) than the multi-linear regression (MLR) and partial least squares regression (PLSR) models, whereas the partial least squares based regression model for the prediction of permeability (Caco-2) performed better (R² = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
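The abstract reports two evaluation metrics: the Matthews correlation coefficient for the classification models and R² for the regression models. As a reading aid, the sketch below shows how these metrics are commonly computed. It is not taken from ToxiM itself. The function names are hypothetical, and R² is assumed here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares); the record does not say which definition the authors used.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical counts and values, for illustration only.
    print(matthews_corrcoef(tp=45, tn=47, fp=3, fn=5))
    print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

An MCC of 1 means perfect agreement between predicted and true classes and 0 means no better than chance, which is why it is often preferred over plain accuracy when the toxin and non-toxin classes are unbalanced.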
format Online
Article
Text
id pubmed-5714866
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5714866 2017-12-15 ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches Sharma, Ashok K. Srivastava, Gopal N. Roy, Ankita Sharma, Vineet K. Front Pharmacol Pharmacology Frontiers Media S.A. 2017-11-30 /pmc/articles/PMC5714866/ /pubmed/29249969 http://dx.doi.org/10.3389/fphar.2017.00880 Text en Copyright © 2017 Sharma, Srivastava, Roy and Sharma. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches
topic Pharmacology