
A big data approach with artificial neural network and molecular similarity for chemical data mining and endocrine disruption prediction

CONTEXT: Chemical toxicity prediction in the early-stage drug discovery phase has been researched for years, and new methods are continually being investigated. Research data comprising physicochemical properties, toxicity, assay, and activity details of chemicals create massive datasets that are becoming difficult to...


Bibliographic Details
Main Authors: Paulose, Renjith; Jegatheesan, Kalirajan; Balakrishnan, Gopal Samy
Format: Online Article Text
Language: English
Published: Medknow Publications & Media Pvt Ltd 2018
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6234712/
https://www.ncbi.nlm.nih.gov/pubmed/30505052
http://dx.doi.org/10.4103/ijp.IJP_304_17
author Paulose, Renjith
Jegatheesan, Kalirajan
Balakrishnan, Gopal Samy
collection PubMed
description CONTEXT: Chemical toxicity prediction in the early-stage drug discovery phase has been researched for years, and new methods are continually being investigated. Research data comprising physicochemical properties, toxicity, assay, and activity details of chemicals create massive datasets that are becoming difficult to manage. Identifying a chemical with the desired features and biological activity from among millions of chemicals is a challenging task. AIMS: In this study, we explore big data technologies and machine learning approaches for efficient chemical data mining, endocrine receptor disruption prediction, and virtual compound screening. We investigate the power of an artificial neural network (ANN) to predict chemicals' activity toward the androgen receptor (AR) and estrogen receptor (ER) and thereby classify them as human endocrine disruptors or nondisruptors. SUBJECTS AND METHODS: Molecules are collected along with their half-maximal inhibitory concentration (IC50) values toward AR and ER. Training and test datasets are created with active and inactive classes of molecules. Electrotopological state (E-State) molecular fingerprints are generated to describe every compound. The ANN machine learning model is created using Apache Spark and implemented in a Hadoop big data environment. Each test chemical's structural similarity to the active class of training compounds is estimated and combined with the ANN model to improve prediction accuracy. RESULTS: The AR and ER predictive models, applied to their corresponding test datasets, gave accuracies of 86.31% and 89.57%, respectively, in correctly classifying molecules as disruptors or nondisruptors. Molecular fragments and functional groups are ranked by their importance in forming the ANN model and their influence on AR and ER disruption behavior. Training molecules that are specific to each test molecule's endocrine disruption prediction are retrieved based on the structural similarity values. CONCLUSIONS: The current study demonstrates a new approach to predicting chemical endocrine receptor disruption that combines an ANN machine learning method with molecular similarity in a big data environment. This method of predictive modeling can be tested further with more receptors and hormones to examine its predictive power.
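The abstract above says that a test chemical's structural similarity to the active training compounds is combined with the ANN output, but it does not state how. The following is a minimal sketch of one possible reading, not the authors' implementation. It assumes binary fingerprints stored as sets of on-bit indices, Tanimoto similarity as the similarity measure, and a weighted average as the combination rule. The function names, the `weight` parameter, and the 0.5 threshold are all hypothetical.

```python
from typing import Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    union = a | b
    if not union:
        # Two empty fingerprints share no features; treat them as dissimilar.
        return 0.0
    return len(a & b) / len(union)


def max_similarity_to_actives(query: Set[int], actives: Sequence[Set[int]]) -> float:
    """Highest Tanimoto similarity between the query and any active training compound."""
    return max((tanimoto(query, fp) for fp in actives), default=0.0)


def combined_score(ann_probability: float, similarity: float, weight: float = 0.5) -> float:
    """Assumed combination rule: weighted average of ANN probability and similarity."""
    return weight * ann_probability + (1.0 - weight) * similarity


def classify(
    ann_probability: float,
    query: Set[int],
    actives: Sequence[Set[int]],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> str:
    """Label a test compound using the combined score."""
    similarity = max_similarity_to_actives(query, actives)
    score = combined_score(ann_probability, similarity)
    return "disruptor" if score >= threshold else "nondisruptor"


if __name__ == "__main__":
    # Toy fingerprints, invented for illustration only.
    active_training = [{1, 2, 3, 4}, {2, 3, 5}]
    test_compound = {1, 2, 3}
    print(classify(0.6, test_compound, active_training))
```

Here `ann_probability` stands in for whatever the trained network outputs for the test compound. In the setting the abstract describes, that value would come from the Spark-trained ANN rather than being supplied by hand.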
format Online
Article
Text
id pubmed-6234712
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Medknow Publications & Media Pvt Ltd
record_format MEDLINE/PubMed
spelling pubmed-6234712 2018-11-30 A big data approach with artificial neural network and molecular similarity for chemical data mining and endocrine disruption prediction Paulose, Renjith Jegatheesan, Kalirajan Balakrishnan, Gopal Samy Indian J Pharmacol Research Article Medknow Publications & Media Pvt Ltd 2018 /pmc/articles/PMC6234712/ /pubmed/30505052 http://dx.doi.org/10.4103/ijp.IJP_304_17 Text en Copyright: © 2018 Indian Journal of Pharmacology http://creativecommons.org/licenses/by-nc-sa/4.0 This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
title A big data approach with artificial neural network and molecular similarity for chemical data mining and endocrine disruption prediction
topic Research Article