Descriptor Free QSAR Modeling Using Deep Learning With Long Short-Term Memory Neural Networks

Current practice of building QSAR models usually involves computing a set of descriptors for the training set compounds, applying a descriptor selection algorithm and finally using a statistical fitting method to build the model. In this study, we explored the prospects of building good quality inte...

Bibliographic Details
Main Authors:	Chakravarti, Suman K.; Alla, Sai Radha Mani
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2019
Subjects:	Artificial Intelligence
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861338/ https://www.ncbi.nlm.nih.gov/pubmed/33733106 http://dx.doi.org/10.3389/frai.2019.00017

_version_	1783647065350340608
author	Chakravarti, Suman K. Alla, Sai Radha Mani
author_sort	Chakravarti, Suman K.
collection	PubMed
description	Current practice of building QSAR models usually involves computing a set of descriptors for the training set compounds, applying a descriptor selection algorithm and finally using a statistical fitting method to build the model. In this study, we explored the prospects of building good quality interpretable QSARs for big and diverse datasets, without using any pre-calculated descriptors. To achieve this, we used different forms of Long Short-Term Memory (LSTM) neural networks, trained directly on either traditional SMILES codes or a new linear molecular notation developed as part of this work. Three endpoints were modeled: Ames mutagenicity, inhibition of P. falciparum Dd2 and inhibition of Hepatitis C Virus, with training sets ranging from 7,866 to 31,919 compounds. To boost the interpretability of the prediction results, an attention-based machine learning mechanism, together with a bidirectional LSTM, was used to detect structural alerts for the mutagenicity data set. Traditional fragment descriptor-based models were used for comparison. In the external and cross-validation experiments, the overall prediction accuracies of the LSTM models were close to those of the fragment-based models. However, LSTM models were superior in predicting test chemicals that are dissimilar to the training set compounds, a coveted quality of QSAR models in real-world applications. In summary, it is possible to build QSAR models using LSTMs without pre-computed traditional descriptors, and the resulting models are far from being a “black box.” We hope that this study will help bring large, descriptor-less QSARs into mainstream use.
format	Online Article Text
id	pubmed-7861338
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7861338 2021-03-16 Descriptor Free QSAR Modeling Using Deep Learning With Long Short-Term Memory Neural Networks Chakravarti, Suman K. Alla, Sai Radha Mani Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2019-09-06 /pmc/articles/PMC7861338/ /pubmed/33733106 http://dx.doi.org/10.3389/frai.2019.00017 Text en Copyright © 2019 Chakravarti and Alla. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Artificial Intelligence Chakravarti, Suman K. Alla, Sai Radha Mani Descriptor Free QSAR Modeling Using Deep Learning With Long Short-Term Memory Neural Networks
title	Descriptor Free QSAR Modeling Using Deep Learning With Long Short-Term Memory Neural Networks
title_sort	descriptor free qsar modeling using deep learning with long short-term memory neural networks
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861338/ https://www.ncbi.nlm.nih.gov/pubmed/33733106 http://dx.doi.org/10.3389/frai.2019.00017
work_keys_str_mv	AT chakravartisumank descriptorfreeqsarmodelingusingdeeplearningwithlongshorttermmemoryneuralnetworks AT allasairadhamani descriptorfreeqsarmodelingusingdeeplearningwithlongshorttermmemoryneuralnetworks
