Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity

Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the use of specific techniques is often arbitrary and poorly explored. Six dimensionality techniqu...

Bibliographic Details
Main authors: Kalian, Alexander D., Benfenati, Emilio, Osborne, Olivia J., Gott, David, Potter, Claire, Dorne, Jean-Lou C. M., Guo, Miao, Hogstrand, Christer
Format: Online Article Text
Language: English
Published: MDPI 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10384850/
https://www.ncbi.nlm.nih.gov/pubmed/37505541
http://dx.doi.org/10.3390/toxics11070572
author Kalian, Alexander D.
Benfenati, Emilio
Osborne, Olivia J.
Gott, David
Potter, Claire
Dorne, Jean-Lou C. M.
Guo, Miao
Hogstrand, Christer
collection PubMed
description Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the use of specific techniques is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher-dimensional mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. Comparatively simple linear techniques, such as principal component analysis (PCA), were found to be sufficient for optimal QSAR model performance, which indicated that the original dataset was at least approximately linearly separable (in accordance with Cover’s theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, showed that the vast majority of testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and degraded performance. The results nevertheless indicated that certain dimensionality reduction techniques facilitated uniquely beneficial navigation of the chemical space.
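The linear reduction step named in the abstract (PCA) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the data below are random stand-ins for molecular features, the sizes (200 samples, 512 features, 20 components) are arbitrary assumptions, and the function names `fit_pca` and `transform_pca` are hypothetical. The downstream deep learning classifier and the grid search described in the abstract are deliberately left out.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA via SVD; return (mean, components, explained_variance_ratio)."""
    mean = X.mean(axis=0)
    X_centered = X - mean
    # Rows of Vt are the principal directions, ordered by singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    variances = singular_values ** 2
    ratio = variances[:n_components] / variances.sum()
    return mean, Vt[:n_components], ratio


def transform_pca(X, mean, components):
    """Project X onto the fitted principal directions."""
    return (X - mean) @ components.T


# Assumption: synthetic binary features standing in for a real descriptor matrix.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 512)).astype(float)
X_train, X_test = X[:150], X[150:]

# Fit on the training split only, then apply the same projection to the test split.
mean, components, ratio = fit_pca(X_train, n_components=20)
Z_train = transform_pca(X_train, mean, components)
Z_test = transform_pca(X_test, mean, components)

print(Z_train.shape, Z_test.shape)
print("variance retained by 20 components:", round(float(ratio.sum()), 3))
```

Fitting the projection on the training split alone keeps information from the test molecules out of the reduced representation, which matters when reduced features are later fed to a classifier.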
format Online
Article
Text
id pubmed-10384850
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10384850 2023-07-30. Toxics, Article. MDPI, 2023-06-30. Text, en. © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10384850/
https://www.ncbi.nlm.nih.gov/pubmed/37505541
http://dx.doi.org/10.3390/toxics11070572