
Applying Machine Learning to Ultrafast Shape Recognition in Ligand-Based Virtual Screening

Ultrafast Shape Recognition (USR) and its derivatives are Ligand-Based Virtual Screening (LBVS) methods that condense 3-dimensional information about molecular shape, as well as other properties, into a small set of numeric descriptors. These can be used to efficiently compute a measure of...
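As a rough illustration of how a shape descriptor and a descriptor-based similarity of this kind can work, the sketch below builds 12 numbers from an atom cloud (three moments of the atomic distance distribution measured from four reference points) and compares two descriptors with an inverse Manhattan distance. The function names `usr_descriptor` and `usr_similarity`, the use of standard deviation and a cube-rooted third moment, and the toy coordinates are assumptions made for this example, not details taken from the record; published implementations differ in the exact moments they use.

```python
import math


def _moments(dists):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(dists)
    mean = sum(dists) / n
    var = sum((d - mean) ** 2 for d in dists) / n
    third = sum((d - mean) ** 3 for d in dists) / n
    # Assumption: cube-rooting keeps all three values in distance units.
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor(coords):
    """Return a 12-number shape descriptor for a list of (x, y, z) atom positions."""
    n = len(coords)
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))  # centroid
    d_ctd = [math.dist(c, ctd) for c in coords]
    cst = coords[d_ctd.index(min(d_ctd))]  # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]  # atom farthest from the centroid
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]  # atom farthest from fct
    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor += _moments([math.dist(c, ref) for c in coords])
    return descriptor


def usr_similarity(a, b):
    """Inverse Manhattan similarity in (0, 1]; 1.0 means identical descriptors."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)


if __name__ == "__main__":
    # Hypothetical toy "molecules": five points each, not real structures.
    mol_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
    mol_b = [(float(i), 0.0, 0.0) for i in range(5)]
    print(usr_similarity(usr_descriptor(mol_a), usr_descriptor(mol_b)))
```

Because the descriptor is built only from interatomic distances, it does not change when the molecule is translated or rotated, which is why no alignment step is needed before comparing two molecules.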


Bibliographic Details
Main Authors: Bonanno, Etienne; Ebejer, Jean-Paul
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7042174/
https://www.ncbi.nlm.nih.gov/pubmed/32140104
http://dx.doi.org/10.3389/fphar.2019.01675
author Bonanno, Etienne
Ebejer, Jean-Paul
collection PubMed
description Ultrafast Shape Recognition (USR) and its derivatives are Ligand-Based Virtual Screening (LBVS) methods that condense 3-dimensional information about molecular shape, as well as other properties, into a small set of numeric descriptors. These descriptors can be used to efficiently compute a measure of similarity between pairs of molecules using a simple inverse Manhattan Distance metric. In this study we explore Machine Learning techniques that can be trained on USR descriptors to improve the detection of potential new leads. We use molecules from the Directory of Useful Decoys-Enhanced (DUD-E) to construct machine learning models based on three different algorithms: Gaussian Mixture Models (GMMs), Isolation Forests and Artificial Neural Networks (ANNs). We train models on full sets of molecule conformers, as well as on the Lowest Energy Conformations (LECs) only. We also investigate the performance of our models when trained on smaller datasets, to model virtual screening scenarios in which only a small number of actives are known a priori. Our results indicate significant performance gains over a state-of-the-art USR-derived method, ElectroShape 5D, with GMMs obtaining a mean performance up to 430% better than that of ElectroShape 5D in terms of Enrichment Factor, with a maximum improvement of up to 940%. Additionally, we demonstrate that our models maintain their performance, in terms of Enrichment Factor, within 10% of the mean as the size of the training dataset is successively reduced. Furthermore, we demonstrate that running times for retrospective screening using the machine learning models we selected are faster than standard USR, on average by a factor of 10, including the time required for training. Our results show that machine learning techniques can significantly improve the virtual screening performance and efficiency of the USR family of methods.
format Online
Article
Text
id pubmed-7042174
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7042174 2020-03-05 Applying Machine Learning to Ultrafast Shape Recognition in Ligand-Based Virtual Screening Bonanno, Etienne Ebejer, Jean-Paul Front Pharmacol Pharmacology Frontiers Media S.A. 2020-02-19 /pmc/articles/PMC7042174/ /pubmed/32140104 http://dx.doi.org/10.3389/fphar.2019.01675 Text en Copyright © 2020 Bonanno and Ebejer http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Applying Machine Learning to Ultrafast Shape Recognition in Ligand-Based Virtual Screening
topic Pharmacology
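The description above reports its results as Enrichment Factor. As a minimal sketch of how that metric is commonly computed in retrospective virtual screening, the function below compares the hit rate in the top-ranked fraction of a screened library with the hit rate of the whole library. The name `enrichment_factor`, the default 1% cutoff and the rounding of the cutoff size are assumptions for illustration; the record does not state which cutoffs or conventions the authors used.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Hit rate in the top `fraction` of the ranked list divided by the overall hit rate.

    scores    -- one score per molecule, higher meaning "more likely active"
    is_active -- matching list of booleans (True for known actives)
    fraction  -- size of the top slice, e.g. 0.01 for EF at 1% (assumed default)
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Assumption: the slice size is rounded and never smaller than one molecule.
    n_top = max(1, int(round(n_total * fraction)))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return (hits / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Hypothetical ranking of ten molecules, two of them active.
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    labels = [True, True] + [False] * 8
    print(enrichment_factor(scores, labels, fraction=0.2))
```

A value of 1 means the ranking does no better than random selection; a method that places every active in the top slice reaches the maximum possible value for that cutoff.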