Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification
High-throughput screening (HTS) methods enable the empirical evaluation of large numbers of compounds and can be augmented by virtual screening (VS) techniques to save time and money by prioritizing potentially active compounds for experimental testing. Structure-based and ligand-based virtual screening appr...
Main Authors: | Bian, Yuemin, Kwon, Jason J., Liu, Cong, Margiotta, Enrico, Shekhar, Mrinal, Gould, Alexandra E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Molecular Biosciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10040869/ https://www.ncbi.nlm.nih.gov/pubmed/36994428 http://dx.doi.org/10.3389/fmolb.2023.1163536 |
_version_ | 1784912580135878656 |
---|---|
author | Bian, Yuemin Kwon, Jason J. Liu, Cong Margiotta, Enrico Shekhar, Mrinal Gould, Alexandra E. |
author_facet | Bian, Yuemin Kwon, Jason J. Liu, Cong Margiotta, Enrico Shekhar, Mrinal Gould, Alexandra E. |
author_sort | Bian, Yuemin |
collection | PubMed |
description | High-throughput screening (HTS) methods enable the empirical evaluation of large numbers of compounds and can be augmented by virtual screening (VS) techniques to save time and money by prioritizing potentially active compounds for experimental testing. Structure-based and ligand-based virtual screening approaches have been extensively studied and applied in drug discovery practice with proven outcomes in advancing candidate molecules. However, the experimental data required for VS are expensive, and effective and efficient hit identification is particularly challenging during early-stage drug discovery for novel protein targets. Herein, we present our TArget-driven Machine learning-Enabled VS (TAME-VS) platform, which leverages existing chemical databases of bioactive molecules to modularly facilitate hit finding. Our methodology enables bespoke hit identification campaigns through a user-defined protein target. The input target ID is used to perform a homology-based target expansion, followed by compound retrieval from a large compilation of molecules with experimentally validated activity. Compounds are subsequently vectorized and used for machine learning (ML) model training. These machine learning models are deployed to perform model-based inferential virtual screening, and compounds are nominated based on predicted activity. Our platform was retrospectively validated across ten diverse protein targets and demonstrated clear predictive power. The implemented methodology provides a flexible and efficient approach that is accessible to a wide range of users. The TAME-VS platform is publicly available at https://github.com/bymgood/Target-driven-ML-enabled-VS to facilitate early-stage hit identification. |
format | Online Article Text |
id | pubmed-10040869 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-100408692023-03-28 Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification Bian, Yuemin Kwon, Jason J. Liu, Cong Margiotta, Enrico Shekhar, Mrinal Gould, Alexandra E. Front Mol Biosci Molecular Biosciences High-throughput screening (HTS) methods enable the empirical evaluation of a large scale of compounds and can be augmented by virtual screening (VS) techniques to save time and money by using potential active compounds for experimental testing. Structure-based and ligand-based virtual screening approaches have been extensively studied and applied in drug discovery practice with proven outcomes in advancing candidate molecules. However, the experimental data required for VS are expensive, and hit identification in an effective and efficient manner is particularly challenging during early-stage drug discovery for novel protein targets. Herein, we present our TArget-driven Machine learning-Enabled VS (TAME-VS) platform, which leverages existing chemical databases of bioactive molecules to modularly facilitate hit finding. Our methodology enables bespoke hit identification campaigns through a user-defined protein target. The input target ID is used to perform a homology-based target expansion, followed by compound retrieval from a large compilation of molecules with experimentally validated activity. Compounds are subsequently vectorized and adopted for machine learning (ML) model training. These machine learning models are deployed to perform model-based inferential virtual screening, and compounds are nominated based on predicted activity. Our platform was retrospectively validated across ten diverse protein targets and demonstrated clear predictive power. The implemented methodology provides a flexible and efficient approach that is accessible to a wide range of users. 
The TAME-VS platform is publicly available at https://github.com/bymgood/Target-driven-ML-enabled-VS to facilitate early-stage hit identification. Frontiers Media S.A. 2023-03-13 /pmc/articles/PMC10040869/ /pubmed/36994428 http://dx.doi.org/10.3389/fmolb.2023.1163536 Text en Copyright © 2023 Bian, Kwon, Liu, Margiotta, Shekhar and Gould. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Molecular Biosciences Bian, Yuemin Kwon, Jason J. Liu, Cong Margiotta, Enrico Shekhar, Mrinal Gould, Alexandra E. Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title | Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title_full | Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title_fullStr | Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title_full_unstemmed | Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title_short | Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification |
title_sort | target-driven machine learning-enabled virtual screening (tame-vs) platform for early-stage hit identification |
topic | Molecular Biosciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10040869/ https://www.ncbi.nlm.nih.gov/pubmed/36994428 http://dx.doi.org/10.3389/fmolb.2023.1163536 |
work_keys_str_mv | AT bianyuemin targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification AT kwonjasonj targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification AT liucong targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification AT margiottaenrico targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification AT shekharmrinal targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification AT gouldalexandrae targetdrivenmachinelearningenabledvirtualscreeningtamevsplatformforearlystagehitidentification |
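
The workflow summarized in the description field (target expansion, compound retrieval, vectorization, model training, model-based screening) can be illustrated with a small toy sketch. This is not the TAME-VS code and does not use its API: every function name, target ID, compound and activity label below is hypothetical and made up for illustration. Homology search is reduced to a lookup table, fingerprints are replaced by hashed character n-grams of SMILES strings, and the ML model is replaced by a similarity-weighted nearest-neighbour vote, since the record does not say which fingerprints or models the platform uses.

```python
from zlib import crc32

# Hypothetical toy data: these target IDs, compounds and labels are invented.
HOMOLOGS = {"TARGET_A": ["TARGET_A", "TARGET_B"]}
ACTIVITY_DB = {
    "TARGET_A": [("CCOc1ccccc1", 1), ("CCCCCC", 0)],
    "TARGET_B": [("CCOc1ccccc1C", 1), ("CCCCCCCC", 0)],
}


def expand_targets(target_id):
    """Stand-in for homology-based target expansion: a plain table lookup."""
    return HOMOLOGS.get(target_id, [target_id])


def retrieve_compounds(target_ids):
    """Collect (SMILES, label) pairs recorded against any of the targets."""
    return [pair for t in target_ids for pair in ACTIVITY_DB.get(t, [])]


def vectorize(smiles, n_bits=256, n=3):
    """Hash character n-grams of a SMILES string into a set of bit indices."""
    grams = {smiles[i:i + n] for i in range(max(1, len(smiles) - n + 1))}
    return frozenset(crc32(g.encode()) % n_bits for g in grams)


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def train(labelled):
    """'Training' here just stores the vectorized compounds with their labels."""
    return [(vectorize(s), y) for s, y in labelled]


def predict(model, smiles, k=3):
    """Similarity-weighted vote of the k most similar training compounds."""
    fp = vectorize(smiles)
    nearest = sorted(((tanimoto(fp, v), y) for v, y in model), reverse=True)[:k]
    total = sum(sim for sim, _ in nearest)
    return sum(sim * y for sim, y in nearest) / total if total else 0.0


def screen(target_id, library, top_n=2):
    """Run the whole toy pipeline and return the top-ranked library compounds."""
    model = train(retrieve_compounds(expand_targets(target_id)))
    return sorted(library, key=lambda s: predict(model, s), reverse=True)[:top_n]
```

Calling `screen("TARGET_A", ["CCCCCCC", "CCOc1ccccc1CC"], top_n=1)` ranks the library compound that most resembles the toy actives first. In a real campaign each stand-in would be swapped for the corresponding component of the published platform at the GitHub URL given in the record.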