Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification

High-throughput screening (HTS) methods enable the empirical evaluation of a large scale of compounds and can be augmented by virtual screening (VS) techniques to save time and money by using potential active compounds for experimental testing. Structure-based and ligand-based virtual screening appr...

Bibliographic Details
Main authors: Bian, Yuemin; Kwon, Jason J.; Liu, Cong; Margiotta, Enrico; Shekhar, Mrinal; Gould, Alexandra E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Molecular Biosciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10040869/
https://www.ncbi.nlm.nih.gov/pubmed/36994428
http://dx.doi.org/10.3389/fmolb.2023.1163536
author Bian, Yuemin
Kwon, Jason J.
Liu, Cong
Margiotta, Enrico
Shekhar, Mrinal
Gould, Alexandra E.
collection PubMed
description High-throughput screening (HTS) methods enable the empirical evaluation of a large scale of compounds and can be augmented by virtual screening (VS) techniques to save time and money by using potential active compounds for experimental testing. Structure-based and ligand-based virtual screening approaches have been extensively studied and applied in drug discovery practice with proven outcomes in advancing candidate molecules. However, the experimental data required for VS are expensive, and hit identification in an effective and efficient manner is particularly challenging during early-stage drug discovery for novel protein targets. Herein, we present our TArget-driven Machine learning-Enabled VS (TAME-VS) platform, which leverages existing chemical databases of bioactive molecules to modularly facilitate hit finding. Our methodology enables bespoke hit identification campaigns through a user-defined protein target. The input target ID is used to perform a homology-based target expansion, followed by compound retrieval from a large compilation of molecules with experimentally validated activity. Compounds are subsequently vectorized and adopted for machine learning (ML) model training. These machine learning models are deployed to perform model-based inferential virtual screening, and compounds are nominated based on predicted activity. Our platform was retrospectively validated across ten diverse protein targets and demonstrated clear predictive power. The implemented methodology provides a flexible and efficient approach that is accessible to a wide range of users. The TAME-VS platform is publicly available at https://github.com/bymgood/Target-driven-ML-enabled-VS to facilitate early-stage hit identification.
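The vectorize, train, and screen steps described in the abstract can be illustrated with a minimal sketch. It is not the TAME-VS implementation: character n-grams of SMILES strings stand in for chemical fingerprints, a nearest-neighbour vote on Tanimoto similarity stands in for the trained ML models, and the function names (`featurize`, `fit`, `predict`, `screen`) as well as the toy compounds and their activity labels are hypothetical placeholders, not real bioactivity data. Target expansion and compound retrieval are not covered.

```python
# Toy sketch of vectorize -> train -> screen. All names and data here are
# illustrative assumptions, not taken from the TAME-VS code base.

def featurize(smiles: str, n: int = 2) -> frozenset:
    """Vectorize a SMILES string as its set of overlapping character n-grams."""
    return frozenset(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fit(training_set):
    """'Training' a k-NN model just stores featurized compounds with labels."""
    return [(featurize(smiles), label) for smiles, label in training_set]


def predict(model, smiles: str, k: int = 1) -> float:
    """Predicted activity = mean label of the k most similar training compounds."""
    if not model or k < 1:
        raise ValueError("need a non-empty model and k >= 1")
    query = featurize(smiles)
    neighbours = sorted(
        model, key=lambda item: tanimoto(query, item[0]), reverse=True
    )[:k]
    return sum(label for _, label in neighbours) / len(neighbours)


def screen(model, library, k: int = 1):
    """Score every library compound and rank by predicted activity, best first."""
    scored = [(smiles, predict(model, smiles, k)) for smiles in library]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up labels: 1 = "active", 0 = "inactive". Placeholder data only.
TOY_TRAINING = [
    ("c1ccccc1O", 1),
    ("c1ccccc1C(=O)O", 1),
    ("CCCCCC", 0),
    ("CCCCO", 0),
]
TOY_LIBRARY = ["CCCCCCC", "c1ccccc1N", "c1ccccc1O"]

model = fit(TOY_TRAINING)
ranked = screen(model, TOY_LIBRARY, k=1)
```

In this toy run the two aromatic library compounds are nominated ahead of the aliphatic one, because their nearest training neighbour carries the placeholder "active" label. A realistic setup would replace `featurize` with a cheminformatics fingerprint and `fit`/`predict` with a trained classifier.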
id pubmed-10040869
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-10040869 2023-03-28 Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2023-03-13 Text en Copyright © 2023 Bian, Kwon, Liu, Margiotta, Shekhar and Gould. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Target-driven machine learning-enabled virtual screening (TAME-VS) platform for early-stage hit identification
topic Molecular Biosciences