REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities

Bibliographic Details
Main authors: Govinda, KC, Bocci, Giovanni, Verma, Srijan, Hassan, Mahmudulla, Holmes, Jayme, Yang, Jeremy J., Sirimulla, Suman, Oprea, Tudor I.
Format: Online Article Text
Language: English
Published: ChemRxiv 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7668752/
https://www.ncbi.nlm.nih.gov/pubmed/33200119
http://dx.doi.org/10.26434/chemrxiv.12915779
author Govinda, KC
Bocci, Giovanni
Verma, Srijan
Hassan, Mahmudulla
Holmes, Jayme
Yang, Jeremy J.
Sirimulla, Suman
Oprea, Tudor I.
collection PubMed
description Strategies for drug discovery and repositioning are urgently needed for COVID-19. We developed “REDIAL-2020”, a suite of machine learning models for estimating small-molecule activity from molecular structure, for a range of SARS-CoV-2-related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained on high-throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), using multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (http://drugcentral.org/Redial). Acceptable input formats are drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the molecules most similar to the query among those with experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through GitHub (https://github.com/sirimullalab/redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.
format Online
Article
Text
id pubmed-7668752
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher ChemRxiv
record_format MEDLINE/PubMed
spelling pubmed-7668752 2020-11-17 REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities Govinda, KC Bocci, Giovanni Verma, Srijan Hassan, Mahmudulla Holmes, Jayme Yang, Jeremy J. Sirimulla, Suman Oprea, Tudor I. ChemRxiv Article
ChemRxiv 2020-09-16 /pmc/articles/PMC7668752/ /pubmed/33200119 http://dx.doi.org/10.26434/chemrxiv.12915779 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7668752/
https://www.ncbi.nlm.nih.gov/pubmed/33200119
http://dx.doi.org/10.26434/chemrxiv.12915779
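
The description in this record mentions an ensemble consensus predictor built from models trained on three descriptor types (fingerprint, physicochemical, and pharmacophore), but it does not say how the consensus is formed. The following is a minimal sketch, assuming a simple majority vote over binary active/inactive labels; the function name `consensus_vote`, the label encoding, and the tie-break rule are hypothetical and are not taken from the REDIAL-2020 source code.

```python
from collections import Counter
from typing import Dict


def consensus_vote(predictions: Dict[str, int]) -> int:
    """Combine per-descriptor binary labels (1 = active, 0 = inactive).

    Assumption: plain majority voting. The record does not specify
    the consensus rule actually used by REDIAL-2020.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions.values())
    # Ties fall back to inactive (0); this tie-break is an assumed choice.
    return 1 if counts[1] > counts[0] else 0


# Hypothetical labels from the three descriptor-based classifiers.
example = {"fingerprint": 1, "physicochemical": 0, "pharmacophore": 1}
print(consensus_vote(example))  # prints 1
```

With an odd number of voters, as with the three descriptor types here, ties cannot occur; the tie-break only matters if a model is missing for a given assay.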