Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules

Bibliographic Details
Main authors: Harigua-Souiai, Emna; Heinhane, Mohamed Mahmoud; Abdelkrim, Yosser Zina; Souiai, Oussama; Abdeljaoued-Tej, Ines; Guizani, Ikram
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667578/
https://www.ncbi.nlm.nih.gov/pubmed/34912370
http://dx.doi.org/10.3389/fgene.2021.744170
author Harigua-Souiai, Emna
Heinhane, Mohamed Mahmoud
Abdelkrim, Yosser Zina
Souiai, Oussama
Abdeljaoued-Tej, Ines
Guizani, Ikram
collection PubMed
description Drug discovery and repurposing against COVID-19 is a highly relevant topic, with substantial effort dedicated to delivering novel therapeutics targeting SARS-CoV-2. In this context, computer-aided drug discovery is of interest for guiding early high-throughput screening and for improving the hit identification rate. We herein propose a pipeline for Ligand-Based Drug Discovery (LBDD) against SARS-CoV-2. Through an extensive literature search and multiple filtering steps, we integrated information on 2,610 molecules with a validated effect against SARS-CoV and/or SARS-CoV-2. The chemical structures of these molecules were encoded through multiple systems so that they can serve directly as input to conventional machine learning (ML) algorithms or deep learning (DL) architectures. We assessed the performance of seven ML algorithms and four DL algorithms in classifying molecules into two classes: active and inactive. The Random Forests (RF), Graph Convolutional Network (GCN), and Directed Acyclic Graph (DAG) models achieved the best performance. These models were further optimized through hyperparameter tuning and achieved cross-validated ROC-AUC scores of 85%, 83%, and 79% for RF, GCN, and DAG, respectively. External validation on the FDA-approved drugs collection indicated that the DL algorithms have greater potential for drug repurposing against SARS-CoV-2 based on the dataset presented here. Namely, GCN and DAG achieved a true positive rate above 50% on the confirmed hits of a PubChem bioassay.
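The two metrics quoted in the description, ROC-AUC and true positive rate, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `true_positive_rate`, the 0.5 decision threshold, and the labels and scores below are all assumptions made up for illustration, and no model is trained here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random active molecule is scored
    higher than a random inactive one (ties count as half a win)."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive molecule")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def true_positive_rate(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> float:
    """Fraction of truly active molecules predicted active (score >= threshold)."""
    active_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not active_scores:
        raise ValueError("need at least one active molecule")
    hits = sum(1 for s in active_scores if s >= threshold)
    return hits / len(active_scores)


# Hypothetical data: 1 = active, 0 = inactive; scores are made-up
# classifier outputs, not results from the article.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(roc_auc(labels, scores))             # 0.9375
print(true_positive_rate(labels, scores))  # 0.75
```

In this toy example, 15 of the 16 active/inactive pairs are ranked correctly, giving a ROC-AUC of 0.9375, and 3 of the 4 actives score at or above the assumed threshold, giving a true positive rate of 0.75. In a cross-validated setting, the ROC-AUC would be computed on each held-out fold and then averaged.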
format Online
Article
Text
id pubmed-8667578
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8667578 2021-12-14 Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules Harigua-Souiai, Emna Heinhane, Mohamed Mahmoud Abdelkrim, Yosser Zina Souiai, Oussama Abdeljaoued-Tej, Ines Guizani, Ikram Front Genet Genetics Frontiers Media S.A.
2021-11-29 /pmc/articles/PMC8667578/ /pubmed/34912370 http://dx.doi.org/10.3389/fgene.2021.744170 Text en Copyright © 2021 Harigua-Souiai, Heinhane, Abdelkrim, Souiai, Abdeljaoued-Tej and Guizani. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667578/
https://www.ncbi.nlm.nih.gov/pubmed/34912370
http://dx.doi.org/10.3389/fgene.2021.744170