In Silico Identification of Anti-SARS-CoV-2 Medicinal Plants Using Cheminformatics and Machine Learning

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of COVID-19, is spreading rapidly and has caused hundreds of millions of infections and millions of deaths worldwide. Due to the lack of specific vaccines and effective treatments for COVID-19, there is an urgent ne...

Bibliographic Details
Main authors: Liang, Jihao; Zheng, Yang; Tong, Xin; Yang, Naixue; Dai, Shaoxing
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9821958/
https://www.ncbi.nlm.nih.gov/pubmed/36615401
http://dx.doi.org/10.3390/molecules28010208
author Liang, Jihao
Zheng, Yang
Tong, Xin
Yang, Naixue
Dai, Shaoxing
collection PubMed
description Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of COVID-19, is spreading rapidly and has caused hundreds of millions of infections and millions of deaths worldwide. Due to the lack of specific vaccines and effective treatments for COVID-19, there is an urgent need to identify effective drugs. Traditional Chinese medicine (TCM) is a valuable resource for identifying novel anti-SARS-CoV-2 drugs based on the important contribution of TCM and its potential benefits in COVID-19 treatment. Herein, we aimed to discover novel anti-SARS-CoV-2 compounds and medicinal plants from TCM by establishing a prediction method of anti-SARS-CoV-2 activity using machine learning methods. We first constructed a benchmark dataset from anti-SARS-CoV-2 bioactivity data collected from the ChEMBL database. Then, we established random forest (RF) and support vector machine (SVM) models that both achieved satisfactory predictive performance with AUC values of 0.90. By using this method, a total of 1011 active anti-SARS-CoV-2 compounds were predicted from the TCMSP database. Among these compounds, six compounds with highly potent activity were confirmed in the anti-SARS-CoV-2 experiments. The molecular fingerprint similarity analysis revealed that only 24 of the 1011 compounds have high similarity to the FDA-approved antiviral drugs, indicating that most of the compounds were structurally novel. Based on the predicted anti-SARS-CoV-2 compounds, we identified 74 anti-SARS-CoV-2 medicinal plants through enrichment analysis. The 74 plants are widely distributed in 68 genera and 43 families, 14 of which belong to antipyretic detoxicate plants. In summary, this study provided several medicinal plants with potential anti-SARS-CoV-2 activity, which offer an attractive starting point and a broader scope to mine for potentially novel anti-SARS-CoV-2 drugs.
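The description above mentions two computations without giving their details: a molecular fingerprint similarity analysis and an enrichment analysis over plants. The following is a minimal Python sketch of how such steps are commonly carried out, not the authors' implementation. It assumes that fingerprints are represented as sets of on-bit indices and compared with the Tanimoto coefficient, and that enrichment is scored with a one-sided hypergeometric test. Neither choice is stated in this record, and the function names and toy inputs are hypothetical.

```python
from math import comb


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints.

    Each fingerprint is given as the set of its on-bit indices
    (an assumed representation, not taken from the record).
    """
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def hypergeom_enrichment_p(total: int, total_active: int,
                           plant_size: int, plant_active: int) -> float:
    """One-sided hypergeometric p-value P(X >= plant_active).

    total        : compounds in the whole library
    total_active : compounds predicted active in the library
    plant_size   : compounds belonging to the plant being tested
    plant_active : predicted-active compounds in that plant
    """
    upper = min(plant_size, total_active)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(total_active, k) * comb(total - total_active, plant_size - k)
        for k in range(plant_active, upper + 1)
    )
    return tail / comb(total, plant_size)


if __name__ == "__main__":
    # Toy inputs for illustration only; they are not data from the study.
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(hypergeom_enrichment_p(total=10, total_active=5,
                                 plant_size=2, plant_active=2))
```

With the toy inputs, the two fingerprints share two of four distinct bits, so the similarity is 0.5, and the enrichment p-value is 10/45, about 0.22. In practice a plant would be called enriched only if its p-value fell below a threshold after multiple-testing correction; that threshold is not given in this record.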
format Online
Article
Text
id pubmed-9821958
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9821958 2023-01-07 In Silico Identification of Anti-SARS-CoV-2 Medicinal Plants Using Cheminformatics and Machine Learning Liang, Jihao Zheng, Yang Tong, Xin Yang, Naixue Dai, Shaoxing Molecules Article MDPI 2022-12-26 /pmc/articles/PMC9821958/ /pubmed/36615401 http://dx.doi.org/10.3390/molecules28010208 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title In Silico Identification of Anti-SARS-CoV-2 Medicinal Plants Using Cheminformatics and Machine Learning
topic Article