Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method
SARS-CoV-2 shows great evolutionary capacity through a high frequency of genomic variation during transmission. Evolved SARS-CoV-2 often demonstrates resistance to previous vaccines and can cause poor clinical status in patients. Mutations in the SARS-CoV-2 genome involve mutations in structural and...
Main authors: | Feiming Huang, Lei Chen, Wei Guo, Xianchao Zhou, Kaiyan Feng, Tao Huang, Yudong Cai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9225528/ https://www.ncbi.nlm.nih.gov/pubmed/35743837 http://dx.doi.org/10.3390/life12060806 |
_version_ | 1784733634653061120 |
---|---|
author | Huang, Feiming Chen, Lei Guo, Wei Zhou, Xianchao Feng, Kaiyan Huang, Tao Cai, Yudong |
author_facet | Huang, Feiming Chen, Lei Guo, Wei Zhou, Xianchao Feng, Kaiyan Huang, Tao Cai, Yudong |
author_sort | Huang, Feiming |
collection | PubMed |
description | SARS-CoV-2 shows great evolutionary capacity through a high frequency of genomic variation during transmission. Evolved SARS-CoV-2 often demonstrates resistance to previous vaccines and can cause poor clinical status in patients. Mutations in the SARS-CoV-2 genome involve mutations in structural and nonstructural proteins, and some of these proteins such as spike proteins have been shown to be directly associated with the clinical status of patients with severe COVID-19 pneumonia. In this study, we collected genome-wide mutation information of virulent strains and the severity of COVID-19 pneumonia in patients varying depending on their clinical status. Important protein mutations and untranslated region mutations were extracted using machine learning methods. First, through Boruta and four ranking algorithms (least absolute shrinkage and selection operator, light gradient boosting machine, max-relevance and min-redundancy, and Monte Carlo feature selection), mutations that were highly correlated with the clinical status of the patients were screened out and sorted in four feature lists. Some mutations such as D614G and V1176F were shown to be associated with viral infectivity. Moreover, previously unreported mutations such as A320V of nsp14 and I164ILV of nsp14 were also identified, which suggests their potential roles. We then applied the incremental feature selection method to each feature list to construct efficient classifiers, which can be directly used to distinguish the clinical status of COVID-19 patients. Meanwhile, four sets of quantitative rules were set up, which can help us to more intuitively understand the role of each mutation in differentiating the clinical status of COVID-19 patients. Identified key mutations linked to virologic properties will help better understand the mechanisms of infection and will aid in the development of antiviral treatments. |
format | Online Article Text |
id | pubmed-9225528 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9225528 2022-06-24 Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method Huang, Feiming Chen, Lei Guo, Wei Zhou, Xianchao Feng, Kaiyan Huang, Tao Cai, Yudong Life (Basel) Article SARS-CoV-2 shows great evolutionary capacity through a high frequency of genomic variation during transmission. Evolved SARS-CoV-2 often demonstrates resistance to previous vaccines and can cause poor clinical status in patients. Mutations in the SARS-CoV-2 genome involve mutations in structural and nonstructural proteins, and some of these proteins such as spike proteins have been shown to be directly associated with the clinical status of patients with severe COVID-19 pneumonia. In this study, we collected genome-wide mutation information of virulent strains and the severity of COVID-19 pneumonia in patients varying depending on their clinical status. Important protein mutations and untranslated region mutations were extracted using machine learning methods. First, through Boruta and four ranking algorithms (least absolute shrinkage and selection operator, light gradient boosting machine, max-relevance and min-redundancy, and Monte Carlo feature selection), mutations that were highly correlated with the clinical status of the patients were screened out and sorted in four feature lists. Some mutations such as D614G and V1176F were shown to be associated with viral infectivity. Moreover, previously unreported mutations such as A320V of nsp14 and I164ILV of nsp14 were also identified, which suggests their potential roles. We then applied the incremental feature selection method to each feature list to construct efficient classifiers, which can be directly used to distinguish the clinical status of COVID-19 patients. Meanwhile, four sets of quantitative rules were set up, which can help us to more intuitively understand the role of each mutation in differentiating the clinical status of COVID-19 patients. Identified key mutations linked to virologic properties will help better understand the mechanisms of infection and will aid in the development of antiviral treatments. MDPI 2022-05-28 /pmc/articles/PMC9225528/ /pubmed/35743837 http://dx.doi.org/10.3390/life12060806 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Huang, Feiming Chen, Lei Guo, Wei Zhou, Xianchao Feng, Kaiyan Huang, Tao Cai, Yudong Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title | Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title_full | Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title_fullStr | Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title_full_unstemmed | Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title_short | Identifying COVID-19 Severity-Related SARS-CoV-2 Mutation Using a Machine Learning Method |
title_sort | identifying covid-19 severity-related sars-cov-2 mutation using a machine learning method |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9225528/ https://www.ncbi.nlm.nih.gov/pubmed/35743837 http://dx.doi.org/10.3390/life12060806 |
work_keys_str_mv | AT huangfeiming identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT chenlei identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT guowei identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT zhouxianchao identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT fengkaiyan identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT huangtao identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod AT caiyudong identifyingcovid19severityrelatedsarscov2mutationusingamachinelearningmethod |
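
Illustrative note: the description above says the incremental feature selection method was applied to each ranked feature list to build classifiers. The sketch below shows the general idea only and is not the authors' implementation. The 1-nearest-neighbour classifier, the leave-one-out accuracy score, the function names, and the toy mutation matrix are all assumptions made for illustration; the record does not state which classifiers or metrics the study used.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]


def loo_1nn_accuracy(X: List[Row], y: List[int], features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Only the columns listed in `features` are used to measure distance.
    (Hypothetical stand-in for whatever classifier and metric a study uses.)
    """
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for j, xj in enumerate(X):
            if i == j:
                continue  # hold sample i out
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += int(best_label == y[i])
    return correct / len(X)


def incremental_feature_selection(
    X: List[Row],
    y: List[int],
    ranked: List[int],
    score: Callable[[List[Row], List[int], List[int]], float] = loo_1nn_accuracy,
) -> Tuple[List[int], float]:
    """Evaluate the top-1, top-2, ..., top-N ranked features and keep the best prefix.

    Ties are resolved in favour of the smaller feature subset.
    """
    best_k, best_score = 0, -1.0
    for k in range(1, len(ranked) + 1):
        s = score(X, y, ranked[:k])
        if s > best_score:
            best_k, best_score = k, s
    return ranked[:best_k], best_score


# Toy data (made up): rows are samples, columns are binary mutation indicators,
# labels are 1 = severe, 0 = mild. Column 0 separates the classes perfectly.
X_toy = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
]
y_toy = [1, 1, 1, 0, 0, 0]
ranked_toy = [0, 1, 2]  # assumed output of some ranking algorithm

selected, acc = incremental_feature_selection(X_toy, y_toy, ranked_toy)
print(selected, acc)
```

In practice one such run would be made per ranked list (four lists in the description above), and the scoring function would normally be a cross-validated metric from an established library rather than this hand-rolled one.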