
Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks


Bibliographic Details
Main authors: Saldivar-Espinoza, Bryan; Macip, Guillem; Garcia-Segura, Pol; Mestres-Truyol, Júlia; Puigbò, Pere; Cereto-Massagué, Adrià; Pujadas, Gerard; Garcia-Vallve, Santiago
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9736107/
https://www.ncbi.nlm.nih.gov/pubmed/36499005
http://dx.doi.org/10.3390/ijms232314683
author Saldivar-Espinoza, Bryan
Macip, Guillem
Garcia-Segura, Pol
Mestres-Truyol, Júlia
Puigbò, Pere
Cereto-Massagué, Adrià
Pujadas, Gerard
Garcia-Vallve, Santiago
author_sort Saldivar-Espinoza, Bryan
collection PubMed
description Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity value of 0.79, and an Area Under the Curve (AUC) of 0.8, showing that the prediction of recurrent SARS-CoV-2 mutations is feasible. Subsequently, we compared our predictions with updated data from January 2022, showing that some of the false positives in our prediction model become true positives later on. The most important variables detected by the model’s Shapley Additive exPlanation (SHAP) are the nucleotide that mutates and RNA reactivity. This is consistent with the SARS-CoV-2 mutational bias pattern and the preference of some host deaminases for specific sequences and RNA secondary structures. We extend our investigation by analyzing the mutations from the variants of concern Alpha, Beta, Delta, Gamma, and Omicron. Finally, we analyzed amino acid changes by looking at the predicted recurrent mutations in the M-pro and spike proteins.
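The description above reports specificity, sensitivity, and AUC for a binary classifier (recurrent vs. non-recurrent position). As a minimal sketch of how such metrics are conventionally computed, the Python below uses a small set of hypothetical labels and scores; the function names, the 0.5 decision threshold, and the toy data are assumptions for illustration only and are not taken from the article, so the printed values are unrelated to the reported 0.69, 0.79, and 0.8.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive).

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise O(n^2) formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = position holds a recurrent mutation.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are usually reported together.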
format Online
Article
Text
id pubmed-9736107
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9736107 2022-12-11
Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks
Saldivar-Espinoza, Bryan; Macip, Guillem; Garcia-Segura, Pol; Mestres-Truyol, Júlia; Puigbò, Pere; Cereto-Massagué, Adrià; Pujadas, Gerard; Garcia-Vallve, Santiago
Int J Mol Sci
Article
Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity value of 0.79, and an Area Under the Curve (AUC) of 0.8, showing that the prediction of recurrent SARS-CoV-2 mutations is feasible. Subsequently, we compared our predictions with updated data from January 2022, showing that some of the false positives in our prediction model become true positives later on. The most important variables detected by the model’s Shapley Additive exPlanation (SHAP) are the nucleotide that mutates and RNA reactivity. This is consistent with the SARS-CoV-2 mutational bias pattern and the preference of some host deaminases for specific sequences and RNA secondary structures. We extend our investigation by analyzing the mutations from the variants of concern Alpha, Beta, Delta, Gamma, and Omicron. Finally, we analyzed amino acid changes by looking at the predicted recurrent mutations in the M-pro and spike proteins.
MDPI 2022-11-24
/pmc/articles/PMC9736107/ /pubmed/36499005 http://dx.doi.org/10.3390/ijms232314683
Text en
© 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks
topic Article