
Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations

The efforts of the scientific community to tame the recent pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seem to have been diluted by the emergence of new viral strains. Therefore, it is imperative to understand the effect of mutations on viral evolution. We perform...


Bibliographic Details
Main authors: Periwal, Neha, Rathod, Shravan B., Sarma, Sankritya, Johar, Gundeep S., Jain, Avantika, Barnwal, Ravi P., Srivastava, Kinsukh R., Kaur, Baljeet, Arora, Pooja, Sood, Vikas
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9603882/
https://www.ncbi.nlm.nih.gov/pubmed/36069583
http://dx.doi.org/10.1128/spectrum.01219-22
collection PubMed
description The efforts of the scientific community to tame the recent pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seem to have been diluted by the emergence of new viral strains. Therefore, it is imperative to understand the effect of mutations on viral evolution. We performed a time series analysis on 59,541 SARS-CoV-2 genomic sequences from around the world to gain insights into the kinetics of the mutations arising in the viral genomes. These 59,541 genomes were grouped according to month (January 2020 to March 2021) based on the collection date. Meta-analysis of these data led us to identify significant mutations in viral genomes. Pearson correlation of these mutations led us to the identification of 16 comutations. Among these comutations, some of the individual mutations have been shown to contribute to viral replication and fitness, suggesting a possible role of other unexplored mutations in viral evolution. We observed that the mutations 241C>T in the 5′ untranslated region (UTR), 3037C>T in nsp3, 14408C>T in the RNA-dependent RNA polymerase (RdRp), and 23403A>G in spike are correlated with each other and were grouped in a single cluster by hierarchical clustering. These mutations have replaced the wild-type nucleotides in SARS-CoV-2 sequences. Additionally, we employed a suite of computational tools to investigate the effects of T85I (1059C>T), P323L (14408C>T), and Q57H (25563G>T) mutations in nsp2, RdRp, and the ORF3a protein of SARS-CoV-2, respectively. We observed that the mutations T85I and Q57H tend to be deleterious and destabilize the respective wild-type protein, whereas P323L in RdRp tends to be neutral and has a stabilizing effect.
IMPORTANCE We performed a meta-analysis on SARS-CoV-2 genomes categorized by collection month and identified several significant mutations. Pearson correlation analysis of these significant mutations identified 16 comutations having absolute correlation coefficients of >0.4 and a frequency of >30% in the genomes used in this study. The correlation results were further validated by another statistical tool called hierarchical clustering, where mutations were grouped in clusters on the basis of their similarity. We identified several positive and negative correlations among comutations in SARS-CoV-2 isolates from around the world which might contribute to viral pathogenesis. The negative correlations among some of the mutations in SARS-CoV-2 identified in this study warrant further investigations. Further analysis of mutations such as T85I in nsp2 and Q57H in ORF3a protein revealed that these mutations tend to destabilize the protein relative to the wild type, whereas P323L in RdRp is neutral and has a stabilizing effect. Thus, we have identified several comutations which can be further characterized to gain insights into SARS-CoV-2 evolution.
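The selection rule stated in the abstract (absolute Pearson correlation >0.4 and frequency >30%) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the way overall frequency is supplied, and all numbers below are assumptions made for illustration, and `mut_X` and `mut_rare` are hypothetical mutations, not results from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # A constant series has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / (sx * sy)


def comutation_pairs(monthly_freq, overall_freq, r_min=0.4, f_min=0.30):
    """Return (mut_a, mut_b, r) for pairs passing both thresholds.

    monthly_freq: mutation -> list of per-month fractions of genomes
    overall_freq: mutation -> fraction of all genomes carrying it
    """
    # Keep only mutations above the prevalence threshold.
    kept = [m for m in monthly_freq if overall_freq[m] > f_min]
    pairs = []
    for a, b in combinations(kept, 2):
        r = pearson(monthly_freq[a], monthly_freq[b])
        if abs(r) > r_min:  # positive or negative correlation
            pairs.append((a, b, round(r, 3)))
    return pairs


# Invented toy data (NOT from the study): fraction of genomes per month.
monthly_freq = {
    "241C>T":   [0.1, 0.3, 0.6, 0.8, 0.9],
    "23403A>G": [0.1, 0.3, 0.6, 0.8, 0.9],
    "mut_X":    [0.9, 0.7, 0.4, 0.2, 0.1],  # hypothetical, declines over time
    "mut_rare": [0.0, 0.1, 0.0, 0.1, 0.0],  # hypothetical, low prevalence
}
overall_freq = {
    "241C>T": 0.54, "23403A>G": 0.54, "mut_X": 0.46, "mut_rare": 0.04,
}

pairs = comutation_pairs(monthly_freq, overall_freq)
for a, b, r in pairs:
    print(a, b, r)
```

In this toy input, the two mutations that rise together form a positively correlated pair, the declining hypothetical mutation pairs negatively with each of them, and the low-prevalence one is dropped by the frequency filter before any correlation is computed.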
id pubmed-9603882
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9603882 2022-10-27 Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations Periwal, Neha Rathod, Shravan B. Sarma, Sankritya Johar, Gundeep S. Jain, Avantika Barnwal, Ravi P. Srivastava, Kinsukh R. Kaur, Baljeet Arora, Pooja Sood, Vikas Microbiol Spectr Research Article American Society for Microbiology 2022-09-07 /pmc/articles/PMC9603882/ /pubmed/36069583 http://dx.doi.org/10.1128/spectrum.01219-22 Text en Copyright © 2022 Periwal et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
topic Research Article