Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations
The efforts of the scientific community to tame the recent pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seem to have been diluted by the emergence of new viral strains. Therefore, it is imperative to understand the effect of mutations on viral evolution. We perform...
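Main authors: | Periwal, Neha; Rathod, Shravan B.; Sarma, Sankritya; Johar, Gundeep S.; Jain, Avantika; Barnwal, Ravi P.; Srivastava, Kinsukh R.; Kaur, Baljeet; Arora, Pooja; Sood, Vikas |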
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9603882/ https://www.ncbi.nlm.nih.gov/pubmed/36069583 http://dx.doi.org/10.1128/spectrum.01219-22 |
_version_ | 1784817668094689280 |
---|---|
author | Periwal, Neha Rathod, Shravan B. Sarma, Sankritya Johar, Gundeep S. Jain, Avantika Barnwal, Ravi P. Srivastava, Kinsukh R. Kaur, Baljeet Arora, Pooja Sood, Vikas |
author_facet | Periwal, Neha Rathod, Shravan B. Sarma, Sankritya Johar, Gundeep S. Jain, Avantika Barnwal, Ravi P. Srivastava, Kinsukh R. Kaur, Baljeet Arora, Pooja Sood, Vikas |
author_sort | Periwal, Neha |
collection | PubMed |
description | The efforts of the scientific community to tame the recent pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seem to have been diluted by the emergence of new viral strains. Therefore, it is imperative to understand the effect of mutations on viral evolution. We performed a time series analysis on 59,541 SARS-CoV-2 genomic sequences from around the world to gain insights into the kinetics of the mutations arising in the viral genomes. These 59,541 genomes were grouped according to month (January 2020 to March 2021) based on the collection date. Meta-analysis of these data led us to identify significant mutations in viral genomes. Pearson correlation of these mutations led us to the identification of 16 comutations. Among these comutations, some of the individual mutations have been shown to contribute to viral replication and fitness, suggesting a possible role of other unexplored mutations in viral evolution. We observed that the mutations 241C>T in the 5′ untranslated region (UTR), 3037C>T in nsp3, 14408C>T in the RNA-dependent RNA polymerase (RdRp), and 23403A>G in spike are correlated with each other and were grouped in a single cluster by hierarchical clustering. These mutations have replaced the wild-type nucleotides in SARS-CoV-2 sequences. Additionally, we employed a suite of computational tools to investigate the effects of T85I (1059C>T), P323L (14408C>T), and Q57H (25563G>T) mutations in nsp2, RdRp, and the ORF3a protein of SARS-CoV-2, respectively. We observed that the mutations T85I and Q57H tend to be deleterious and destabilize the respective wild-type protein, whereas P323L in RdRp tends to be neutral and has a stabilizing effect. IMPORTANCE We performed a meta-analysis on SARS-CoV-2 genomes categorized by collection month and identified several significant mutations. 
Pearson correlation analysis of these significant mutations identified 16 comutations having absolute correlation coefficients of >0.4 and a frequency of >30% in the genomes used in this study. The correlation results were further validated by another statistical tool called hierarchical clustering, where mutations were grouped in clusters on the basis of their similarity. We identified several positive and negative correlations among comutations in SARS-CoV-2 isolates from around the world which might contribute to viral pathogenesis. The negative correlations among some of the mutations in SARS-CoV-2 identified in this study warrant further investigations. Further analysis of mutations such as T85I in nsp2 and Q57H in ORF3a protein revealed that these mutations tend to destabilize the protein relative to the wild type, whereas P323L in RdRp is neutral and has a stabilizing effect. Thus, we have identified several comutations which can be further characterized to gain insights into SARS-CoV-2 evolution. |
format | Online Article Text |
id | pubmed-9603882 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9603882 2022-10-27 Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations Periwal, Neha Rathod, Shravan B. Sarma, Sankritya Johar, Gundeep S. Jain, Avantika Barnwal, Ravi P. Srivastava, Kinsukh R. Kaur, Baljeet Arora, Pooja Sood, Vikas Microbiol Spectr Research Article The efforts of the scientific community to tame the recent pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seem to have been diluted by the emergence of new viral strains. Therefore, it is imperative to understand the effect of mutations on viral evolution. We performed a time series analysis on 59,541 SARS-CoV-2 genomic sequences from around the world to gain insights into the kinetics of the mutations arising in the viral genomes. These 59,541 genomes were grouped according to month (January 2020 to March 2021) based on the collection date. Meta-analysis of these data led us to identify significant mutations in viral genomes. Pearson correlation of these mutations led us to the identification of 16 comutations. Among these comutations, some of the individual mutations have been shown to contribute to viral replication and fitness, suggesting a possible role of other unexplored mutations in viral evolution. We observed that the mutations 241C>T in the 5′ untranslated region (UTR), 3037C>T in nsp3, 14408C>T in the RNA-dependent RNA polymerase (RdRp), and 23403A>G in spike are correlated with each other and were grouped in a single cluster by hierarchical clustering. These mutations have replaced the wild-type nucleotides in SARS-CoV-2 sequences. Additionally, we employed a suite of computational tools to investigate the effects of T85I (1059C>T), P323L (14408C>T), and Q57H (25563G>T) mutations in nsp2, RdRp, and the ORF3a protein of SARS-CoV-2, respectively. 
We observed that the mutations T85I and Q57H tend to be deleterious and destabilize the respective wild-type protein, whereas P323L in RdRp tends to be neutral and has a stabilizing effect. IMPORTANCE We performed a meta-analysis on SARS-CoV-2 genomes categorized by collection month and identified several significant mutations. Pearson correlation analysis of these significant mutations identified 16 comutations having absolute correlation coefficients of >0.4 and a frequency of >30% in the genomes used in this study. The correlation results were further validated by another statistical tool called hierarchical clustering, where mutations were grouped in clusters on the basis of their similarity. We identified several positive and negative correlations among comutations in SARS-CoV-2 isolates from around the world which might contribute to viral pathogenesis. The negative correlations among some of the mutations in SARS-CoV-2 identified in this study warrant further investigations. Further analysis of mutations such as T85I in nsp2 and Q57H in ORF3a protein revealed that these mutations tend to destabilize the protein relative to the wild type, whereas P323L in RdRp is neutral and has a stabilizing effect. Thus, we have identified several comutations which can be further characterized to gain insights into SARS-CoV-2 evolution. American Society for Microbiology 2022-09-07 /pmc/articles/PMC9603882/ /pubmed/36069583 http://dx.doi.org/10.1128/spectrum.01219-22 Text en Copyright © 2022 Periwal et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Periwal, Neha Rathod, Shravan B. Sarma, Sankritya Johar, Gundeep S. Jain, Avantika Barnwal, Ravi P. Srivastava, Kinsukh R. Kaur, Baljeet Arora, Pooja Sood, Vikas Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title | Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title_full | Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title_fullStr | Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title_full_unstemmed | Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title_short | Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations |
title_sort | time series analysis of sars-cov-2 genomes and correlations among highly prevalent mutations |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9603882/ https://www.ncbi.nlm.nih.gov/pubmed/36069583 http://dx.doi.org/10.1128/spectrum.01219-22 |
work_keys_str_mv | AT periwalneha timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT rathodshravanb timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT sarmasankritya timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT johargundeeps timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT jainavantika timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT barnwalravip timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT srivastavakinsukhr timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT kaurbaljeet timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT arorapooja timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations AT soodvikas timeseriesanalysisofsarscov2genomesandcorrelationsamonghighlyprevalentmutations |
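
The description above outlines a two-step analysis: Pearson correlation between mutations, keeping pairs with an absolute coefficient above 0.4, followed by hierarchical clustering of mutations by similarity. The following is a minimal sketch of that idea, not the authors' pipeline. The mutation names (`mut_A` to `mut_D`), the monthly frequencies, the average-linkage rule, the `1 - r` distance, and the `max_distance` cutoff are all assumptions made for illustration, and the >30% prevalence filter mentioned in the description is not modelled.

```python
from itertools import combinations
from math import sqrt

# Invented toy data (NOT values from the study): fraction of genomes carrying
# each placeholder mutation in six consecutive collection months.
monthly_freq = {
    "mut_A": [0.00, 0.10, 0.50, 0.80, 0.90, 0.95],
    "mut_B": [0.00, 0.12, 0.48, 0.79, 0.90, 0.96],
    "mut_C": [0.01, 0.09, 0.52, 0.80, 0.91, 0.94],
    "mut_D": [0.90, 0.80, 0.50, 0.20, 0.10, 0.05],  # declines as the others rise
}


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(freqs, min_abs_r=0.4):
    """Return {(mutation_1, mutation_2): r} for pairs with |r| above the threshold."""
    pairs = {}
    for a, b in combinations(sorted(freqs), 2):
        r = pearson(freqs[a], freqs[b])
        if abs(r) > min_abs_r:
            pairs[(a, b)] = r
    return pairs


def cluster(freqs, max_distance=0.5):
    """Average-linkage agglomerative clustering using distance = 1 - r.

    max_distance is a hypothetical cutoff: merging stops once the two closest
    clusters are farther apart than this value.
    """
    clusters = [[name] for name in sorted(freqs)]

    def dist(c1, c2):
        ds = [1 - pearson(freqs[a], freqs[b]) for a in c1 for b in c2]
        return sum(ds) / len(ds)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        if dist(clusters[i], clusters[j]) > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


pairs = correlated_pairs(monthly_freq)
clusters = cluster(monthly_freq)
for (a, b), r in pairs.items():
    print(f"{a} ~ {b}: r = {r:+.2f}")
print(clusters)
```

With this toy input, the three mutations that rise together fall into one cluster while the declining one stays apart, which mirrors how positively and negatively correlated mutations would separate. In practice a library such as SciPy or pandas would likely replace the hand-written functions.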