
Understanding the mutational frequency in SARS-CoV-2 proteome using structural features

The prolonged transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus in the human population has led to demographic divergence and the emergence of several location-specific clusters of viral strains. Although the effect of mutation(s) on severity and survival of the...


Bibliographic Details
Main authors: Rawat, Puneet; Sharma, Divya; Pandey, Medha; Prabakaran, R.; Gromiha, M. Michael
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier Ltd. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9173821/
https://www.ncbi.nlm.nih.gov/pubmed/35714506
http://dx.doi.org/10.1016/j.compbiomed.2022.105708
author Rawat, Puneet
Sharma, Divya
Pandey, Medha
Prabakaran, R.
Gromiha, M. Michael
author_sort Rawat, Puneet
collection PubMed
description The prolonged transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the human population has led to demographic divergence and the emergence of several location-specific clusters of viral strains. Although the effect of mutations on the severity and survival of the virus is still unclear, it is evident that certain sites in the viral proteome are more prone to mutation than others. Millions of SARS-CoV-2 sequences collected all over the world have provided a unique opportunity to understand viral protein mutations and to develop novel computational approaches for predicting mutational patterns. In this study, we classified the mutation sites into low and high mutability classes based on the number of viral isolates containing mutations at each site. Analysis of the physicochemical and structural features of the SARS-CoV-2 proteins showed that residue type, surface accessibility, residue bulkiness, stability and sequence conservation at the mutation site were able to distinguish low from high mutability sites. We further developed machine learning models using these features to predict low and high mutability sites at different selection thresholds (ranging from 5 to 30% of the topmost and bottommost mutated sites) and observed that performance improved as the selection threshold was reduced (prediction accuracy ranging from 65 to 77%). The analysis will be useful for early detection of SARS-CoV-2 variants of concern, and the approach can also be applied to other existing and emerging viruses to help prevent another pandemic.
format Online
Article
Text
id pubmed-9173821
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Authors. Published by Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-9173821 2022-06-08 Comput Biol Med Article The Authors. Published by Elsevier Ltd. 2022-08 2022-06-07 /pmc/articles/PMC9173821/ /pubmed/35714506 http://dx.doi.org/10.1016/j.compbiomed.2022.105708 Text en © 2022 The Authors Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Understanding the mutational frequency in SARS-CoV-2 proteome using structural features
topic Article