
Temporal-Geographical Dispersion of SARS-CoV-2 Spike Glycoprotein Variant Lineages and Their Functional Prediction Using in Silico Approach

SARS-CoV-2 is a positive-sense single-stranded RNA virus with emerging mutations, especially on the Spike glycoprotein (S protein). To delineate the genomic diversity in association with geographic dispersion of SARS-CoV-2 variant lineages, we collected 939,591 complete S protein sequences deposited...


Bibliographic Details
Main Authors: Boon, Siaw Shi; Xia, Chichao; Wang, Maggie Haitian; Yip, Ka Lai; Luk, Ho Yin; Li, Sile; Ng, Rita W. Y.; Lai, Christopher K. C.; Chan, Paul Kay Sheung; Chen, Zigui
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8546546/
https://www.ncbi.nlm.nih.gov/pubmed/34700382
http://dx.doi.org/10.1128/mBio.02687-21
author Boon, Siaw Shi
Xia, Chichao
Wang, Maggie Haitian
Yip, Ka Lai
Luk, Ho Yin
Li, Sile
Ng, Rita W. Y.
Lai, Christopher K. C.
Chan, Paul Kay Sheung
Chen, Zigui
author_sort Boon, Siaw Shi
collection PubMed
description SARS-CoV-2 is a positive-sense single-stranded RNA virus with emerging mutations, especially on the Spike glycoprotein (S protein). To delineate the genomic diversity in association with geographic dispersion of SARS-CoV-2 variant lineages, we collected 939,591 complete S protein sequences deposited in the Global Initiative on Sharing All Influenza Data (GISAID) from December 2019 to April 2021. An exponential emergence of S protein variants was observed since October 2020 when the four major variants of concern (VOCs), namely, alpha (α) (B.1.1.7), beta (β) (B.1.351), gamma (γ) (P.1), and delta (δ) (B.1.617), started to circulate in various communities. We found that residues 452, 477, 484, and 501, the 4 key amino acids located in the hACE2 binding domain of S protein, were under positive selection. Through in silico protein structure prediction and immunoinformatics tools, we discovered D614G is the key determinant to S protein conformational change, while variations of N439K, T478I, E484K, and N501Y in S1-RBD also had an impact on S protein binding affinity to hACE2 and antigenicity. Finally, we predicted that the yet-to-be-identified hypothetical N439S, T478S, and N501K mutations could confer an even greater binding affinity to hACE2 and evade host immune surveillance more efficiently than the respective native variants. This study documented the evolution of SARS-CoV-2 S protein over the first 16 months of the pandemic and identified several key amino acid changes that are predicted to confer a substantial impact on transmission and immunological recognition. These findings convey crucial information to sequence-based surveillance programs and the design of next-generation vaccines.
format Online
Article
Text
id pubmed-8546546
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8546546 2021-11-04 mBio Research Article American Society for Microbiology 2021-10-26 /pmc/articles/PMC8546546/ /pubmed/34700382 http://dx.doi.org/10.1128/mBio.02687-21 Text en Copyright © 2021 Boon et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Temporal-Geographical Dispersion of SARS-CoV-2 Spike Glycoprotein Variant Lineages and Their Functional Prediction Using in Silico Approach
topic Research Article