
Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide

The coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) poses numerous challenges, such as understanding what triggered the emergence of this new human virus, how this RNA virus is evolving or how the variability of the viral genome may...


Bibliographic Details
Main authors: Vale, Filipa F., Vítor, Jorge M.B., Marques, Andreia T., Azevedo-Pereira, José Miguel, Anes, Elsa, Goncalves, Joao
Format: Online Article Text
Language: English
Published: Published by Elsevier B.V. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323504/
https://www.ncbi.nlm.nih.gov/pubmed/34339772
http://dx.doi.org/10.1016/j.virusres.2021.198526
_version_ 1783731255697735680
author Vale, Filipa F.
Vítor, Jorge M.B.
Marques, Andreia T.
Azevedo-Pereira, José Miguel
Anes, Elsa
Goncalves, Joao
author_facet Vale, Filipa F.
Vítor, Jorge M.B.
Marques, Andreia T.
Azevedo-Pereira, José Miguel
Anes, Elsa
Goncalves, Joao
author_sort Vale, Filipa F.
collection PubMed
description The coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) poses numerous challenges, such as understanding what triggered the emergence of this new human virus, how this RNA virus is evolving, and how the variability of the viral genome may impact the primary structure of proteins that are vaccine targets. We analyzed 19471 SARS-CoV-2 genomes available in the GISAID database from all over the world and 3335 genomes of other Coronaviridae family members available in GenBank, collecting high-quality SARS-CoV-2 genomes and distinct Coronaviridae family genomes. Additionally, we analyzed 199,984 spike glycoprotein sequences. Here, we identify a SARS-CoV-2 emerging cluster containing 13 closely related genomes isolated from bat and pangolin that showed evidence of recombination, which may have contributed to the emergence of SARS-CoV-2. The analyzed SARS-CoV-2 genomes presented 9632 single nucleotide variants (SNVs), corresponding to a variant density of 0.3 over the genome, and a clear geographic distribution. SNVs are unevenly distributed throughout the genome, and mutation hotspots were found in the spike gene and ORF1ab. We describe a set of predicted spike protein epitopes whose variability is negligible. Additionally, all predicted epitopes for the structural E, M and N proteins are highly conserved. The amino acid changes present in the spike glycoprotein of variants of concern (VOCs) comprise between 3.4% and 20.7% of the predicted epitopes of this protein. These results favor the continued efficacy of the available vaccines targeting the spike protein and other structural proteins. Multi-epitope vaccines should sustain vaccine efficacy, since at least some of the epitopes present in the variable regions of VOCs are conserved and thus recognizable by antibodies.
format Online
Article
Text
id pubmed-8323504
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Published by Elsevier B.V.
record_format MEDLINE/PubMed
spelling pubmed-83235042021-07-30 Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide Vale, Filipa F. Vítor, Jorge M.B. Marques, Andreia T. Azevedo-Pereira, José Miguel Anes, Elsa Goncalves, Joao Virus Res Article The coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) poses numerous challenges, such as understanding what triggered the emergence of this new human virus, how this RNA virus is evolving, and how the variability of the viral genome may impact the primary structure of proteins that are vaccine targets. We analyzed 19471 SARS-CoV-2 genomes available in the GISAID database from all over the world and 3335 genomes of other Coronaviridae family members available in GenBank, collecting high-quality SARS-CoV-2 genomes and distinct Coronaviridae family genomes. Additionally, we analyzed 199,984 spike glycoprotein sequences. Here, we identify a SARS-CoV-2 emerging cluster containing 13 closely related genomes isolated from bat and pangolin that showed evidence of recombination, which may have contributed to the emergence of SARS-CoV-2. The analyzed SARS-CoV-2 genomes presented 9632 single nucleotide variants (SNVs), corresponding to a variant density of 0.3 over the genome, and a clear geographic distribution. SNVs are unevenly distributed throughout the genome, and mutation hotspots were found in the spike gene and ORF1ab. We describe a set of predicted spike protein epitopes whose variability is negligible. Additionally, all predicted epitopes for the structural E, M and N proteins are highly conserved. The amino acid changes present in the spike glycoprotein of variants of concern (VOCs) comprise between 3.4% and 20.7% of the predicted epitopes of this protein. These results favor the continued efficacy of the available vaccines targeting the spike protein and other structural proteins.
Multi-epitope vaccines should sustain vaccine efficacy, since at least some of the epitopes present in the variable regions of VOCs are conserved and thus recognizable by antibodies. Published by Elsevier B.V. 2021-10-15 2021-07-30 /pmc/articles/PMC8323504/ /pubmed/34339772 http://dx.doi.org/10.1016/j.virusres.2021.198526 Text en © 2021 Published by Elsevier B.V. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Vale, Filipa F.
Vítor, Jorge M.B.
Marques, Andreia T.
Azevedo-Pereira, José Miguel
Anes, Elsa
Goncalves, Joao
Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title_full Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title_fullStr Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title_full_unstemmed Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title_short Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide
title_sort origin, phylogeny, variability and epitope conservation of sars-cov-2 worldwide
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323504/
https://www.ncbi.nlm.nih.gov/pubmed/34339772
http://dx.doi.org/10.1016/j.virusres.2021.198526
work_keys_str_mv AT valefilipaf originphylogenyvariabilityandepitopeconservationofsarscov2worldwide
AT vitorjorgemb originphylogenyvariabilityandepitopeconservationofsarscov2worldwide
AT marquesandreiat originphylogenyvariabilityandepitopeconservationofsarscov2worldwide
AT azevedopereirajosemiguel originphylogenyvariabilityandepitopeconservationofsarscov2worldwide
AT aneselsa originphylogenyvariabilityandepitopeconservationofsarscov2worldwide
AT goncalvesjoao originphylogenyvariabilityandepitopeconservationofsarscov2worldwide