
Data-driven analysis of amino acid change dynamics timely reveals SARS-CoV-2 variant emergence

Since its emergence in late 2019, the diffusion of SARS-CoV-2 is associated with the evolution of its viral genome. The co-occurrence of specific amino acid changes, collectively named ‘virus variant’, requires scrutiny (as variants may hugely impact the agent’s transmission, pathogenesis, or antige...


Bibliographic Details
Main authors: Bernasconi, Anna, Mari, Lorenzo, Casagrandi, Renato, Ceri, Stefano
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8548498/
https://www.ncbi.nlm.nih.gov/pubmed/34702903
http://dx.doi.org/10.1038/s41598-021-00496-z
author Bernasconi, Anna
Mari, Lorenzo
Casagrandi, Renato
Ceri, Stefano
collection PubMed
description Since its emergence in late 2019, the diffusion of SARS-CoV-2 is associated with the evolution of its viral genome. The co-occurrence of specific amino acid changes, collectively named ‘virus variant’, requires scrutiny (as variants may hugely impact the agent’s transmission, pathogenesis, or antigenicity); variant evolution is studied using phylogenetics. Yet, never has this problem been tackled by digging into data with ad hoc analysis techniques. Here we show that the emergence of variants can in fact be traced through data-driven methods, further capitalizing on the value of large collections of SARS-CoV-2 sequences. For all countries with sufficient data, we compute weekly counts of amino acid changes, unveil time-varying clusters of changes with similar—rapidly growing—dynamics, and then follow their evolution. Our method succeeds in timely associating clusters to variants of interest/concern, provided their change composition is well characterized. This allows us to detect variants’ emergence, rise, peak, and eventual decline under competitive pressure of another variant. Our early warning system, exclusively relying on deposited sequences, shows the power of big data in this context, and concurs to calling for the wide spreading of public SARS-CoV-2 genome sequencing for improved surveillance and control of the COVID-19 pandemic.
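The first step named in the description, computing weekly counts of amino acid changes and looking for rapidly growing ones, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record layout, the function names `weekly_counts` and `growing_changes`, the `min_ratio` threshold, and the toy records are all assumptions made for illustration, and the simple first-to-last-week ratio stands in for the time-varying clustering the paper describes.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_counts(records):
    """Count occurrences of each amino acid change per ISO (year, week).

    `records` is an iterable of (collection_date, change_label) pairs;
    this layout is an assumption, not the format used in the paper.
    """
    counts = defaultdict(Counter)
    for collected_on, change in records:
        iso = collected_on.isocalendar()
        counts[change][(iso[0], iso[1])] += 1
    return counts


def growing_changes(counts, weeks, min_ratio=2.0):
    """Flag changes whose count grew by at least `min_ratio`
    between the first and the last week in `weeks`.

    A crude stand-in for detecting "rapidly growing" dynamics.
    """
    flagged = []
    for change, per_week in counts.items():
        first = per_week.get(weeks[0], 0)
        last = per_week.get(weeks[-1], 0)
        # max(first, 1) avoids flagging on a zero baseline by division tricks.
        if last >= min_ratio * max(first, 1):
            flagged.append(change)
    return sorted(flagged)


# Hypothetical toy records for one country (not real surveillance data).
records = [
    (date(2021, 1, 4), "S:N501Y"),
    (date(2021, 1, 5), "S:D614G"),
    (date(2021, 1, 6), "S:D614G"),
    (date(2021, 1, 11), "S:N501Y"),
    (date(2021, 1, 12), "S:N501Y"),
    (date(2021, 1, 13), "S:N501Y"),
    (date(2021, 1, 14), "S:D614G"),
    (date(2021, 1, 15), "S:D614G"),
]

counts = weekly_counts(records)
flagged = growing_changes(counts, weeks=[(2021, 1), (2021, 2)])
print(flagged)
```

In this toy input, one change triples from the first ISO week to the second while the other stays flat, so only the former is flagged. A real analysis would need per-country normalization by the number of deposited sequences and a proper grouping of changes with similar trajectories.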
format Online
Article
Text
id pubmed-8548498
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8548498 2021-10-28 Sci Rep Article
Nature Publishing Group UK 2021-10-26 /pmc/articles/PMC8548498/ /pubmed/34702903 http://dx.doi.org/10.1038/s41598-021-00496-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.