
Epidemiological associations with genomic variation in SARS-CoV-2

SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidem...

Bibliographic Details
Main Authors: Rahnavard, Ali; Dawson, Tyson; Clement, Rebecca; Stearrett, Nathaniel; Pérez-Losada, Marcos; Crandall, Keith A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8626494/
https://www.ncbi.nlm.nih.gov/pubmed/34837008
http://dx.doi.org/10.1038/s41598-021-02548-w
author Rahnavard, Ali
Dawson, Tyson
Clement, Rebecca
Stearrett, Nathaniel
Pérez-Losada, Marcos
Crandall, Keith A.
collection PubMed
description SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3′-to-5′ exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic—coherence—and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status.
format Online
Article
Text
id pubmed-8626494
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8626494 2021-11-29 Epidemiological associations with genomic variation in SARS-CoV-2 Rahnavard, Ali Dawson, Tyson Clement, Rebecca Stearrett, Nathaniel Pérez-Losada, Marcos Crandall, Keith A. Sci Rep Article
Nature Publishing Group UK 2021-11-26 /pmc/articles/PMC8626494/ /pubmed/34837008 http://dx.doi.org/10.1038/s41598-021-02548-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Epidemiological associations with genomic variation in SARS-CoV-2
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8626494/
https://www.ncbi.nlm.nih.gov/pubmed/34837008
http://dx.doi.org/10.1038/s41598-021-02548-w