Contrasting Epidemiology and Population Genetics of COVID-19 Infections Defined by Multilocus Genotypes in SARS-CoV-2 Genomes Sampled Globally

Since its emergence in 2019, SARS-CoV-2 has spread and evolved globally, with newly emerged variants of concern (VOCs) accounting for more than 500 million COVID-19 cases and 6 million deaths. Continuous surveillance utilizing simple genetic tools is needed to measure the viral epidemiological diver...

Bibliographic Details
Main authors: Chan, Felicia Hui Min; Ataide, Ricardo; Richards, Jack S.; Narh, Charles A.
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9316073/
https://www.ncbi.nlm.nih.gov/pubmed/35891414
http://dx.doi.org/10.3390/v14071434
_version_ 1784754715545829376
author Chan, Felicia Hui Min
Ataide, Ricardo
Richards, Jack S.
Narh, Charles A.
collection PubMed
description Since its emergence in 2019, SARS-CoV-2 has spread and evolved globally, with newly emerged variants of concern (VOCs) accounting for more than 500 million COVID-19 cases and 6 million deaths. Continuous surveillance utilizing simple genetic tools is needed to measure the viral epidemiological diversity, risk of infection, and distribution among different demographics in different geographical regions. To help address this need, we developed a proof-of-concept multilocus genotyping tool and demonstrated its utility to monitor viral populations sampled in 2020 and 2021 across six continents. We sampled 22,164 SARS-CoV-2 genomes globally from GISAID (inclusion criteria: available clinical and demographic data). They comprised two study populations, “2020 genomes” (N = 5959) sampled from December 2019 to September 2020 and “2021 genomes” (N = 16,205) sampled from 15 January to 15 March 2021. All genomes were aligned to the SARS-CoV-2 reference genome and amino acid polymorphisms were called with quality filtering. Thereafter, 74 codons (loci) in 14 genes including orf1ab polygene (N = 9), orf3a, orf8, nucleocapsid (N), matrix (M), and spike (S) met the minimum allele frequency criterion of 0.01 and were selected to construct multilocus genotypes (MLGs) for the genomes. At these loci, 137 mutant/variant amino acids (alleles) were detected, with eight VOC-defining variant alleles, including N KR(203&204), orf1ab (I(265), F(3606), and L(4715)), orf3a H(57), orf8 S(84), and S G(614), being predominant globally with >35% prevalence. Their persistence and selection were associated with peaks in viral transmission and COVID-19 incidence between 2020 and 2021. Epidemiologically, older patients (≥20 years) had a higher risk of being infected with these variants than younger patients (<20 years), but this association was dependent on the continent of origin.
In the global population, the discriminant analysis of principal components (DAPC) showed contrasting patterns of genetic clustering, with three (Africa, Asia, and North America) and two (North and South America) continental clusters being observed for the 2020 and 2021 global populations, respectively. Within each continent, the MLG repertoires (range 40–199) sampled in 2020 and 2021 were genetically differentiated, with ≤4 MLGs per repertoire accounting for the majority of genomes sampled. These data suggested that the majority of SARS-CoV-2 infections in 2020 and 2021 were caused by genetically distinct variants that likely adapted to local populations. Indeed, four GISAID clade-defined VOCs, GRY (Alpha), GH (Beta), GR (Gamma), and G/GK (Delta variant), were differentiated by their MLG signatures, demonstrating the versatility of the MLG tool for variant identification. Results from this proof-of-concept multilocus genotyping demonstrate its utility for SARS-CoV-2 genomic surveillance and for monitoring its spatiotemporal epidemiology and evolution, particularly in response to control interventions including COVID-19 vaccines and chemotherapies.
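The locus selection and MLG construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: the function names `select_loci` and `build_mlgs`, the locus labels, and the toy genomes are hypothetical, and the frequency rule used here (a locus is kept when any allele other than the most common one reaches the threshold) is an assumed reading of the 0.01 minimum allele frequency criterion.

```python
from collections import Counter

def select_loci(genomes, min_freq=0.01):
    """Keep loci where some allele other than the most common one reaches min_freq.

    Assumed interpretation of the abstract's frequency criterion.
    """
    n = len(genomes)
    kept = []
    for locus in sorted(genomes[0]):
        counts = Counter(g[locus] for g in genomes).most_common()
        # counts[1:] holds every allele except the most common one
        if any(c / n >= min_freq for _, c in counts[1:]):
            kept.append(locus)
    return kept

def build_mlgs(genomes, loci):
    """Join the alleles at the selected loci into one genotype string per genome."""
    return ["-".join(g[locus] for locus in loci) for g in genomes]

# Toy data (hypothetical, not taken from the study): three loci, five genomes.
toy = [
    {"S:614": "D", "N:203": "R", "orf8:84": "L"},
    {"S:614": "G", "N:203": "R", "orf8:84": "L"},
    {"S:614": "G", "N:203": "K", "orf8:84": "L"},
    {"S:614": "G", "N:203": "K", "orf8:84": "L"},
    {"S:614": "G", "N:203": "R", "orf8:84": "L"},
]

loci = select_loci(toy)        # the monomorphic locus orf8:84 is dropped
mlgs = build_mlgs(toy, loci)   # one MLG string per genome
repertoire = Counter(mlgs)     # MLG repertoire: distinct MLGs and their counts
```

Counting the distinct strings in `repertoire` gives the size of an MLG repertoire, and its most frequent entries correspond to the few MLGs that account for most genomes in a sample.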
format Online
Article
Text
id pubmed-9316073
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9316073 2022-07-27 Viruses Article MDPI 2022-06-29 /pmc/articles/PMC9316073/ /pubmed/35891414 http://dx.doi.org/10.3390/v14071434 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Contrasting Epidemiology and Population Genetics of COVID-19 Infections Defined by Multilocus Genotypes in SARS-CoV-2 Genomes Sampled Globally
topic Article