Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing
Discrete classification of SARS-CoV-2 viral genotypes can identify emerging strains and detect geographic spread, viral diversity, and transmission events. We developed a tool (GNU-based Virus IDentification [GNUVID]) that integrates whole-genome multilocus sequence typing and a supervised machine l...
Main authors: | Moustafa, Ahmed M; Planet, Paul J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8449825/ https://www.ncbi.nlm.nih.gov/pubmed/34432021 http://dx.doi.org/10.1093/gbe/evab197 |
_version_ | 1784569494486646784 |
---|---|
author | Moustafa, Ahmed M Planet, Paul J |
author_facet | Moustafa, Ahmed M Planet, Paul J |
author_sort | Moustafa, Ahmed M |
collection | PubMed |
description | Discrete classification of SARS-CoV-2 viral genotypes can identify emerging strains and detect geographic spread, viral diversity, and transmission events. We developed a tool (GNU-based Virus IDentification [GNUVID]) that integrates whole-genome multilocus sequence typing and a supervised machine learning random forest-based classifier. We used GNUVID to assign sequence type (ST) profiles to all high-quality genomes available from GISAID. STs were clustered into clonal complexes (CCs) and then used to train a machine learning classifier. We used this tool to detect potential introduction and exportation events and to estimate effective viral diversity across locations and over time in 16 US states. GNUVID is a highly scalable tool for viral genotype classification (https://github.com/ahmedmagds/GNUVID) that can quickly classify hundreds of thousands of genomes in a way that is consistent with phylogeny. Our genotyping ST/CC analysis uncovered dynamic local changes in ST/CC prevalence and diversity with multiple replacement events in different states, an average of 20.6 putative introductions and 7.5 exportations for each state over the time period analyzed. We introduce the use of effective diversity metrics (Hill numbers) that can be used to estimate the impact of interventions (e.g., travel restrictions, vaccine uptake, mask mandates) on the variation in circulating viruses. Our classification tool uncovered multiple introduction and exportation events, as well as waves of expansion and replacement of SARS-CoV-2 genotypes in different states. GNUVID classification lends itself to measures of ecological diversity, and, with systematic genomic sampling, it could be used to track circulating viral diversity and identify emerging clones and hotspots. |
format | Online Article Text |
id | pubmed-8449825 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-84498252021-09-20 Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing Moustafa, Ahmed M Planet, Paul J Genome Biol Evol Research Article Discrete classification of SARS-CoV-2 viral genotypes can identify emerging strains and detect geographic spread, viral diversity, and transmission events. We developed a tool (GNU-based Virus IDentification [GNUVID]) that integrates whole-genome multilocus sequence typing and a supervised machine learning random forest-based classifier. We used GNUVID to assign sequence type (ST) profiles to all high-quality genomes available from GISAID. STs were clustered into clonal complexes (CCs) and then used to train a machine learning classifier. We used this tool to detect potential introduction and exportation events and to estimate effective viral diversity across locations and over time in 16 US states. GNUVID is a highly scalable tool for viral genotype classification (https://github.com/ahmedmagds/GNUVID) that can quickly classify hundreds of thousands of genomes in a way that is consistent with phylogeny. Our genotyping ST/CC analysis uncovered dynamic local changes in ST/CC prevalence and diversity with multiple replacement events in different states, an average of 20.6 putative introductions and 7.5 exportations for each state over the time period analyzed. We introduce the use of effective diversity metrics (Hill numbers) that can be used to estimate the impact of interventions (e.g., travel restrictions, vaccine uptake, mask mandates) on the variation in circulating viruses. Our classification tool uncovered multiple introduction and exportation events, as well as waves of expansion and replacement of SARS-CoV-2 genotypes in different states. GNUVID classification lends itself to measures of ecological diversity, and, with systematic genomic sampling, it could be used to track circulating viral diversity and identify emerging clones and hotspots. 
Oxford University Press 2021-08-25 /pmc/articles/PMC8449825/ /pubmed/34432021 http://dx.doi.org/10.1093/gbe/evab197 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Moustafa, Ahmed M Planet, Paul J Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title | Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title_full | Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title_fullStr | Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title_full_unstemmed | Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title_short | Emerging SARS-CoV-2 Diversity Revealed by Rapid Whole-Genome Sequence Typing |
title_sort | emerging sars-cov-2 diversity revealed by rapid whole-genome sequence typing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8449825/ https://www.ncbi.nlm.nih.gov/pubmed/34432021 http://dx.doi.org/10.1093/gbe/evab197 |
work_keys_str_mv | AT moustafaahmedm emergingsarscov2diversityrevealedbyrapidwholegenomesequencetyping AT planetpaulj emergingsarscov2diversityrevealedbyrapidwholegenomesequencetyping |
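
The description above mentions effective diversity metrics (Hill numbers) computed over ST/CC assignments. As a rough illustration only, and not the GNUVID implementation, the following Python sketch computes a Hill number of order q from a list of genotype labels. The function name `hill_number` and the ST labels in the sample are hypothetical and chosen purely for the example; the formula is the standard definition, with the q = 1 case taken as the exponential of Shannon entropy.

```python
import math
from collections import Counter


def hill_number(counts, q):
    """Effective number of types (Hill number of order q) from raw counts.

    Illustrative helper; the name and signature are assumptions, not GNUVID's API.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Convert counts to proportions, ignoring types that were never observed.
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit,
        # the exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical ST assignments for genomes sampled in one place and time window.
sample = ["ST_A", "ST_A", "ST_A", "ST_A", "ST_B", "ST_B", "ST_C", "ST_C"]
counts = list(Counter(sample).values())

richness = hill_number(counts, 0)  # q = 0: number of distinct types
shannon = hill_number(counts, 1)   # q = 1: weights types by frequency
simpson = hill_number(counts, 2)   # q = 2: emphasizes dominant types
```

Higher orders of q give more weight to common genotypes, so comparing the q = 0, 1, and 2 values for the same sample indicates how evenly the circulating types are distributed.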