Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide
Deciphering the population structure of SARS-CoV-2 is critical to inform public health management and reduce the risk of future dissemination. With the continuous accruing of SARS-CoV-2 genomes worldwide, discovering an effective way to group these genomes is critical for organizing the landscape of...
Main authors: | Li, Yawei; Liu, Qingyun; Zeng, Zexian; Luo, Yuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9030792/ https://www.ncbi.nlm.nih.gov/pubmed/35456454 http://dx.doi.org/10.3390/genes13040648 |
_version_ | 1784692229051252736 |
---|---|
author | Li, Yawei Liu, Qingyun Zeng, Zexian Luo, Yuan |
author_facet | Li, Yawei Liu, Qingyun Zeng, Zexian Luo, Yuan |
author_sort | Li, Yawei |
collection | PubMed |
description | Deciphering the population structure of SARS-CoV-2 is critical to inform public health management and reduce the risk of future dissemination. With the continuous accruing of SARS-CoV-2 genomes worldwide, discovering an effective way to group these genomes is critical for organizing the landscape of the population structure of the virus. Taking advantage of recently published state-of-the-art machine learning algorithms, we used an unsupervised deep learning clustering algorithm to group a total of 16,873 SARS-CoV-2 genomes. Using single nucleotide polymorphisms as input features, we identified six major subtypes of SARS-CoV-2. The proportions of the clusters across the continents revealed distinct geographical distributions. Comprehensive analysis indicated that both genetic factors and human migration factors shaped the specific geographical distribution of the population structure. This study provides a different approach using clustering methods to study the population structure of a never-seen-before and fast-growing species such as SARS-CoV-2. Moreover, clustering techniques can be used for further studies of local population structures of the proliferating virus. |
format | Online Article Text |
id | pubmed-9030792 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9030792 2022-04-23 Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide Li, Yawei Liu, Qingyun Zeng, Zexian Luo, Yuan Genes (Basel) Article Deciphering the population structure of SARS-CoV-2 is critical to inform public health management and reduce the risk of future dissemination. With the continuous accruing of SARS-CoV-2 genomes worldwide, discovering an effective way to group these genomes is critical for organizing the landscape of the population structure of the virus. Taking advantage of recently published state-of-the-art machine learning algorithms, we used an unsupervised deep learning clustering algorithm to group a total of 16,873 SARS-CoV-2 genomes. Using single nucleotide polymorphisms as input features, we identified six major subtypes of SARS-CoV-2. The proportions of the clusters across the continents revealed distinct geographical distributions. Comprehensive analysis indicated that both genetic factors and human migration factors shaped the specific geographical distribution of the population structure. This study provides a different approach using clustering methods to study the population structure of a never-seen-before and fast-growing species such as SARS-CoV-2. Moreover, clustering techniques can be used for further studies of local population structures of the proliferating virus. MDPI 2022-04-07 /pmc/articles/PMC9030792/ /pubmed/35456454 http://dx.doi.org/10.3390/genes13040648 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Li, Yawei Liu, Qingyun Zeng, Zexian Luo, Yuan Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title | Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title_full | Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title_fullStr | Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title_full_unstemmed | Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title_short | Using an Unsupervised Clustering Model to Detect the Early Spread of SARS-CoV-2 Worldwide |
title_sort | using an unsupervised clustering model to detect the early spread of sars-cov-2 worldwide |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9030792/ https://www.ncbi.nlm.nih.gov/pubmed/35456454 http://dx.doi.org/10.3390/genes13040648 |
work_keys_str_mv | AT liyawei usinganunsupervisedclusteringmodeltodetecttheearlyspreadofsarscov2worldwide AT liuqingyun usinganunsupervisedclusteringmodeltodetecttheearlyspreadofsarscov2worldwide AT zengzexian usinganunsupervisedclusteringmodeltodetecttheearlyspreadofsarscov2worldwide AT luoyuan usinganunsupervisedclusteringmodeltodetecttheearlyspreadofsarscov2worldwide |