Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database
SARS-CoV-2 has infected more than 600 million people. However, the origin of the virus is still unclear; knowing where the virus came from could help us prevent future zoonotic epidemics. Sequencing data, particularly metagenomic data, can profile the genomes of all species in the sample, including...
Main authors: | Sun, Xiao; Kan, Chuanwen; Ma, Wentai; Du, Zhenglin; Li, Mingkun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9927258/ https://www.ncbi.nlm.nih.gov/pubmed/36622170 http://dx.doi.org/10.1128/spectrum.03426-22 |
_version_ | 1784888442381926400 |
---|---|
author | Sun, Xiao; Kan, Chuanwen; Ma, Wentai; Du, Zhenglin; Li, Mingkun |
author_facet | Sun, Xiao; Kan, Chuanwen; Ma, Wentai; Du, Zhenglin; Li, Mingkun |
author_sort | Sun, Xiao |
collection | PubMed |
description | SARS-CoV-2 has infected more than 600 million people. However, the origin of the virus is still unclear; knowing where the virus came from could help us prevent future zoonotic epidemics. Sequencing data, particularly metagenomic data, can profile the genomes of all species in the sample, including those not recognized at the time, thus allowing for the identification of the progenitor of SARS-CoV-2 in samples collected before the pandemic. We analyzed the data from 5,196 SARS-CoV-2-positive sequencing runs in the NCBI’s SRA database with collection dates prior to 2020 or unknown. We found that the mutation patterns obtained from these suspicious SARS-CoV-2 reads did not match the genome characteristics of an unknown progenitor of the virus, suggesting that they may derive from circulating SARS-CoV-2 variants or other coronaviruses. Despite a negative result for tracking the progenitor of SARS-CoV-2, the methods developed in the study could assist in pinpointing the origin of various pathogens in the future. IMPORTANCE Sequences that are homologous to the SARS-CoV-2 genome were found in numerous sequencing runs that were not associated with the SARS-CoV-2 studies in the public database. It is unclear whether they are derived from the possible progenitor of SARS-CoV-2 or contamination of more recent SARS-CoV-2 variants circulated in the population due to the lack of information on the collection, library preparation, and sequencing processes. We have developed a computational framework to infer the evolutionary relationship between sequences based on the comparison of mutations, which enabled us to rule out the possibility that these suspicious sequences originate from unknown progenitors of SARS-CoV-2. |
format | Online Article Text |
id | pubmed-9927258 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9927258 2023-02-15 Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database Sun, Xiao; Kan, Chuanwen; Ma, Wentai; Du, Zhenglin; Li, Mingkun Microbiol Spectr Research Article SARS-CoV-2 has infected more than 600 million people. However, the origin of the virus is still unclear; knowing where the virus came from could help us prevent future zoonotic epidemics. Sequencing data, particularly metagenomic data, can profile the genomes of all species in the sample, including those not recognized at the time, thus allowing for the identification of the progenitor of SARS-CoV-2 in samples collected before the pandemic. We analyzed the data from 5,196 SARS-CoV-2-positive sequencing runs in the NCBI’s SRA database with collection dates prior to 2020 or unknown. We found that the mutation patterns obtained from these suspicious SARS-CoV-2 reads did not match the genome characteristics of an unknown progenitor of the virus, suggesting that they may derive from circulating SARS-CoV-2 variants or other coronaviruses. Despite a negative result for tracking the progenitor of SARS-CoV-2, the methods developed in the study could assist in pinpointing the origin of various pathogens in the future. IMPORTANCE Sequences that are homologous to the SARS-CoV-2 genome were found in numerous sequencing runs that were not associated with the SARS-CoV-2 studies in the public database. It is unclear whether they are derived from the possible progenitor of SARS-CoV-2 or contamination of more recent SARS-CoV-2 variants circulated in the population due to the lack of information on the collection, library preparation, and sequencing processes. We have developed a computational framework to infer the evolutionary relationship between sequences based on the comparison of mutations, which enabled us to rule out the possibility that these suspicious sequences originate from unknown progenitors of SARS-CoV-2. American Society for Microbiology 2023-01-09 /pmc/articles/PMC9927258/ /pubmed/36622170 http://dx.doi.org/10.1128/spectrum.03426-22 Text en Copyright © 2023 Sun et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Sun, Xiao Kan, Chuanwen Ma, Wentai Du, Zhenglin Li, Mingkun Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title | Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title_full | Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title_fullStr | Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title_full_unstemmed | Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title_short | Genomic Analysis of the Suspicious SARS-CoV-2 Sequences in the Public Sequencing Database |
title_sort | genomic analysis of the suspicious sars-cov-2 sequences in the public sequencing database |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9927258/ https://www.ncbi.nlm.nih.gov/pubmed/36622170 http://dx.doi.org/10.1128/spectrum.03426-22 |