Population Genetics of SARS-CoV-2: Disentangling Effects of Sampling Bias and Infection Clusters

A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensi...

Bibliographic Details
Main authors: Liu, Qi; Zhao, Shilei; Shi, Cheng-Min; Song, Shuhui; Zhu, Sihui; Su, Yankai; Zhao, Wenming; Li, Mingkun; Bao, Yiming; Xue, Yongbiao; Chen, Hua
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7354277/
https://www.ncbi.nlm.nih.gov/pubmed/32663617
http://dx.doi.org/10.1016/j.gpb.2020.06.001
author Liu, Qi
Zhao, Shilei
Shi, Cheng-Min
Song, Shuhui
Zhu, Sihui
Su, Yankai
Zhao, Wenming
Li, Mingkun
Bao, Yiming
Xue, Yongbiao
Chen, Hua
collection PubMed
description A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensive sampling bias and the existence of infection clusters during the epidemic spread, direct application of existing approaches can lead to biased parameter estimates and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10⁻⁴ per site per year with a 95% confidence interval (CI) of [8.61 × 10⁻⁴, 8.77 × 10⁻⁴], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might have originated earlier than, and outside of, the Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns generated by infection clusters, including the enrichment of specific haplotypes and the temporal allele frequency trajectories, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently disentangle the effects of sampling bias and infection clusters in order to gain insights into the evolutionary mechanism of SARS-CoV-2. Software implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.
format Online
Article
Text
id pubmed-7354277
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7354277 2020-07-13 Genomics Proteomics Bioinformatics Original Research Elsevier 2020-12 2020-07-12 /pmc/articles/PMC7354277/ /pubmed/32663617 http://dx.doi.org/10.1016/j.gpb.2020.06.001 Text en
© 2021 Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Population Genetics of SARS-CoV-2: Disentangling Effects of Sampling Bias and Infection Clusters
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7354277/
https://www.ncbi.nlm.nih.gov/pubmed/32663617
http://dx.doi.org/10.1016/j.gpb.2020.06.001