Cargando…
Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences
The tremendous majority of SARS-CoV-2 genomic data so far neglected intra-host genetic diversity. Here, we studied SARS-CoV-2 quasispecies based on data generated by next-generation sequencing (NGS) of complete genomes. SARS-CoV-2 raw NGS data had been generated for nasopharyngeal samples collected...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2022
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9779826/ https://www.ncbi.nlm.nih.gov/pubmed/36555300 http://dx.doi.org/10.3390/ijms232415658 |
_version_ | 1784856704931856384 |
---|---|
author | Bader, Wahiba Delerce, Jeremy Aherfi, Sarah La Scola, Bernard Colson, Philippe |
author_facet | Bader, Wahiba Delerce, Jeremy Aherfi, Sarah La Scola, Bernard Colson, Philippe |
author_sort | Bader, Wahiba |
collection | PubMed |
description | The tremendous majority of SARS-CoV-2 genomic data so far neglected intra-host genetic diversity. Here, we studied SARS-CoV-2 quasispecies based on data generated by next-generation sequencing (NGS) of complete genomes. SARS-CoV-2 raw NGS data had been generated for nasopharyngeal samples collected between March 2020 and February 2021 by the Illumina technology on a MiSeq instrument, without prior PCR amplification. To analyze viral quasispecies, we designed and implemented an in-house Excel file (“QuasiS”) that can characterize intra-sample nucleotide diversity along the genomes using data of the mapping of NGS reads. We compared intra-sample genetic diversity and global genetic diversity available from Nextstrain. Hierarchical clustering of all samples based on the intra-sample genetic diversity was performed and visualized with the Morpheus web application. NGS mapping data from 110 SARS-CoV-2-positive respiratory samples characterized by a mean depth of 169 NGS reads/nucleotide position and for which consensus genomes that had been obtained were classified into 15 viral lineages were analyzed. Mean intra-sample nucleotide diversity was 0.21 ± 0.65%, and 5357 positions (17.9%) exhibited significant (>4%) diversity, in ≥2 genomes for 1730 (5.8%) of them. ORF10, spike, and N genes had the highest number of positions exhibiting diversity (0.56%, 0.34%, and 0.24%, respectively). Nine hot spots of intra-sample diversity were identified in the SARS-CoV-2 NSP6, NSP12, ORF8, and N genes. Hierarchical clustering delineated a set of six genomes of different lineages characterized by 920 positions exhibiting intra-sample diversity. In addition, 118 nucleotide positions (0.4%) exhibited diversity at both intra- and inter-patient levels. Overall, the present study illustrates that the SARS-CoV-2 consensus genome sequences are only an incomplete and imperfect representation of the entire viral population infecting a patient, and that quasispecies analysis may allow deciphering more accurately the viral evolutionary pathways. |
format | Online Article Text |
id | pubmed-9779826 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-97798262022-12-23 Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences Bader, Wahiba Delerce, Jeremy Aherfi, Sarah La Scola, Bernard Colson, Philippe Int J Mol Sci Article The tremendous majority of SARS-CoV-2 genomic data so far neglected intra-host genetic diversity. Here, we studied SARS-CoV-2 quasispecies based on data generated by next-generation sequencing (NGS) of complete genomes. SARS-CoV-2 raw NGS data had been generated for nasopharyngeal samples collected between March 2020 and February 2021 by the Illumina technology on a MiSeq instrument, without prior PCR amplification. To analyze viral quasispecies, we designed and implemented an in-house Excel file (“QuasiS”) that can characterize intra-sample nucleotide diversity along the genomes using data of the mapping of NGS reads. We compared intra-sample genetic diversity and global genetic diversity available from Nextstrain. Hierarchical clustering of all samples based on the intra-sample genetic diversity was performed and visualized with the Morpheus web application. NGS mapping data from 110 SARS-CoV-2-positive respiratory samples characterized by a mean depth of 169 NGS reads/nucleotide position and for which consensus genomes that had been obtained were classified into 15 viral lineages were analyzed. Mean intra-sample nucleotide diversity was 0.21 ± 0.65%, and 5357 positions (17.9%) exhibited significant (>4%) diversity, in ≥2 genomes for 1730 (5.8%) of them. ORF10, spike, and N genes had the highest number of positions exhibiting diversity (0.56%, 0.34%, and 0.24%, respectively). Nine hot spots of intra-sample diversity were identified in the SARS-CoV-2 NSP6, NSP12, ORF8, and N genes. Hierarchical clustering delineated a set of six genomes of different lineages characterized by 920 positions exhibiting intra-sample diversity. In addition, 118 nucleotide positions (0.4%) exhibited diversity at both intra- and inter-patient levels. Overall, the present study illustrates that the SARS-CoV-2 consensus genome sequences are only an incomplete and imperfect representation of the entire viral population infecting a patient, and that quasispecies analysis may allow deciphering more accurately the viral evolutionary pathways. MDPI 2022-12-10 /pmc/articles/PMC9779826/ /pubmed/36555300 http://dx.doi.org/10.3390/ijms232415658 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Bader, Wahiba Delerce, Jeremy Aherfi, Sarah La Scola, Bernard Colson, Philippe Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title | Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title_full | Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title_fullStr | Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title_full_unstemmed | Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title_short | Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences |
title_sort | quasispecies analysis of sars-cov-2 of 15 different lineages during the first year of the pandemic prompts scratching under the surface of consensus genome sequences |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9779826/ https://www.ncbi.nlm.nih.gov/pubmed/36555300 http://dx.doi.org/10.3390/ijms232415658 |
work_keys_str_mv | AT baderwahiba quasispeciesanalysisofsarscov2of15differentlineagesduringthefirstyearofthepandemicpromptsscratchingunderthesurfaceofconsensusgenomesequences AT delercejeremy quasispeciesanalysisofsarscov2of15differentlineagesduringthefirstyearofthepandemicpromptsscratchingunderthesurfaceofconsensusgenomesequences AT aherfisarah quasispeciesanalysisofsarscov2of15differentlineagesduringthefirstyearofthepandemicpromptsscratchingunderthesurfaceofconsensusgenomesequences AT lascolabernard quasispeciesanalysisofsarscov2of15differentlineagesduringthefirstyearofthepandemicpromptsscratchingunderthesurfaceofconsensusgenomesequences AT colsonphilippe quasispeciesanalysisofsarscov2of15differentlineagesduringthefirstyearofthepandemicpromptsscratchingunderthesurfaceofconsensusgenomesequences |