Assessment of SARS-CoV-2 Genome Sequencing: Quality Criteria and Low-Frequency Variants

Bibliographic Details
Main authors: Jacot, Damien; Pillonel, Trestan; Greub, Gilbert; Bertelli, Claire
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects: Virology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8451431/
https://www.ncbi.nlm.nih.gov/pubmed/34319802
http://dx.doi.org/10.1128/JCM.00944-21
author Jacot, Damien
Pillonel, Trestan
Greub, Gilbert
Bertelli, Claire
collection PubMed
description Although many laboratories worldwide have developed their sequencing capacities in response to the need for SARS-CoV-2 genome-based surveillance of variants, only a few have reported quality criteria to ensure sequence quality before lineage assignment and submission to public databases. Hence, we aimed here to provide simple quality control criteria for SARS-CoV-2 sequencing to prevent erroneous interpretation of low-quality or contaminated data. We retrospectively investigated 647 SARS-CoV-2 genomes obtained over 10 tiled-amplicon sequencing runs. We extracted 26 potentially relevant metrics covering the entire workflow from sample selection to bioinformatics analysis. Based on data distribution, critical values were established for 11 selected metrics to prompt further quality investigations for problematic samples, in particular those with a low viral RNA quantity. Low-frequency variants (<70% of supporting reads) can result from PCR amplification errors, sample cross-contamination, or the presence of distinct SARS-CoV-2 genomes in the sequenced sample. The number and the prevalence of low-frequency variants can be used as a robust quality criterion to identify possible sequencing errors or contamination. Overall, we propose 11 metrics with fixed cutoff values as a simple tool to evaluate the quality of SARS-CoV-2 genomes, among which are cycle thresholds, mean depth, proportion of genome covered at least 10×, and the number of low-frequency variants combined with mutation prevalence data.
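Three of the metrics named in the description (mean depth, proportion of the genome covered at least 10×, and the number of low-frequency variants) can be illustrated with a minimal Python sketch. Only the 10× depth threshold and the 70% read-support threshold come from the abstract; the `Variant` class, the function names, and the input layout (one depth value per genome position) are assumptions made for illustration, and this is not the authors' pipeline. The abstract does not state the fixed cutoff values for the 11 metrics, so none are applied here.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Thresholds quoted in the abstract.
MIN_DEPTH = 10            # "proportion of genome covered at least 10x"
LOW_FREQ_CUTOFF = 0.70    # low-frequency variant: <70% of supporting reads


@dataclass
class Variant:
    """Hypothetical variant record: position plus read counts at that site."""
    position: int
    alt_reads: int
    total_reads: int

    @property
    def frequency(self) -> float:
        # Fraction of reads supporting the variant; 0.0 if the site has no reads.
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def mean_depth(depths: Sequence[int]) -> float:
    """Average sequencing depth over all genome positions."""
    return sum(depths) / len(depths) if depths else 0.0


def fraction_covered(depths: Sequence[int], min_depth: int = MIN_DEPTH) -> float:
    """Proportion of positions with depth >= min_depth."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def low_frequency_variants(
    variants: Sequence[Variant], cutoff: float = LOW_FREQ_CUTOFF
) -> List[Variant]:
    """Variants supported by less than `cutoff` of the reads at their site."""
    return [v for v in variants if v.frequency < cutoff]


def qc_summary(depths: Sequence[int], variants: Sequence[Variant]) -> Dict[str, float]:
    """Collect the three metrics for one sample."""
    return {
        "mean_depth": mean_depth(depths),
        "fraction_covered": fraction_covered(depths),
        "n_low_frequency": len(low_frequency_variants(variants)),
    }
```

In practice the per-position depths would come from an alignment and the read counts from a variant caller; a laboratory would then compare each value with its own validated cutoffs.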
format Online
Article
Text
id pubmed-8451431
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8451431 2021-10-04 Assessment of SARS-CoV-2 Genome Sequencing: Quality Criteria and Low-Frequency Variants Jacot, Damien; Pillonel, Trestan; Greub, Gilbert; Bertelli, Claire J Clin Microbiol Virology American Society for Microbiology 2021-09-20 /pmc/articles/PMC8451431/ /pubmed/34319802 http://dx.doi.org/10.1128/JCM.00944-21 Text en Copyright © 2021 Jacot et al.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Assessment of SARS-CoV-2 Genome Sequencing: Quality Criteria and Low-Frequency Variants
topic Virology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8451431/
https://www.ncbi.nlm.nih.gov/pubmed/34319802
http://dx.doi.org/10.1128/JCM.00944-21