Quantifying single nucleotide variant detection sensitivity in exome sequencing
BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To f...
Main authors: | Meynert, Alison M; Bicknell, Louise S; Hurles, Matthew E; Jackson, Andrew P; Taylor, Martin S |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3695811/ https://www.ncbi.nlm.nih.gov/pubmed/23773188 http://dx.doi.org/10.1186/1471-2105-14-195 |
_version_ | 1782275014151634944 |
---|---|
author | Meynert, Alison M Bicknell, Louise S Hurles, Matthew E Jackson, Andrew P Taylor, Martin S |
author_facet | Meynert, Alison M Bicknell, Louise S Hurles, Matthew E Jackson, Andrew P Taylor, Martin S |
author_sort | Meynert, Alison M |
collection | PubMed |
description | BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits. |
format | Online Article Text |
id | pubmed-3695811 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-36958112013-07-01 Quantifying single nucleotide variant detection sensitivity in exome sequencing Meynert, Alison M Bicknell, Louise S Hurles, Matthew E Jackson, Andrew P Taylor, Martin S BMC Bioinformatics Research Article BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits. BioMed Central 2013-06-18 /pmc/articles/PMC3695811/ /pubmed/23773188 http://dx.doi.org/10.1186/1471-2105-14-195 Text en Copyright © 2013 Meynert et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Meynert, Alison M Bicknell, Louise S Hurles, Matthew E Jackson, Andrew P Taylor, Martin S Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title | Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title_full | Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title_fullStr | Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title_full_unstemmed | Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title_short | Quantifying single nucleotide variant detection sensitivity in exome sequencing |
title_sort | quantifying single nucleotide variant detection sensitivity in exome sequencing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3695811/ https://www.ncbi.nlm.nih.gov/pubmed/23773188 http://dx.doi.org/10.1186/1471-2105-14-195 |
work_keys_str_mv | AT meynertalisonm quantifyingsinglenucleotidevariantdetectionsensitivityinexomesequencing AT bicknelllouises quantifyingsinglenucleotidevariantdetectionsensitivityinexomesequencing AT hurlesmatthewe quantifyingsinglenucleotidevariantdetectionsensitivityinexomesequencing AT jacksonandrewp quantifyingsinglenucleotidevariantdetectionsensitivityinexomesequencing AT taylormartins quantifyingsinglenucleotidevariantdetectionsensitivityinexomesequencing |
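
The abstract in this record contrasts measured SNV sensitivity with "the naive expectation of sampling from a binomial" and says per-nucleotide estimates can be merged into summaries for larger partitions. The sketch below illustrates what such a naive baseline and merge might look like. It is not the authors' calibrated empirical model. The function names, the requirement of at least two non-reference reads, the 0.5 allele fraction, and the unweighted mean used for merging are all assumptions made for illustration.

```python
from math import comb
from typing import Iterable


def naive_het_sensitivity(depth: int, min_alt_reads: int = 2,
                          alt_fraction: float = 0.5) -> float:
    """Naive binomial chance of seeing a heterozygous non-reference allele.

    Assumption (not from the record): a site counts as detected when at
    least `min_alt_reads` of `depth` reads carry the non-reference allele,
    and each read carries it independently with probability `alt_fraction`.
    """
    if depth < min_alt_reads:
        return 0.0
    # Probability of observing fewer than min_alt_reads non-reference reads.
    p_miss = sum(
        comb(depth, k) * alt_fraction ** k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_miss


def region_sensitivity(depths: Iterable[int]) -> float:
    """Merge per-base estimates into one value for a region.

    Assumption: an unweighted mean over bases; the record does not state
    how the published summaries are combined.
    """
    values = [naive_het_sensitivity(d) for d in depths]
    if not values:
        raise ValueError("region has no bases")
    return sum(values) / len(values)


if __name__ == "__main__":
    for d in (3, 8, 13, 20):
        print(d, round(naive_het_sensitivity(d), 4))
    # Hypothetical per-base depths for a small target region.
    print(round(region_sensitivity([4, 9, 13, 20, 35]), 4))
```

Under these assumptions the binomial baseline at a depth of 13 is 1 - 14/8192, roughly 0.998, which is well above the 95% figure the abstract reports for heterozygous SNVs at 13X. That gap is consistent with the abstract's statement that measured sensitivity is substantially worse than the naive expectation.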