Identification of single nucleotide variants using position-specific error estimation in deep sequencing data

BACKGROUND: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging c...

Bibliographic Details
Main authors: Kleftogiannis, Dimitrios, Punta, Marco, Jayaram, Anuradha, Sandhu, Shahneen, Wong, Stephen Q., Gasi Tandefelt, Delila, Conteduca, Vincenza, Wetterskog, Daniel, Attard, Gerhardt, Lise, Stefano
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6679440/
https://www.ncbi.nlm.nih.gov/pubmed/31375105
http://dx.doi.org/10.1186/s12920-019-0557-9
_version_ 1783441335391354880
author Kleftogiannis, Dimitrios
Punta, Marco
Jayaram, Anuradha
Sandhu, Shahneen
Wong, Stephen Q.
Gasi Tandefelt, Delila
Conteduca, Vincenza
Wetterskog, Daniel
Attard, Gerhardt
Lise, Stefano
author_facet Kleftogiannis, Dimitrios
Punta, Marco
Jayaram, Anuradha
Sandhu, Shahneen
Wong, Stephen Q.
Gasi Tandefelt, Delila
Conteduca, Vincenza
Wetterskog, Daniel
Attard, Gerhardt
Lise, Stefano
author_sort Kleftogiannis, Dimitrios
collection PubMed
description BACKGROUND: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). METHODS: To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. RESULTS: Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. CONCLUSIONS: AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-019-0557-9) contains supplementary material, which is available to authorized users.
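The abstract above describes estimating background noise from normal samples and testing candidate SNVs with a Poisson model. The following is a minimal sketch of that general idea, not AmpliSolve's actual implementation: the function names, the pooled error-rate estimate, and the `alpha` threshold are all assumptions made for illustration. In practice one such error rate would be kept per position, strand and nucleotide.

```python
import math


def background_error_rate(normals):
    """Pool (alt_reads, depth) pairs from normal samples into one error rate.

    Assumption: a simple pooled ratio for a single position, strand and
    alternative nucleotide; the published tool may estimate this differently.
    """
    alt = sum(a for a, _ in normals)
    depth = sum(d for _, d in normals)
    if depth == 0:
        raise ValueError("no coverage in the normal samples")
    return alt / depth


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_snv(alt_reads, depth, error_rate, alpha=0.01):
    """Test whether alt_reads exceeds what background noise alone would give.

    Under the null hypothesis, the number of erroneous alt reads is modelled
    as Poisson with mean error_rate * depth. `alpha` is a hypothetical cutoff.
    """
    lam = error_rate * depth
    p_value = poisson_tail(alt_reads, lam)
    return p_value, p_value < alpha


# Hypothetical counts: three normals, then a test sample at depth 2000.
rate = background_error_rate([(2, 2000), (1, 1000), (3, 3000)])
print(call_snv(3, 2000, rate))   # few alt reads, consistent with noise
print(call_snv(20, 2000, rate))  # 1% VAF, well above the expected noise
```

Computing the tail as one minus the cumulative sum loses precision for extremely small p-values; that is acceptable for a thresholded call in a sketch like this, but a production tool would more likely use a dedicated survival function.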
format Online
Article
Text
id pubmed-6679440
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-66794402019-08-06 Identification of single nucleotide variants using position-specific error estimation in deep sequencing data Kleftogiannis, Dimitrios Punta, Marco Jayaram, Anuradha Sandhu, Shahneen Wong, Stephen Q. Gasi Tandefelt, Delila Conteduca, Vincenza Wetterskog, Daniel Attard, Gerhardt Lise, Stefano BMC Med Genomics Technical Advance BACKGROUND: Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). METHODS: To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. RESULTS: Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. CONCLUSIONS: AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data. 
Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-019-0557-9) contains supplementary material, which is available to authorized users. BioMed Central 2019-08-02 /pmc/articles/PMC6679440/ /pubmed/31375105 http://dx.doi.org/10.1186/s12920-019-0557-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Technical Advance
Kleftogiannis, Dimitrios
Punta, Marco
Jayaram, Anuradha
Sandhu, Shahneen
Wong, Stephen Q.
Gasi Tandefelt, Delila
Conteduca, Vincenza
Wetterskog, Daniel
Attard, Gerhardt
Lise, Stefano
Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title_full Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title_fullStr Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title_full_unstemmed Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title_short Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
title_sort identification of single nucleotide variants using position-specific error estimation in deep sequencing data
topic Technical Advance
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6679440/
https://www.ncbi.nlm.nih.gov/pubmed/31375105
http://dx.doi.org/10.1186/s12920-019-0557-9
work_keys_str_mv AT kleftogiannisdimitrios identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT puntamarco identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT jayaramanuradha identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT sandhushahneen identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT wongstephenq identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT gasitandefeltdelila identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT conteducavincenza identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT wetterskogdaniel identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT attardgerhardt identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata
AT lisestefano identificationofsinglenucleotidevariantsusingpositionspecificerrorestimationindeepsequencingdata