FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures

BACKGROUND: Accurate identification of real somatic variants is a primary part of cancer genome studies and precision oncology. However, artifacts introduced in various steps of sequencing obfuscate confidence in variant calling. Current computational approaches to variant filtering involve intensiv...

Bibliographic Details
Main authors: Kim, Hyunbin; Lee, Andy Jinseok; Lee, Jongkeun; Chun, Hyonho; Ju, Young Seok; Hong, Dongwan
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916105/
https://www.ncbi.nlm.nih.gov/pubmed/31847917
http://dx.doi.org/10.1186/s13073-019-0695-x
_version_ 1783480164550705152
author Kim, Hyunbin
Lee, Andy Jinseok
Lee, Jongkeun
Chun, Hyonho
Ju, Young Seok
Hong, Dongwan
author_facet Kim, Hyunbin
Lee, Andy Jinseok
Lee, Jongkeun
Chun, Hyonho
Ju, Young Seok
Hong, Dongwan
author_sort Kim, Hyunbin
collection PubMed
description BACKGROUND: Accurate identification of real somatic variants is a primary part of cancer genome studies and precision oncology. However, artifacts introduced at various steps of sequencing reduce confidence in variant calling. Current computational approaches to variant filtering involve intensive interrogation of Binary Alignment Map (BAM) files and require massive computing power, data storage, and manual labor. Recently, mutational signatures associated with sequencing artifacts have been extracted by the Pan-cancer Analysis of Whole Genomes (PCAWG) study. These spectra can be used to evaluate the refinement quality of a given set of somatic mutations. RESULTS: Here we introduce a novel variant refinement software, FIREVAT (FInding REliable Variants without ArTifacts), which uses known spectra of sequencing artifacts extracted from one of the largest publicly available catalogs of human tumor samples. FIREVAT performs a quick and efficient variant refinement that accurately removes artifacts and greatly improves the precision and specificity of somatic calls. We validated FIREVAT refinement performance using orthogonal sequencing datasets totaling 384 tumor samples with respect to ground truth. Our novel method achieved the highest level of performance compared to existing filtering approaches. Applying FIREVAT to an additional 308 samples from The Cancer Genome Atlas (TCGA) demonstrated that FIREVAT refinement leads to identification of more biologically and clinically relevant mutational signatures as well as enrichment of sequence contexts associated with experimental errors. FIREVAT requires only a Variant Call Format (VCF) file and generates a comprehensive report of the variant refinement processes and outcomes for the user. CONCLUSIONS: In summary, FIREVAT facilitates a novel refinement strategy using mutational signatures to distinguish artifactual point mutations called in human cancer samples.
We anticipate that FIREVAT results will further contribute to precision oncology efforts that rely on accurate identification of variants, especially in the context of analyzing mutational signatures that bear prognostic and therapeutic significance. FIREVAT is freely available at https://github.com/cgab-ncc/FIREVAT
format Online
Article
Text
id pubmed-6916105
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6916105 2019-12-30 FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures Kim, Hyunbin Lee, Andy Jinseok Lee, Jongkeun Chun, Hyonho Ju, Young Seok Hong, Dongwan Genome Med Software BACKGROUND: Accurate identification of real somatic variants is a primary part of cancer genome studies and precision oncology. However, artifacts introduced at various steps of sequencing reduce confidence in variant calling. Current computational approaches to variant filtering involve intensive interrogation of Binary Alignment Map (BAM) files and require massive computing power, data storage, and manual labor. Recently, mutational signatures associated with sequencing artifacts have been extracted by the Pan-cancer Analysis of Whole Genomes (PCAWG) study. These spectra can be used to evaluate the refinement quality of a given set of somatic mutations. RESULTS: Here we introduce a novel variant refinement software, FIREVAT (FInding REliable Variants without ArTifacts), which uses known spectra of sequencing artifacts extracted from one of the largest publicly available catalogs of human tumor samples. FIREVAT performs a quick and efficient variant refinement that accurately removes artifacts and greatly improves the precision and specificity of somatic calls. We validated FIREVAT refinement performance using orthogonal sequencing datasets totaling 384 tumor samples with respect to ground truth. Our novel method achieved the highest level of performance compared to existing filtering approaches. Applying FIREVAT to an additional 308 samples from The Cancer Genome Atlas (TCGA) demonstrated that FIREVAT refinement leads to identification of more biologically and clinically relevant mutational signatures as well as enrichment of sequence contexts associated with experimental errors.
FIREVAT requires only a Variant Call Format (VCF) file and generates a comprehensive report of the variant refinement processes and outcomes for the user. CONCLUSIONS: In summary, FIREVAT facilitates a novel refinement strategy using mutational signatures to distinguish artifactual point mutations called in human cancer samples. We anticipate that FIREVAT results will further contribute to precision oncology efforts that rely on accurate identification of variants, especially in the context of analyzing mutational signatures that bear prognostic and therapeutic significance. FIREVAT is freely available at https://github.com/cgab-ncc/FIREVAT BioMed Central 2019-12-17 /pmc/articles/PMC6916105/ /pubmed/31847917 http://dx.doi.org/10.1186/s13073-019-0695-x Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Kim, Hyunbin
Lee, Andy Jinseok
Lee, Jongkeun
Chun, Hyonho
Ju, Young Seok
Hong, Dongwan
FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title_full FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title_fullStr FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title_full_unstemmed FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title_short FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
title_sort firevat: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916105/
https://www.ncbi.nlm.nih.gov/pubmed/31847917
http://dx.doi.org/10.1186/s13073-019-0695-x