Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies

A diversity of tools is available for identification of variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant call...

Bibliographic Details
Main authors: Field, Matthew A., Cho, Vicky, Andrews, T. Daniel, Goodnow, Chris C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4658170/
https://www.ncbi.nlm.nih.gov/pubmed/26600436
http://dx.doi.org/10.1371/journal.pone.0143199
author Field, Matthew A.
Cho, Vicky
Andrews, T. Daniel
Goodnow, Chris C.
collection PubMed
description A diversity of tools is available for identifying variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant calls is highly variable, however, depending on factors such as sequence library quality as well as the choice of short-read aligner, variant caller, and variant caller filtering strategy. Here we present a two-part study. First, we use the high-quality ‘genome in a bottle’ reference set to demonstrate the significant impact that the choice of aligner, variant caller, and variant caller filtering strategy has on overall variant call quality, and further how certain variant callers outperform others with increased sample contamination, an important consideration when analyzing sequenced cancer samples. This analysis confirms previous work showing that combining the variant calls of multiple tools yields the best-quality variant set, for either specificity or sensitivity, depending on whether the intersection or the union of all variant calls is used, respectively. Second, we analyze a melanoma cell line derived from a control lymphocyte sample to determine whether software choices affect the detection of clinically important melanoma risk-factor variants, finding that only one of the three such variants is unanimously detected under all conditions. Finally, we describe a cogent strategy for implementing a clinical variant detection pipeline: a strategy that requires careful software selection, optimized variant caller filtering, and combined variant calls in order to effectively minimize false negative variants. While implementing such features represents an increase in complexity and computation, the results offer indisputable improvements in data quality.
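The intersection-versus-union trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` tuple, the `combine_calls` helper, the caller names, and the example coordinates are all hypothetical, and real call sets would normally be parsed from VCF files and normalized before comparison.

```python
from __future__ import annotations

from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal key for a variant call (not a full VCF record)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(call_sets: dict[str, set[Variant]]) -> tuple[set[Variant], set[Variant]]:
    """Return (union, intersection) of the call sets from several callers.

    The union keeps every variant reported by at least one caller, which
    favors sensitivity. The intersection keeps only variants reported by
    every caller, which favors specificity.
    """
    sets = list(call_sets.values())
    if not sets:
        return set(), set()
    union = set().union(*sets)
    intersection = set.intersection(*sets)
    return union, intersection


# Made-up calls from two hypothetical callers.
calls = {
    "caller_a": {Variant("chr1", 100, "A", "G"), Variant("chr2", 200, "C", "T")},
    "caller_b": {Variant("chr1", 100, "A", "G"), Variant("chr3", 300, "G", "A")},
}
union, intersection = combine_calls(calls)
```

In this toy input, the union holds three distinct variants while the intersection holds only the one call both callers agree on, which mirrors the sensitivity-versus-specificity choice the abstract describes.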
format Online
Article
Text
id pubmed-4658170
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4658170 2015-12-02 Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies Field, Matthew A. Cho, Vicky Andrews, T. Daniel Goodnow, Chris C. PLoS One Research Article Public Library of Science 2015-11-23 /pmc/articles/PMC4658170/ /pubmed/26600436 http://dx.doi.org/10.1371/journal.pone.0143199 Text en © 2015 Field et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies
topic Research Article