SomaticCombiner: improving the performance of somatic variant calling based on evaluation tests and a consensus approach

It is challenging to identify somatic variants from high-throughput sequence reads due to tumor heterogeneity, sub-clonality, and sequencing artifacts. In this study, we evaluated the performance of eight primary somatic variant callers and multiple ensemble methods using both real and synthetic who...

Bibliographic Details
Main Authors: Wang, Mingyi; Luo, Wen; Jones, Kristine; Bian, Xiaopeng; Williams, Russell; Higson, Herbert; Wu, Dongjing; Hicks, Belynda; Yeager, Meredith; Zhu, Bin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7393490/
https://www.ncbi.nlm.nih.gov/pubmed/32732891
http://dx.doi.org/10.1038/s41598-020-69772-8
author Wang, Mingyi
Luo, Wen
Jones, Kristine
Bian, Xiaopeng
Williams, Russell
Higson, Herbert
Wu, Dongjing
Hicks, Belynda
Yeager, Meredith
Zhu, Bin
author_sort Wang, Mingyi
collection PubMed
description It is challenging to identify somatic variants from high-throughput sequence reads due to tumor heterogeneity, sub-clonality, and sequencing artifacts. In this study, we evaluated the performance of eight primary somatic variant callers and multiple ensemble methods using both real and synthetic whole-genome sequencing, whole-exome sequencing, and deep targeted sequencing datasets with the NA12878 cell line. The test results showed that a simple consensus approach can significantly improve performance even with a limited number of callers and is more robust and stable than machine learning based ensemble approaches. To fully exploit the multi-callers, we also developed a software package, SomaticCombiner, that can combine multiple callers and integrates a new variant allelic frequency (VAF) adaptive majority voting approach, which can maintain sensitive detection for variants with low VAFs.
format Online
Article
Text
id pubmed-7393490
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7393490 2020-08-03 SomaticCombiner: improving the performance of somatic variant calling based on evaluation tests and a consensus approach Wang, Mingyi Luo, Wen Jones, Kristine Bian, Xiaopeng Williams, Russell Higson, Herbert Wu, Dongjing Hicks, Belynda Yeager, Meredith Zhu, Bin Sci Rep Article It is challenging to identify somatic variants from high-throughput sequence reads due to tumor heterogeneity, sub-clonality, and sequencing artifacts. In this study, we evaluated the performance of eight primary somatic variant callers and multiple ensemble methods using both real and synthetic whole-genome sequencing, whole-exome sequencing, and deep targeted sequencing datasets with the NA12878 cell line. The test results showed that a simple consensus approach can significantly improve performance even with a limited number of callers and is more robust and stable than machine learning based ensemble approaches. To fully exploit the multi-callers, we also developed a software package, SomaticCombiner, that can combine multiple callers and integrates a new variant allelic frequency (VAF) adaptive majority voting approach, which can maintain sensitive detection for variants with low VAFs. Nature Publishing Group UK 2020-07-30 /pmc/articles/PMC7393490/ /pubmed/32732891 http://dx.doi.org/10.1038/s41598-020-69772-8 Text en © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2020 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title SomaticCombiner: improving the performance of somatic variant calling based on evaluation tests and a consensus approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7393490/
https://www.ncbi.nlm.nih.gov/pubmed/32732891
http://dx.doi.org/10.1038/s41598-020-69772-8