Towards an accurate and robust analysis pipeline for somatic mutation calling
Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Various analysis pipelines give different results and thus should be systematically evaluated. In this study, we benchmarked 5 commonly-used somatic mutation calling pipelines (VarScan, VarDic...
Main authors: | Jin, Jingjie; Chen, Zixi; Liu, Jinchao; Du, Hongli; Zhang, Gong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9705725/ https://www.ncbi.nlm.nih.gov/pubmed/36457740 http://dx.doi.org/10.3389/fgene.2022.979928 |
_version_ | 1784840336514744320 |
---|---|
author | Jin, Jingjie; Chen, Zixi; Liu, Jinchao; Du, Hongli; Zhang, Gong |
author_facet | Jin, Jingjie; Chen, Zixi; Liu, Jinchao; Du, Hongli; Zhang, Gong |
author_sort | Jin, Jingjie |
collection | PubMed |
description | Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Various analysis pipelines give different results and thus should be systematically evaluated. In this study, we benchmarked 5 commonly used somatic mutation calling pipelines (VarScan, VarDictJava, Mutect2, Strelka2 and FANSe) for their precision, recall and speed, using standard benchmarking datasets based on a series of real-world whole-exome sequencing datasets. All 5 pipelines showed very high precision in all cases, and high recall at mutation rates above 10%. However, for low-frequency mutations, the pipelines differed considerably. FANSe showed the highest accuracy (especially sensitivity) in all cases, and VarScan and VarDictJava outperformed Mutect2 and Strelka2 on low-frequency mutations at all sequencing depths. Flaws in the filters were the major cause of the low sensitivity of the four pipelines other than FANSe. In terms of speed, the FANSe pipeline was 8.8 to 19 times faster than the other pipelines. Our benchmarking results demonstrate the performance of the somatic calling pipelines and provide a reference for choosing among such pipelines in cancer applications. |
format | Online Article Text |
id | pubmed-9705725 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9705725 2022-11-30 Towards an accurate and robust analysis pipeline for somatic mutation calling Jin, Jingjie Chen, Zixi Liu, Jinchao Du, Hongli Zhang, Gong Front Genet Genetics Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Various analysis pipelines give different results and thus should be systematically evaluated. In this study, we benchmarked 5 commonly used somatic mutation calling pipelines (VarScan, VarDictJava, Mutect2, Strelka2 and FANSe) for their precision, recall and speed, using standard benchmarking datasets based on a series of real-world whole-exome sequencing datasets. All 5 pipelines showed very high precision in all cases, and high recall at mutation rates above 10%. However, for low-frequency mutations, the pipelines differed considerably. FANSe showed the highest accuracy (especially sensitivity) in all cases, and VarScan and VarDictJava outperformed Mutect2 and Strelka2 on low-frequency mutations at all sequencing depths. Flaws in the filters were the major cause of the low sensitivity of the four pipelines other than FANSe. In terms of speed, the FANSe pipeline was 8.8 to 19 times faster than the other pipelines. Our benchmarking results demonstrate the performance of the somatic calling pipelines and provide a reference for choosing among such pipelines in cancer applications. Frontiers Media S.A. 2022-11-15 /pmc/articles/PMC9705725/ /pubmed/36457740 http://dx.doi.org/10.3389/fgene.2022.979928 Text en Copyright © 2022 Jin, Chen, Liu, Du and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Jin, Jingjie Chen, Zixi Liu, Jinchao Du, Hongli Zhang, Gong Towards an accurate and robust analysis pipeline for somatic mutation calling |
title | Towards an accurate and robust analysis pipeline for somatic mutation calling |
title_full | Towards an accurate and robust analysis pipeline for somatic mutation calling |
title_fullStr | Towards an accurate and robust analysis pipeline for somatic mutation calling |
title_full_unstemmed | Towards an accurate and robust analysis pipeline for somatic mutation calling |
title_short | Towards an accurate and robust analysis pipeline for somatic mutation calling |
title_sort | towards an accurate and robust analysis pipeline for somatic mutation calling |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9705725/ https://www.ncbi.nlm.nih.gov/pubmed/36457740 http://dx.doi.org/10.3389/fgene.2022.979928 |
work_keys_str_mv | AT jinjingjie towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling AT chenzixi towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling AT liujinchao towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling AT duhongli towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling AT zhanggong towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling |