Towards an accurate and robust analysis pipeline for somatic mutation calling

Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Various analysis pipelines give different results and thus should be systematically evaluated. In this study, we benchmarked 5 commonly-used somatic mutation calling pipelines (VarScan, VarDic...

Bibliographic Details
Main authors: Jin, Jingjie, Chen, Zixi, Liu, Jinchao, Du, Hongli, Zhang, Gong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9705725/
https://www.ncbi.nlm.nih.gov/pubmed/36457740
http://dx.doi.org/10.3389/fgene.2022.979928
_version_ 1784840336514744320
author Jin, Jingjie
Chen, Zixi
Liu, Jinchao
Du, Hongli
Zhang, Gong
author_facet Jin, Jingjie
Chen, Zixi
Liu, Jinchao
Du, Hongli
Zhang, Gong
author_sort Jin, Jingjie
collection PubMed
description Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Different analysis pipelines give different results and should therefore be systematically evaluated. In this study, we benchmarked five commonly used somatic mutation calling pipelines (VarScan, VarDictJava, Mutect2, Strelka2 and FANSe) for precision, recall and speed, using standard benchmarking datasets based on a series of real-world whole-exome sequencing datasets. All five pipelines showed very high precision in all cases, and a high recall rate at mutation rates higher than 10%. For low-frequency mutations, however, the pipelines differed widely. FANSe showed the highest accuracy (especially sensitivity) in all cases, and VarScan and VarDictJava outperformed Mutect2 and Strelka2 on low-frequency mutations at all sequencing depths. Flaws in filtering were the major cause of the low sensitivity of the four pipelines other than FANSe. In terms of speed, the FANSe pipeline was 8.8 to 19 times faster than the other pipelines. Our benchmarking results demonstrate the performance of the somatic calling pipelines and provide a reference for choosing among such pipelines in cancer applications.
format Online
Article
Text
id pubmed-9705725
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-97057252022-11-30 Towards an accurate and robust analysis pipeline for somatic mutation calling Jin, Jingjie Chen, Zixi Liu, Jinchao Du, Hongli Zhang, Gong Front Genet Genetics Accurate and robust somatic mutation detection is essential for cancer treatment, diagnostics and research. Different analysis pipelines give different results and should therefore be systematically evaluated. In this study, we benchmarked five commonly used somatic mutation calling pipelines (VarScan, VarDictJava, Mutect2, Strelka2 and FANSe) for precision, recall and speed, using standard benchmarking datasets based on a series of real-world whole-exome sequencing datasets. All five pipelines showed very high precision in all cases, and a high recall rate at mutation rates higher than 10%. For low-frequency mutations, however, the pipelines differed widely. FANSe showed the highest accuracy (especially sensitivity) in all cases, and VarScan and VarDictJava outperformed Mutect2 and Strelka2 on low-frequency mutations at all sequencing depths. Flaws in filtering were the major cause of the low sensitivity of the four pipelines other than FANSe. In terms of speed, the FANSe pipeline was 8.8 to 19 times faster than the other pipelines. Our benchmarking results demonstrate the performance of the somatic calling pipelines and provide a reference for choosing among such pipelines in cancer applications. Frontiers Media S.A. 2022-11-15 /pmc/articles/PMC9705725/ /pubmed/36457740 http://dx.doi.org/10.3389/fgene.2022.979928 Text en Copyright © 2022 Jin, Chen, Liu, Du and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice.
No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Jin, Jingjie
Chen, Zixi
Liu, Jinchao
Du, Hongli
Zhang, Gong
Towards an accurate and robust analysis pipeline for somatic mutation calling
title Towards an accurate and robust analysis pipeline for somatic mutation calling
title_full Towards an accurate and robust analysis pipeline for somatic mutation calling
title_fullStr Towards an accurate and robust analysis pipeline for somatic mutation calling
title_full_unstemmed Towards an accurate and robust analysis pipeline for somatic mutation calling
title_short Towards an accurate and robust analysis pipeline for somatic mutation calling
title_sort towards an accurate and robust analysis pipeline for somatic mutation calling
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9705725/
https://www.ncbi.nlm.nih.gov/pubmed/36457740
http://dx.doi.org/10.3389/fgene.2022.979928
work_keys_str_mv AT jinjingjie towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling
AT chenzixi towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling
AT liujinchao towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling
AT duhongli towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling
AT zhanggong towardsanaccurateandrobustanalysispipelineforsomaticmutationcalling