
Comparing somatic mutation-callers: beyond Venn diagrams

BACKGROUND: Somatic mutation-calling based on DNA from matched tumor-normal patient samples is one of the key tasks carried by many cancer genome projects. One such large-scale project is The Cancer Genome Atlas (TCGA), which is now routinely compiling catalogs of somatic mutations from hundreds of...


Bibliographic Details
Main Authors:	Kim, Su Yeon, Speed, Terence P
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702398/ https://www.ncbi.nlm.nih.gov/pubmed/23758877 http://dx.doi.org/10.1186/1471-2105-14-189

_version_	1782275797857337344
author	Kim, Su Yeon Speed, Terence P
author_facet	Kim, Su Yeon Speed, Terence P
author_sort	Kim, Su Yeon
collection	PubMed
description	BACKGROUND: Somatic mutation-calling based on DNA from matched tumor-normal patient samples is one of the key tasks carried by many cancer genome projects. One such large-scale project is The Cancer Genome Atlas (TCGA), which is now routinely compiling catalogs of somatic mutations from hundreds of paired tumor-normal DNA exome-sequence data. Nonetheless, mutation calling is still very challenging. TCGA benchmark studies revealed that even relatively recent mutation callers from major centers showed substantial discrepancies. Evaluation of the mutation callers or understanding the sources of discrepancies is not straightforward, since for most tumor studies, validation data based on independent whole-exome DNA sequencing is not available, only partial validation data for a selected (ascertained) subset of sites. RESULTS: To provide guidelines to comparing outputs from multiple callers, we have analyzed two sets of mutation-calling data from the TCGA benchmark studies and their partial validation data. Various aspects of the mutation-calling outputs were explored to characterize the discrepancies in detail. To assess the performances of multiple callers, we introduce four approaches utilizing the external sequence data to varying degrees, ranging from having independent DNA-seq pairs, RNA-seq for tumor samples only, the original exome-seq pairs only, or none of those. CONCLUSIONS: Our analyses provide guidelines to visualizing and understanding the discrepancies among the outputs from multiple callers. Furthermore, applying the four evaluation approaches to the whole exome data, we illustrate the challenges and highlight the various circumstances that require extra caution in assessing the performances of multiple callers.
format	Online Article Text
id	pubmed-3702398
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3702398 2013-07-10 Comparing somatic mutation-callers: beyond Venn diagrams Kim, Su Yeon Speed, Terence P BMC Bioinformatics Research Article
BioMed Central 2013-06-10 /pmc/articles/PMC3702398/ /pubmed/23758877 http://dx.doi.org/10.1186/1471-2105-14-189 Text en Copyright © 2013 Kim and Speed; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Kim, Su Yeon Speed, Terence P Comparing somatic mutation-callers: beyond Venn diagrams
title	Comparing somatic mutation-callers: beyond Venn diagrams
title_full	Comparing somatic mutation-callers: beyond Venn diagrams
title_fullStr	Comparing somatic mutation-callers: beyond Venn diagrams
title_full_unstemmed	Comparing somatic mutation-callers: beyond Venn diagrams
title_short	Comparing somatic mutation-callers: beyond Venn diagrams
title_sort	comparing somatic mutation-callers: beyond venn diagrams
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702398/ https://www.ncbi.nlm.nih.gov/pubmed/23758877 http://dx.doi.org/10.1186/1471-2105-14-189
work_keys_str_mv	AT kimsuyeon comparingsomaticmutationcallersbeyondvenndiagrams AT speedterencep comparingsomaticmutationcallersbeyondvenndiagrams
