
Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts

The Cancer Genome Atlas (TCGA) provides a genetic characterization of more than ten thousand tumors, enabling the discovery of novel driver mutations, molecular subtypes, and enticing drug targets across many histologies. Here we investigated why some mutations are common in particular cancer types...


Bibliographic Details
Main authors: Wang, Victor G., Kim, Hyunsoo, Chuang, Jeffrey H.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169918/
https://www.ncbi.nlm.nih.gov/pubmed/30281678
http://dx.doi.org/10.1371/journal.pone.0204912
author Wang, Victor G.
Kim, Hyunsoo
Chuang, Jeffrey H.
collection PubMed
description The Cancer Genome Atlas (TCGA) provides a genetic characterization of more than ten thousand tumors, enabling the discovery of novel driver mutations, molecular subtypes, and enticing drug targets across many histologies. Here we investigated why some mutations are common in particular cancer types but absent in others. As an example, we observed that the gene CCDC168 has no mutations in the stomach adenocarcinoma (STAD) cohort despite its common presence in other tumor types. Surprisingly, we found that the lack of called mutations was due to a systematic insufficiency in the number of sequencing reads in the STAD and other cohorts, as opposed to differential driver biology. Using strict filtering criteria, we found similar behavior in four other genes across TCGA cohorts, with each gene exhibiting systematic sequencing depth issues affecting the ability to call mutations. We identified the culprit as the choice of exome capture kit, as kit choice was highly associated with the set of genes that have insufficient reads to call a mutation. Overall, we found that thousands of samples across all cohorts are subject to some capture kit problems. For example, for the 6353 samples using the Broad Institute’s Custom capture kit there are undercalling biases for at least 4833 genes. False negative mutation calls at these genes may obscure biological similarities between tumor types and other important cancer driver effects in TCGA datasets.
format Online
Article
Text
id pubmed-6169918
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6169918 2018-10-19 Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts Wang, Victor G. Kim, Hyunsoo Chuang, Jeffrey H. PLoS One Research Article
Public Library of Science 2018-10-03 /pmc/articles/PMC6169918/ /pubmed/30281678 http://dx.doi.org/10.1371/journal.pone.0204912 Text en © 2018 Wang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169918/
https://www.ncbi.nlm.nih.gov/pubmed/30281678
http://dx.doi.org/10.1371/journal.pone.0204912