Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts
The Cancer Genome Atlas (TCGA) provides a genetic characterization of more than ten thousand tumors, enabling the discovery of novel driver mutations, molecular subtypes, and enticing drug targets across many histologies. Here we investigated why some mutations are common in particular cancer types...
Main Authors: | Wang, Victor G., Kim, Hyunsoo, Chuang, Jeffrey H. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169918/ https://www.ncbi.nlm.nih.gov/pubmed/30281678 http://dx.doi.org/10.1371/journal.pone.0204912 |
_version_ | 1783360585109340160 |
---|---|
author | Wang, Victor G. Kim, Hyunsoo Chuang, Jeffrey H. |
author_sort | Wang, Victor G. |
collection | PubMed |
description | The Cancer Genome Atlas (TCGA) provides a genetic characterization of more than ten thousand tumors, enabling the discovery of novel driver mutations, molecular subtypes, and enticing drug targets across many histologies. Here we investigated why some mutations are common in particular cancer types but absent in others. As an example, we observed that the gene CCDC168 has no mutations in the stomach adenocarcinoma (STAD) cohort despite its common presence in other tumor types. Surprisingly, we found that the lack of called mutations was due to a systematic insufficiency in the number of sequencing reads in the STAD and other cohorts, as opposed to differential driver biology. Using strict filtering criteria, we found similar behavior in four other genes across TCGA cohorts, with each gene exhibiting systematic sequencing depth issues affecting the ability to call mutations. We identified the culprit as the choice of exome capture kit, as kit choice was highly associated with the set of genes that have insufficient reads to call a mutation. Overall, we found that thousands of samples across all cohorts are subject to some capture kit problems. For example, for the 6353 samples using the Broad Institute’s Custom capture kit there are undercalling biases for at least 4833 genes. False negative mutation calls at these genes may obscure biological similarities between tumor types and other important cancer driver effects in TCGA datasets. |
format | Online Article Text |
id | pubmed-6169918 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6169918 2018-10-19 Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts Wang, Victor G. Kim, Hyunsoo Chuang, Jeffrey H. PLoS One Research Article Public Library of Science 2018-10-03 /pmc/articles/PMC6169918/ /pubmed/30281678 http://dx.doi.org/10.1371/journal.pone.0204912 Text en © 2018 Wang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169918/ https://www.ncbi.nlm.nih.gov/pubmed/30281678 http://dx.doi.org/10.1371/journal.pone.0204912 |
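
The description above attributes missing mutation calls to genes that receive too few sequencing reads under particular exome capture kits. A minimal Python sketch of that idea follows. It is not the authors' pipeline: the function name `undercalled_genes`, the thresholds `MIN_DEPTH` and `MIN_FRACTION`, the kit labels, and the toy depth values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical thresholds, not taken from the paper.
MIN_DEPTH = 8        # reads below this are treated as too few to call a mutation
MIN_FRACTION = 0.5   # share of a kit's samples that must be low-depth to flag a gene


def undercalled_genes(records, min_depth=MIN_DEPTH, min_fraction=MIN_FRACTION):
    """Return {kit: set of genes} where most samples lack enough reads.

    `records` is an iterable of (sample_id, kit, gene, depth) tuples.
    """
    low = defaultdict(int)    # (kit, gene) -> number of low-depth samples
    total = defaultdict(int)  # (kit, gene) -> number of samples seen
    for _sample, kit, gene, depth in records:
        total[(kit, gene)] += 1
        if depth < min_depth:
            low[(kit, gene)] += 1

    flagged = defaultdict(set)
    for (kit, gene), n in total.items():
        if low[(kit, gene)] / n >= min_fraction:
            flagged[kit].add(gene)
    return dict(flagged)


# Toy data (invented): one gene is poorly covered under kit_A only.
records = [
    ("S1", "kit_A", "CCDC168", 2),
    ("S2", "kit_A", "CCDC168", 0),
    ("S3", "kit_B", "CCDC168", 40),
    ("S1", "kit_A", "TP53", 55),
    ("S2", "kit_A", "TP53", 60),
    ("S3", "kit_B", "TP53", 48),
]
print(undercalled_genes(records))
```

In this toy input the gene is flagged for `kit_A` but not for `kit_B`, which mirrors the reported pattern: an absence of called mutations in a cohort can reflect the capture kit rather than tumor biology.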