Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples
BACKGROUND: Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is infl...
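Main Authors: | Gorlov, Ivan P., Pikielny, Claudio W., Frost, Hildreth R., Her, Stephanie C., Cole, Michael D., Strohbehn, Samuel D., Wallace-Bradley, David, Kimmel, Marek, Gorlova, Olga Y., Amos, Christopher I. |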
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245819/ https://www.ncbi.nlm.nih.gov/pubmed/30453881 http://dx.doi.org/10.1186/s12859-018-2455-0 |
_version_ | 1783372317957554176 |
---|---|
author | Gorlov, Ivan P. Pikielny, Claudio W. Frost, Hildreth R. Her, Stephanie C. Cole, Michael D. Strohbehn, Samuel D. Wallace-Bradley, David Kimmel, Marek Gorlova, Olga Y. Amos, Christopher I. |
author_facet | Gorlov, Ivan P. Pikielny, Claudio W. Frost, Hildreth R. Her, Stephanie C. Cole, Michael D. Strohbehn, Samuel D. Wallace-Bradley, David Kimmel, Marek Gorlova, Olga Y. Amos, Christopher I. |
author_sort | Gorlov, Ivan P. |
collection | PubMed |
description | BACKGROUND: Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples. RESULTS: We used data on somatic mutations detected by genome-wide screens from the Catalog of Somatic Mutations in Cancer (COSMIC). Gene size, nucleotide composition, expression level of the gene, relative replication time in the cell cycle, level of evolutionary conservation and other gene characteristics (totaling 11) were used as predictors of the number of somatic mutations. We applied stepwise multiple linear regression to predict the number of mutations per gene. Because missense, nonsense, and frameshift mutations are associated with different sets of gene characteristics, they were modeled separately. Gene characteristics explain 88% of the variation in the number of missense, 40% of nonsense, and 23% of frameshift mutations. Comparisons of the observed and expected numbers of mutations identified genes with a higher than expected number of mutations (positive outliers). Many of these are known driver genes. A number of novel candidate driver genes were also identified. CONCLUSIONS: By comparing the observed and predicted number of mutations in a gene, we have identified known cancer-associated genes as well as 111 novel cancer-associated genes. We also showed that adding the number of silent mutations per gene reported by genome/exome-wide screens across all cancer types (COSMIC data) as a predictor yields a prediction accuracy that substantially exceeds that of the most popular cancer gene prediction tool, MutSigCV. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2455-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6245819 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6245819 2018-11-26 Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples Gorlov, Ivan P. Pikielny, Claudio W. Frost, Hildreth R. Her, Stephanie C. Cole, Michael D. Strohbehn, Samuel D. Wallace-Bradley, David Kimmel, Marek Gorlova, Olga Y. Amos, Christopher I. BMC Bioinformatics Research Article BACKGROUND: Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples. RESULTS: We used data on somatic mutations detected by genome-wide screens from the Catalog of Somatic Mutations in Cancer (COSMIC). Gene size, nucleotide composition, expression level of the gene, relative replication time in the cell cycle, level of evolutionary conservation and other gene characteristics (totaling 11) were used as predictors of the number of somatic mutations. We applied stepwise multiple linear regression to predict the number of mutations per gene. Because missense, nonsense, and frameshift mutations are associated with different sets of gene characteristics, they were modeled separately. Gene characteristics explain 88% of the variation in the number of missense, 40% of nonsense, and 23% of frameshift mutations. Comparisons of the observed and expected numbers of mutations identified genes with a higher than expected number of mutations (positive outliers). Many of these are known driver genes. A number of novel candidate driver genes were also identified. 
CONCLUSIONS: By comparing the observed and predicted number of mutations in a gene, we have identified known cancer-associated genes as well as 111 novel cancer-associated genes. We also showed that adding the number of silent mutations per gene reported by genome/exome-wide screens across all cancer types (COSMIC data) as a predictor yields a prediction accuracy that substantially exceeds that of the most popular cancer gene prediction tool, MutSigCV. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2455-0) contains supplementary material, which is available to authorized users. BioMed Central 2018-11-19 /pmc/articles/PMC6245819/ /pubmed/30453881 http://dx.doi.org/10.1186/s12859-018-2455-0 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Gorlov, Ivan P. Pikielny, Claudio W. Frost, Hildreth R. Her, Stephanie C. Cole, Michael D. Strohbehn, Samuel D. Wallace-Bradley, David Kimmel, Marek Gorlova, Olga Y. Amos, Christopher I. Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title | Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title_full | Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title_fullStr | Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title_full_unstemmed | Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title_short | Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
title_sort | gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245819/ https://www.ncbi.nlm.nih.gov/pubmed/30453881 http://dx.doi.org/10.1186/s12859-018-2455-0 |
work_keys_str_mv | AT gorlovivanp genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT pikielnyclaudiow genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT frosthildrethr genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT herstephaniec genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT colemichaeld genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT strohbehnsamueld genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT wallacebradleydavid genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT kimmelmarek genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT gorlovaolgay genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples AT amoschristopheri genecharacteristicspredictingmissensenonsenseandframeshiftmutationsintumorsamples |
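
Illustrative note: the description above outlines a general idea, namely regressing the per-gene mutation count on gene characteristics and then flagging genes whose observed count is well above the predicted one (positive outliers). The following is only a minimal sketch of that idea, not the authors' pipeline. It assumes plain ordinary least squares in place of the stepwise regression named in the abstract, uses two made-up predictors and synthetic counts rather than COSMIC data, and the function names `fit_expected_counts` and `positive_outliers` and the cutoff of 3 standard deviations are hypothetical choices for illustration.

```python
import numpy as np


def fit_expected_counts(features, observed):
    """Fit an ordinary least-squares model and return predicted counts.

    Assumption: plain OLS stands in for the stepwise multiple linear
    regression mentioned in the abstract.
    """
    # Prepend an intercept column: y = b0 + b1*x1 + b2*x2 + ...
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return design @ coef


def positive_outliers(observed, expected, z_cutoff=3.0):
    """Return indices of genes whose residual exceeds z_cutoff std devs.

    The cutoff is a hypothetical choice, not a value from the paper.
    """
    residuals = observed - expected
    z_scores = residuals / residuals.std(ddof=1)
    return np.flatnonzero(z_scores > z_cutoff)


# Synthetic toy data for illustration only: 200 hypothetical genes with
# two invented predictors (stand-ins for, e.g., gene size and GC content).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
noise = rng.normal(scale=2.0, size=200)
observed = 50 + 10 * features[:, 0] - 4 * features[:, 1] + noise

# Plant one gene with far more mutations than its features would predict.
observed[17] += 40

expected = fit_expected_counts(features, observed)
outliers = positive_outliers(observed, expected)
print("Genes flagged as positive outliers:", outliers.tolist())
```

In a real analysis, each mutation type (missense, nonsense, frameshift) would get its own model, as the abstract states, and the flagged genes would be compared against lists of known driver genes.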