Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to...

Bibliographic Details
Main Authors: Smith, Thomas C.A., Carr, Antony M., Eyre-Walker, Adam C.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5036107/
https://www.ncbi.nlm.nih.gov/pubmed/27688957
http://dx.doi.org/10.7717/peerj.2391
author Smith, Thomas C.A.
Carr, Antony M.
Eyre-Walker, Adam C.
collection PubMed
description Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ∼4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold between sites.
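As a rough illustration of what "hit more often than one would expect by chance" can mean, the sketch below computes how many sites would be expected to carry exactly k SNVs if every site had the same chance of being hit. It is only a sketch under stated assumptions, not the authors' method: it uses a simple Poisson approximation, ignores the adjacent-nucleotide effects the abstract mentions, and the genome size and function names are hypothetical choices made for this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def expected_sites_with_hits(n_sites: int, n_snvs: int, k: int) -> float:
    """Expected number of sites hit exactly k times under a uniform rate."""
    lam = n_snvs / n_sites  # mean number of SNVs per site
    return n_sites * poisson_pmf(k, lam)


# Illustrative inputs only (assumptions, not values from the paper's analysis):
GENOME_SITES = 3_000_000_000  # rough size of a human genome, assumed here
TOTAL_SNVS = 3_000_000        # the abstract reports ">3 million SNVs"

for k in (1, 2, 3):
    print(k, expected_sites_with_hits(GENOME_SITES, TOTAL_SNVS, k))
```

Under this simplified null, comparing the observed count of sites with k hits against the expected count gives a first indication of excess recurrence. A fuller analysis would condition on sequence context, as the abstract describes.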
format Online
Article
Text
id pubmed-5036107
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-50361072016-09-29 Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors? Smith, Thomas C.A. Carr, Antony M. Eyre-Walker, Adam C. PeerJ Bioinformatics Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ∼4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold between sites. PeerJ Inc. 2016-09-20 /pmc/articles/PMC5036107/ /pubmed/27688957 http://dx.doi.org/10.7717/peerj.2391 Text en ©2016 Smith et al. 
http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5036107/
https://www.ncbi.nlm.nih.gov/pubmed/27688957
http://dx.doi.org/10.7717/peerj.2391