
Modeling and analysis of site-specific mutations in cancer identifies known plus putative novel hotspots and bias due to contextual sequences

In cancer, recurrently mutated sites in DNA and proteins, called hotspots, are thought to arise through positive selection and are therefore considered important because of their potential functional impact. Although recent evidence on APOBEC enzymatic activity has shown that specific types of sequences are likely to b...


Bibliographic Details
Main author: Trevino, Victor
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7339035/
https://www.ncbi.nlm.nih.gov/pubmed/32670506
http://dx.doi.org/10.1016/j.csbj.2020.06.022
author Trevino, Victor
collection PubMed
description In cancer, recurrently mutated sites in DNA and proteins, called hotspots, are thought to arise through positive selection and are therefore considered important because of their potential functional impact. Although recent evidence on APOBEC enzymatic activity has shown that specific types of sequences are likely to produce false hotspots, identifying putative hotspots remains important to confirm either their functional role or their mechanistic bias. In this work, an algorithm and a statistical model are presented to detect hotspots. The model consists of a beta-binomial component plus fixed effects and efficiently fits the distribution of mutated sites. The algorithm employs an optimal stepwise approach to find the model parameters. Simulations show that the proposed algorithmic model is highly accurate for common hotspots. The approach was applied to TCGA mutational data from 33 cancer types. The results show that well-known cancer hotspots are easily detected, and that novel hotspots are detected as well. An analysis of the sequence context of the detected hotspots shows a preference for TCG sites that may be related to APOBEC or other unknown mechanistic biases. The detected hotspots are available online at http://bioinformatica.mty.itesm.mx/HotSpotsAnnotations.
format Online
Article
Text
id pubmed-7339035
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-7339035 2020-07-14 Trevino, Victor Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2020-06-20 /pmc/articles/PMC7339035/ /pubmed/32670506 http://dx.doi.org/10.1016/j.csbj.2020.06.022 Text en © 2020 The Author. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Modeling and analysis of site-specific mutations in cancer identifies known plus putative novel hotspots and bias due to contextual sequences
topic Research Article