
A Bayesian Framework for Inferring the Influence of Sequence Context on Point Mutations

The probability of point mutations is expected to be highly influenced by the flanking nucleotides that surround them, known as the sequence context. This phenomenon may be mainly attributed to the enzyme that modifies or mutates the genetic material, because most enzymes tend to have specific seque...


Bibliographic Details
Main authors: Ling, Guy; Miller, Danielle; Nielsen, Rasmus; Stern, Adi
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7038660/
https://www.ncbi.nlm.nih.gov/pubmed/31651955
http://dx.doi.org/10.1093/molbev/msz248
collection PubMed
description The probability of point mutations is expected to be highly influenced by the flanking nucleotides that surround them, known as the sequence context. This phenomenon may be mainly attributed to the enzyme that modifies or mutates the genetic material, because most enzymes tend to have specific sequence contexts that dictate their activity. Here, we develop a statistical model that allows for the detection and evaluation of the effects of different sequence contexts on mutation rates from deep population sequencing data. This task is computationally challenging, as the complexity of the model increases exponentially as the context size increases. We established our novel Bayesian method based on sparse model selection methods, with the leading assumption that the number of actual sequence contexts that directly influence mutation rates is minuscule compared with the number of possible sequence contexts. We show that our method is highly accurate on simulated data using pentanucleotide contexts, even when accounting for noisy data. We next analyze empirical population sequencing data from polioviruses and HIV-1 and detect a significant enrichment in sequence contexts associated with deamination by the cellular deaminases ADAR 1/2 and APOBEC3G, respectively. In the current era, where next-generation sequencing data are highly abundant, our approach can be used on any population sequencing data to reveal context-dependent base alterations and may assist in the discovery of novel mutable sites or editing sites.
id pubmed-7038660
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7038660 2020-03-02 A Bayesian Framework for Inferring the Influence of Sequence Context on Point Mutations. Ling, Guy; Miller, Danielle; Nielsen, Rasmus; Stern, Adi. Mol Biol Evol. Methods. Oxford University Press 2020-03 2019-11-05 /pmc/articles/PMC7038660/ /pubmed/31651955 http://dx.doi.org/10.1093/molbev/msz248 Text en
© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Methods