A Bayesian model based computational analysis of the relationship between bisulfite accessible single-stranded DNA in chromatin and somatic hypermutation of immunoglobulin genes

The B cells in our body generate protective antibodies by introducing somatic hypermutations (SHM) into the variable region of immunoglobulin genes (IgVs). The mutations are generated by activation induced deaminase (AID) that converts cytosine to uracil in single stranded DNA (ssDNA) generated duri...

Bibliographic Details
Main authors: Yu, Guojun, Wu, Yingru, Duan, Zhi, Tang, Catherine, Xing, Haipeng, Scharff, Matthew D., MacCarthy, Thomas
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8462741/
https://www.ncbi.nlm.nih.gov/pubmed/34491985
http://dx.doi.org/10.1371/journal.pcbi.1009323
author Yu, Guojun
Wu, Yingru
Duan, Zhi
Tang, Catherine
Xing, Haipeng
Scharff, Matthew D.
MacCarthy, Thomas
collection PubMed
description The B cells in our body generate protective antibodies by introducing somatic hypermutations (SHM) into the variable region of immunoglobulin genes (IgVs). The mutations are generated by activation induced deaminase (AID) that converts cytosine to uracil in single stranded DNA (ssDNA) generated during transcription. Attempts have been made to correlate SHM with ssDNA using bisulfite to chemically convert cytosines that are accessible in the intact chromatin of mutating B cells. These studies have been complicated by using different definitions of “bisulfite accessible regions” (BARs). Recently, deep-sequencing has provided much larger datasets of such regions but computational methods are needed to enable this analysis. Here we leveraged the deep-sequencing approach with unique molecular identifiers and developed a novel Hidden Markov Model based Bayesian Segmentation algorithm to characterize the ssDNA regions in the IGHV4-34 gene of the human Ramos B cell line. Combining hierarchical clustering and our new Bayesian model, we identified recurrent BARs in certain subregions of both top and bottom strands of this gene. Using this new system, the average size of BARs is about 15 bp. We also identified potential G-quadruplex DNA structures in this gene and found that the BARs co-locate with G-quadruplex structures in the opposite strand. Using various correlation analyses, there is not a direct site-to-site relationship between the bisulfite accessible ssDNA and all sites of SHM but most of the highly AID mutated sites are within 15 bp of a BAR. In summary, we developed a novel platform to study single stranded DNA in chromatin at a base pair resolution that reveals potential relationships among BARs, SHM and G-quadruplexes. This platform could be applied to genome wide studies in the future.
format Online
Article
Text
id pubmed-8462741
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8462741 2021-09-25 PLoS Comput Biol Research Article Public Library of Science 2021-09-07 /pmc/articles/PMC8462741/ /pubmed/34491985 http://dx.doi.org/10.1371/journal.pcbi.1009323 Text en © 2021 Yu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Bayesian model based computational analysis of the relationship between bisulfite accessible single-stranded DNA in chromatin and somatic hypermutation of immunoglobulin genes
topic Research Article