Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming

Bibliographic Details
Main Authors: Russell, Magdalena L; Simon, Noah; Bradley, Philip; Matsen, Frederick A
Format: Online Article Text
Language: English
Published: eLife Sciences Publications, Ltd 2023
Subjects: Computational and Systems Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10212571/
https://www.ncbi.nlm.nih.gov/pubmed/37227256
http://dx.doi.org/10.7554/eLife.85145
_version_ 1785047444595146752
author Russell, Magdalena L
Simon, Noah
Bradley, Philip
Matsen, Frederick A
collection PubMed
description To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random process called V(D)J recombination. Receptor diversity is achieved during this process through both the combinatorial assembly of V(D)J-genes and the junctional deletion and insertion of nucleotides. While the Artemis protein is often regarded as the main nuclease involved in V(D)J recombination, the exact mechanism of nucleotide trimming is not understood. Using a previously published TCRβ repertoire sequencing data set, we have designed a flexible probabilistic model of nucleotide trimming that allows us to explore various mechanistically interpretable sequence-level features. We show that local sequence context, length, and GC nucleotide content in both directions of the wider sequence, together, can most accurately predict the trimming probabilities of a given V-gene sequence. Because GC nucleotide content is predictive of sequence-breathing, this model provides quantitative statistical evidence regarding the extent to which double-stranded DNA may need to be able to breathe for trimming to occur. We also see evidence of a sequence motif that appears to get preferentially trimmed, independent of GC-content-related effects. Further, we find that the inferred coefficients from this model provide accurate prediction for V- and J-gene sequences from other adaptive immune receptor loci. These results refine our understanding of how the Artemis nuclease may function to trim nucleotides during V(D)J recombination and provide another step toward understanding how V(D)J recombination generates diverse receptors and supports a powerful, unique immune response in healthy humans.
format Online
Article
Text
id pubmed-10212571
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher eLife Sciences Publications, Ltd
record_format MEDLINE/PubMed
spelling pubmed-10212571 2023-05-26 Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming. Russell, Magdalena L; Simon, Noah; Bradley, Philip; Matsen, Frederick A. eLife, Computational and Systems Biology. eLife Sciences Publications, Ltd, 2023-05-25. /pmc/articles/PMC10212571/ /pubmed/37227256 http://dx.doi.org/10.7554/eLife.85145 Text en © 2023, Russell et al. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited.
title Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming
topic Computational and Systems Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10212571/
https://www.ncbi.nlm.nih.gov/pubmed/37227256
http://dx.doi.org/10.7554/eLife.85145