Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming
To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random process called V(D)J recombination. Receptor diversity is achieved during this process through both the combinatorial asse...
Main authors: | Russell, Magdalena L; Simon, Noah; Bradley, Philip; Matsen, Frederick A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | eLife Sciences Publications, Ltd 2023 |
Subjects: | Computational and Systems Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10212571/ https://www.ncbi.nlm.nih.gov/pubmed/37227256 http://dx.doi.org/10.7554/eLife.85145 |
_version_ | 1785047444595146752 |
---|---|
author | Russell, Magdalena L; Simon, Noah; Bradley, Philip; Matsen, Frederick A |
author_facet | Russell, Magdalena L; Simon, Noah; Bradley, Philip; Matsen, Frederick A |
author_sort | Russell, Magdalena L |
collection | PubMed |
description | To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random process called V(D)J recombination. Receptor diversity is achieved during this process through both the combinatorial assembly of V(D)J-genes and the junctional deletion and insertion of nucleotides. While the Artemis protein is often regarded as the main nuclease involved in V(D)J recombination, the exact mechanism of nucleotide trimming is not understood. Using a previously published TCRβ repertoire sequencing data set, we have designed a flexible probabilistic model of nucleotide trimming that allows us to explore various mechanistically interpretable sequence-level features. We show that local sequence context, length, and GC nucleotide content in both directions of the wider sequence, together, can most accurately predict the trimming probabilities of a given V-gene sequence. Because GC nucleotide content is predictive of sequence-breathing, this model provides quantitative statistical evidence regarding the extent to which double-stranded DNA may need to be able to breathe for trimming to occur. We also see evidence of a sequence motif that appears to get preferentially trimmed, independent of GC-content-related effects. Further, we find that the inferred coefficients from this model provide accurate prediction for V- and J-gene sequences from other adaptive immune receptor loci. These results refine our understanding of how the Artemis nuclease may function to trim nucleotides during V(D)J recombination and provide another step toward understanding how V(D)J recombination generates diverse receptors and supports a powerful, unique immune response in healthy humans. |
format | Online Article Text |
id | pubmed-10212571 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | eLife Sciences Publications, Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-10212571 2023-05-26 Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming Russell, Magdalena L Simon, Noah Bradley, Philip Matsen, Frederick A eLife Computational and Systems Biology To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random process called V(D)J recombination. Receptor diversity is achieved during this process through both the combinatorial assembly of V(D)J-genes and the junctional deletion and insertion of nucleotides. While the Artemis protein is often regarded as the main nuclease involved in V(D)J recombination, the exact mechanism of nucleotide trimming is not understood. Using a previously published TCRβ repertoire sequencing data set, we have designed a flexible probabilistic model of nucleotide trimming that allows us to explore various mechanistically interpretable sequence-level features. We show that local sequence context, length, and GC nucleotide content in both directions of the wider sequence, together, can most accurately predict the trimming probabilities of a given V-gene sequence. Because GC nucleotide content is predictive of sequence-breathing, this model provides quantitative statistical evidence regarding the extent to which double-stranded DNA may need to be able to breathe for trimming to occur. We also see evidence of a sequence motif that appears to get preferentially trimmed, independent of GC-content-related effects. Further, we find that the inferred coefficients from this model provide accurate prediction for V- and J-gene sequences from other adaptive immune receptor loci. These results refine our understanding of how the Artemis nuclease may function to trim nucleotides during V(D)J recombination and provide another step toward understanding how V(D)J recombination generates diverse receptors and supports a powerful, unique immune response in healthy humans. eLife Sciences Publications, Ltd 2023-05-25 /pmc/articles/PMC10212571/ /pubmed/37227256 http://dx.doi.org/10.7554/eLife.85145 Text en © 2023, Russell et al https://creativecommons.org/licenses/by/4.0/ This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited. |
spellingShingle | Computational and Systems Biology Russell, Magdalena L Simon, Noah Bradley, Philip Matsen, Frederick A Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title | Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title_full | Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title_fullStr | Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title_full_unstemmed | Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title_short | Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming |
title_sort | statistical inference reveals the role of length, gc content, and local sequence in v(d)j nucleotide trimming |
topic | Computational and Systems Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10212571/ https://www.ncbi.nlm.nih.gov/pubmed/37227256 http://dx.doi.org/10.7554/eLife.85145 |
work_keys_str_mv | AT russellmagdalenal statisticalinferencerevealstheroleoflengthgccontentandlocalsequenceinvdjnucleotidetrimming AT simonnoah statisticalinferencerevealstheroleoflengthgccontentandlocalsequenceinvdjnucleotidetrimming AT bradleyphilip statisticalinferencerevealstheroleoflengthgccontentandlocalsequenceinvdjnucleotidetrimming AT matsenfredericka statisticalinferencerevealstheroleoflengthgccontentandlocalsequenceinvdjnucleotidetrimming |