OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs
MOTIVATION: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging o...
Main authors: | Sethna, Zachary; Elhanati, Yuval; Callan, Curtis G; Walczak, Aleksandra M; Mora, Thierry |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2019 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735909/ https://www.ncbi.nlm.nih.gov/pubmed/30657870 http://dx.doi.org/10.1093/bioinformatics/btz035 |
_version_ | 1783450432672104448 |
---|---|
author | Sethna, Zachary; Elhanati, Yuval; Callan, Curtis G; Walczak, Aleksandra M; Mora, Thierry |
author_sort | Sethna, Zachary |
collection | PubMed |
description | MOTIVATION: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem. RESULTS: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/zsethna/OLGA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6735909 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6735909 2019-09-16 Bioinformatics Original Papers
Oxford University Press 2019-09-01 2019-01-18 /pmc/articles/PMC6735909/ /pubmed/30657870 http://dx.doi.org/10.1093/bioinformatics/btz035 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs |
title_sort | olga: fast computation of generation probabilities of b- and t-cell receptor amino acid sequences and motifs |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6735909/ https://www.ncbi.nlm.nih.gov/pubmed/30657870 http://dx.doi.org/10.1093/bioinformatics/btz035 |
work_keys_str_mv | AT sethnazachary olgafastcomputationofgenerationprobabilitiesofbandtcellreceptoraminoacidsequencesandmotifs AT elhanatiyuval olgafastcomputationofgenerationprobabilitiesofbandtcellreceptoraminoacidsequencesandmotifs AT callancurtisg olgafastcomputationofgenerationprobabilitiesofbandtcellreceptoraminoacidsequencesandmotifs AT walczakaleksandram olgafastcomputationofgenerationprobabilitiesofbandtcellreceptoraminoacidsequencesandmotifs AT morathierry olgafastcomputationofgenerationprobabilitiesofbandtcellreceptoraminoacidsequencesandmotifs |
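
The description in this record contrasts brute-force summation over every nucleotide sequence that translates to a given amino acid sequence with a dynamic-programming approach. The Python sketch below illustrates that contrast on a deliberately simplified toy model: nucleotides are drawn from a first-order Markov chain whose probabilities (`P0`, `TRANS`) are invented for illustration. It is not OLGA's V(D)J recombination model or its code, and every function name here is hypothetical. Only the standard genetic code is taken as fact.

```python
from itertools import product
from math import isclose, prod

BASES = "ACGT"

# Standard genetic code in the conventional TCAG ordering ('*' marks stop codons).
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}  # amino acid -> list of codons that encode it
for (i, a), (j, b), (k, c) in product(list(enumerate("TCAG")), repeat=3):
    CODONS.setdefault(_AMINO[16 * i + 4 * j + k], []).append(a + b + c)

# ASSUMPTION: toy Markov model with made-up numbers, not parameters from the paper.
P0 = {"A": 0.1, "C": 0.3, "G": 0.4, "T": 0.2}  # first-nucleotide distribution
TRANS = {  # TRANS[prev][cur] = P(cur | prev); each row sums to 1
    "A": {"A": 0.40, "C": 0.20, "G": 0.30, "T": 0.10},
    "C": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "G": {"A": 0.10, "C": 0.30, "G": 0.20, "T": 0.40},
    "T": {"A": 0.30, "C": 0.30, "G": 0.10, "T": 0.30},
}


def nt_prob(nt_seq):
    """Probability of one non-empty nucleotide sequence under the toy model."""
    p = P0[nt_seq[0]]
    for prev, cur in zip(nt_seq, nt_seq[1:]):
        p *= TRANS[prev][cur]
    return p


def count_nt_sequences(aa_seq):
    """Number of nucleotide sequences that translate to aa_seq."""
    return prod(len(CODONS[aa]) for aa in aa_seq)


def brute_force_pgen(aa_seq):
    """Sum nt_prob over every synonymous nucleotide sequence (exponential cost)."""
    return sum(nt_prob("".join(codons))
               for codons in product(*(CODONS[aa] for aa in aa_seq)))


def dp_pgen(aa_seq):
    """Same sum for a non-empty aa_seq, computed one codon at a time (linear cost)."""
    # weight[b]: total probability of all nucleotide prefixes that encode the
    # amino acids processed so far and end in base b.
    weight = None
    for aa in aa_seq:
        new_weight = dict.fromkeys(BASES, 0.0)
        for codon in CODONS[aa]:
            if weight is None:
                enter = P0[codon[0]]  # very first nucleotide of the sequence
            else:
                enter = sum(weight[b] * TRANS[b][codon[0]] for b in BASES)
            inside = TRANS[codon[0]][codon[1]] * TRANS[codon[1]][codon[2]]
            new_weight[codon[2]] += enter * inside
        weight = new_weight
    return sum(weight.values())


seq = "CASSL"
print(count_nt_sequences(seq))  # 1728 synonymous nucleotide sequences
print(brute_force_pgen(seq), dp_pgen(seq))
```

The number of synonymous nucleotide sequences is the product of the codon counts per residue, so it grows exponentially with sequence length, while the dynamic-programming pass only carries four running totals (one per final base) from codon to codon. Both functions return the same value up to floating-point rounding on short inputs, which is a convenient check on the recursion.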