
BERTrand—peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing


Bibliographic Details
Main authors: Myronov, Alexander; Mazzocco, Giovanni; Król, Paulina; Plewczynski, Dariusz
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444968/
https://www.ncbi.nlm.nih.gov/pubmed/37535685
http://dx.doi.org/10.1093/bioinformatics/btad468
Description
Summary: MOTIVATION: The advent of T-cell receptor (TCR) sequencing experiments has significantly increased the amount of peptide:TCR binding data available, and a number of machine-learning models have appeared in recent years. High-quality prediction models for a fixed epitope sequence are feasible, provided enough known binding TCR sequences are available. However, their performance drops significantly for previously unseen peptides. RESULTS: We prepare a dataset of known peptide:TCR binders and augment it with negative decoys created using healthy donors’ T-cell repertoires. We employ deep learning methods commonly applied in natural language processing to train a peptide:TCR binding model with a degree of cross-peptide generalization (0.69 AUROC). We demonstrate that BERTrand outperforms the published methods when evaluated on peptide sequences not used during model training. AVAILABILITY AND IMPLEMENTATION: The datasets and the code for model training are available at https://github.com/SFGLab/bertrand.
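
Illustrative sketch (not taken from the BERTrand repository): the summary describes two ideas that are easy to misread, namely augmenting known binders with negative decoys drawn from healthy donors' repertoires, and evaluating on peptides that were not used in training. The Python below is a minimal sketch of both under stated assumptions. The function names `make_decoys` and `split_by_peptide`, the one-decoy-per-binder ratio, the held-out fraction, and all sequences are hypothetical placeholders chosen for illustration; the authors' actual procedure may differ in every one of these details.

```python
import random


def make_decoys(positives, reference_tcrs, per_positive=1, seed=0):
    """Pair each peptide with randomly drawn reference TCRs as presumed non-binders.

    positives: list of (peptide, tcr) pairs that are known binders.
    reference_tcrs: TCR sequences from a background (e.g. healthy-donor) repertoire.
    Returns a list of (peptide, tcr, 0) tuples. Assumption: a random pairing is
    treated as a negative, which is a heuristic and not a measured non-binder.
    """
    rng = random.Random(seed)
    binder_tcrs = {tcr for _, tcr in positives}
    # Exclude any reference TCR that is already a known binder.
    pool = [tcr for tcr in reference_tcrs if tcr not in binder_tcrs]
    decoys = []
    for peptide, _ in positives:
        for tcr in rng.sample(pool, per_positive):
            decoys.append((peptide, tcr, 0))
    return decoys


def split_by_peptide(examples, held_out_fraction=0.25, seed=0):
    """Split so that no peptide appears in both the training and the test set."""
    rng = random.Random(seed)
    peptides = sorted({example[0] for example in examples})
    rng.shuffle(peptides)
    n_test = max(1, round(len(peptides) * held_out_fraction))
    test_peptides = set(peptides[:n_test])
    train = [ex for ex in examples if ex[0] not in test_peptides]
    test = [ex for ex in examples if ex[0] in test_peptides]
    return train, test


# Toy sequences for illustration only; these are not real binding data.
positives = [
    ("AAAPEPTIDE", "CASSAAAF"),
    ("AAAPEPTIDE", "CASSBBBF"),
    ("CCCPEPTIDE", "CASSCCCF"),
    ("DDDPEPTIDE", "CASSDDDF"),
    ("EEEPEPTIDE", "CASSEEEF"),
]
reference_tcrs = ["CASSXXXF", "CASSYYYF", "CASSZZZF", "CASSAAAF"]

dataset = [(p, t, 1) for p, t in positives] + make_decoys(positives, reference_tcrs)
train, test = split_by_peptide(dataset)
```

Splitting by peptide rather than by row is what makes a reported score a measure of cross-peptide generalization: a row-level split would let the same peptide appear on both sides and would measure the easier fixed-epitope setting instead.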