
BERTrand—peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing

MOTIVATION: The advent of T-cell receptor (TCR) sequencing experiments allowed for a significant increase in the amount of peptide:TCR binding data available and a number of machine-learning models appeared in recent years. High-quality prediction models for a fixed epitope sequence are feasible, pr...


Bibliographic Details
Main authors: Myronov, Alexander; Mazzocco, Giovanni; Król, Paulina; Plewczynski, Dariusz
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444968/
https://www.ncbi.nlm.nih.gov/pubmed/37535685
http://dx.doi.org/10.1093/bioinformatics/btad468
_version_ 1785094070913204224
author Myronov, Alexander
Mazzocco, Giovanni
Król, Paulina
Plewczynski, Dariusz
author_facet Myronov, Alexander
Mazzocco, Giovanni
Król, Paulina
Plewczynski, Dariusz
author_sort Myronov, Alexander
collection PubMed
description MOTIVATION: The advent of T-cell receptor (TCR) sequencing experiments allowed for a significant increase in the amount of peptide:TCR binding data available and a number of machine-learning models appeared in recent years. High-quality prediction models for a fixed epitope sequence are feasible, provided enough known binding TCR sequences are available. However, their performance drops significantly for previously unseen peptides. RESULTS: We prepare the dataset of known peptide:TCR binders and augment it with negative decoys created using healthy donors’ T-cell repertoires. We employ deep learning methods commonly applied in Natural Language Processing to train a peptide:TCR binding model with a degree of cross-peptide generalization (0.69 AUROC). We demonstrate that BERTrand outperforms the published methods when evaluated on peptide sequences not used during model training. AVAILABILITY AND IMPLEMENTATION: The datasets and the code for model training are available at https://github.com/SFGLab/bertrand.
format Online
Article
Text
id pubmed-10444968
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10444968 2023-08-24 BERTrand—peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing Myronov, Alexander Mazzocco, Giovanni Król, Paulina Plewczynski, Dariusz Bioinformatics Original Paper MOTIVATION: The advent of T-cell receptor (TCR) sequencing experiments allowed for a significant increase in the amount of peptide:TCR binding data available and a number of machine-learning models appeared in recent years. High-quality prediction models for a fixed epitope sequence are feasible, provided enough known binding TCR sequences are available. However, their performance drops significantly for previously unseen peptides. RESULTS: We prepare the dataset of known peptide:TCR binders and augment it with negative decoys created using healthy donors’ T-cell repertoires. We employ deep learning methods commonly applied in Natural Language Processing to train a peptide:TCR binding model with a degree of cross-peptide generalization (0.69 AUROC). We demonstrate that BERTrand outperforms the published methods when evaluated on peptide sequences not used during model training. AVAILABILITY AND IMPLEMENTATION: The datasets and the code for model training are available at https://github.com/SFGLab/bertrand. Oxford University Press 2023-08-03 /pmc/articles/PMC10444968/ /pubmed/37535685 http://dx.doi.org/10.1093/bioinformatics/btad468 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title BERTrand—peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444968/
https://www.ncbi.nlm.nih.gov/pubmed/37535685
http://dx.doi.org/10.1093/bioinformatics/btad468
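The abstract describes two ideas that a short example may make more concrete: augmenting known binders with negative decoys drawn from healthy donors' repertoires, and scoring on peptides held out of training with AUROC. The Python sketch below illustrates both under simplifying assumptions. It is not the authors' implementation (that is in the repository linked in the abstract). The function names `make_decoys`, `split_by_peptide`, and `auroc`, the uniform random sampling, and the toy sequences are all hypothetical choices made for illustration.

```python
import random
from typing import List, Set, Tuple

Pair = Tuple[str, str]          # (peptide, TCR sequence)
Example = Tuple[str, str, int]  # (peptide, TCR sequence, label)


def make_decoys(binders: List[Pair], background_tcrs: List[str],
                ratio: int = 1, seed: int = 0) -> List[Example]:
    """Pair each binder's peptide with randomly drawn background TCRs.

    Assumption: background TCRs (e.g. from healthy donors) are treated
    as non-binders, so every generated pair gets label 0.
    """
    rng = random.Random(seed)
    known = set(binders)
    decoys: List[Example] = []
    for peptide, _tcr in binders:
        for _ in range(ratio):
            candidate = rng.choice(background_tcrs)
            # Skip the rare case where the random pair is a known binder.
            if (peptide, candidate) not in known:
                decoys.append((peptide, candidate, 0))
    return decoys


def split_by_peptide(examples: List[Example],
                     test_peptides: Set[str]) -> Tuple[List[Example], List[Example]]:
    """Hold out whole peptides so test peptides never appear in training."""
    train = [e for e in examples if e[0] not in test_peptides]
    test = [e for e in examples if e[0] in test_peptides]
    return train, test


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy sequences, used only as placeholders.
    binders = [("PEPTIDEA", "CASSAAAF"), ("PEPTIDEB", "CASSBBBF")]
    background = ["CASSXXXF", "CASSYYYF", "CASSZZZF"]
    examples = [(p, t, 1) for p, t in binders] + make_decoys(binders, background, ratio=2)
    train, test = split_by_peptide(examples, {"PEPTIDEB"})
    print(len(train), len(test))
```

Splitting by peptide rather than by individual pair is what makes the reported score a measure of cross-peptide generalization: a random pair-level split would let the same peptide appear on both sides.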