epiTCR: a highly sensitive predictor for TCR–peptide binding

MOTIVATION: Predicting the binding between T-cell receptor (TCR) and peptide presented by human leucocyte antigen molecule is a highly challenging task and a key bottleneck in the development of immunotherapy. Existing prediction tools, despite exhibiting good performance on the datasets they were b...

Bibliographic Details
Main Authors: Pham, My-Diem Nguyen; Nguyen, Thanh-Nhan; Tran, Le Son; Nguyen, Que-Tran Bui; Nguyen, Thien-Phuc Hoang; Pham, Thi Mong Quynh; Nguyen, Hoai-Nghia; Giang, Hoa; Phan, Minh-Duy; Nguyen, Vy
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Original Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10159657/
https://www.ncbi.nlm.nih.gov/pubmed/37094220
http://dx.doi.org/10.1093/bioinformatics/btad284
author Pham, My-Diem Nguyen
Nguyen, Thanh-Nhan
Tran, Le Son
Nguyen, Que-Tran Bui
Nguyen, Thien-Phuc Hoang
Pham, Thi Mong Quynh
Nguyen, Hoai-Nghia
Giang, Hoa
Phan, Minh-Duy
Nguyen, Vy
collection PubMed
description MOTIVATION: Predicting the binding between T-cell receptor (TCR) and peptide presented by human leucocyte antigen molecule is a highly challenging task and a key bottleneck in the development of immunotherapy. Existing prediction tools, despite exhibiting good performance on the datasets they were built with, suffer from low true positive rates when used to predict epitopes capable of eliciting T-cell responses in patients. Therefore, an improved tool for TCR–peptide prediction built upon a large dataset combining existing publicly available data is still needed. RESULTS: We collected data from five public databases (IEDB, TBAdb, VDJdb, McPAS-TCR, and 10X) to form a dataset of >3 million TCR–peptide pairs, 3.27% of which were binding interactions. We proposed epiTCR, a Random Forest-based method dedicated to predicting the TCR–peptide interactions. epiTCR used simple input of TCR CDR3β sequences and antigen sequences, which are encoded by flattened BLOSUM62. epiTCR performed with area under the curve (0.98) and higher sensitivity (0.94) than other existing tools (NetTCR, Imrex, ATM-TCR, and pMTnet), while maintaining comparable prediction specificity (0.9). We identified seven epitopes that contributed to 98.67% of false positives predicted by epiTCR and exerted similar effects on other tools. We also demonstrated a considerable influence of peptide sequences on prediction, highlighting the need for more diverse peptides in a more balanced dataset. In conclusion, epiTCR is among the most well-performing tools, thanks to the use of combined data from public sources and its use will contribute to the quest in identifying neoantigens for precision cancer immunotherapy. AVAILABILITY AND IMPLEMENTATION: epiTCR is available on GitHub (https://github.com/ddiem-ri-4D/epiTCR).
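The sensitivity and specificity values quoted in the description follow the standard confusion-matrix definitions: sensitivity is the fraction of true binding pairs that are predicted as binding, and specificity is the fraction of non-binding pairs that are predicted as non-binding. The snippet below is a minimal sketch of that calculation only. It is not taken from the epiTCR repository; the function names and the toy labels are illustrative assumptions.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FN, TN, FP) for binary labels: 1 = binding, 0 = non-binding."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth not in (0, 1) or pred not in (0, 1):
            raise ValueError("labels must be 0 or 1")
        if truth == 1 and pred == 1:
            tp += 1  # binding pair correctly predicted as binding
        elif truth == 1 and pred == 0:
            fn += 1  # binding pair missed by the predictor
        elif truth == 0 and pred == 0:
            tn += 1  # non-binding pair correctly rejected
        else:
            fp += 1  # non-binding pair wrongly predicted as binding
    return tp, fn, tn, fp


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else float("nan")


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    total = tn + fp
    return tn / total if total else float("nan")


if __name__ == "__main__":
    # Toy labels, invented for illustration (not data from the paper).
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    print(f"sensitivity={sensitivity(tp, fn):.3f} specificity={specificity(tn, fp):.3f}")
```

Because binding pairs make up only a small share of the dataset described above, the two rates are reported separately: overall accuracy would be dominated by the non-binding class.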
format Online
Article
Text
id pubmed-10159657
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10159657 2023-05-05 epiTCR: a highly sensitive predictor for TCR–peptide binding Bioinformatics Original Paper Oxford University Press 2023-04-24 /pmc/articles/PMC10159657/ /pubmed/37094220 http://dx.doi.org/10.1093/bioinformatics/btad284 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title epiTCR: a highly sensitive predictor for TCR–peptide binding
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10159657/
https://www.ncbi.nlm.nih.gov/pubmed/37094220
http://dx.doi.org/10.1093/bioinformatics/btad284