BERTMHC: improved MHC–peptide class II interaction prediction with transformer and multiple instance learning

Bibliographic Details
Main authors: Cheng, Jun; Bendjama, Kaïdre; Rittner, Karola; Malone, Brandon
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502151/
https://www.ncbi.nlm.nih.gov/pubmed/34096999
http://dx.doi.org/10.1093/bioinformatics/btab422
Description
Summary:
MOTIVATION: Increasingly comprehensive characterization of cancer-associated genetic alterations has paved the way for the development of highly specific therapeutic vaccines. Precisely predicting the binding and presentation of peptides to major histocompatibility complex (MHC) alleles is an important step toward such therapies. Recent data suggest that presentation of both class I and class II epitopes is critical for inducing a sustained, effective immune response. However, prediction performance for MHC class II has been limited compared to class I.
RESULTS: We present a transformer neural network model which leverages self-supervised pretraining on a large corpus of protein sequences. We also propose a multiple instance learning (MIL) framework to deconvolve mass spectrometry data in which any of several MHC alleles may have presented each peptide. We show that pretraining boosted performance on these tasks. Combining pretraining and the novel MIL approach, our model outperforms state-of-the-art models based on peptide and MHC sequence only for both binding and cell surface presentation predictions.
AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/s6juncheng/BERTMHC under a noncommercial license. A webserver is available at https://bertmhc.privacy.nlehd.de/
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
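The multiple instance learning idea in the summary can be illustrated with a minimal sketch. It assumes max pooling over alleles, which is one common MIL formulation and not necessarily the pooling used by the authors. Each peptide observed in a multi-allelic sample forms a "bag" of (peptide, allele) instances, and the bag counts as presented if at least one allele presents the peptide. The function names, allele labels, and scores below are hypothetical and are not taken from the BERTMHC code.

```python
import math

def bag_probability(instance_scores):
    """Max-pooling MIL: a bag is as positive as its highest-scoring instance."""
    if not instance_scores:
        raise ValueError("a bag needs at least one (peptide, allele) instance")
    return max(instance_scores)

def bag_loss(instance_scores, label):
    """Binary cross-entropy between the pooled bag probability and the bag label."""
    # Clamp to avoid log(0) when a score is exactly 0 or 1.
    p = min(max(bag_probability(instance_scores), 1e-7), 1 - 1e-7)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def most_likely_allele(alleles, instance_scores):
    """Deconvolution step: pick the allele whose instance score is highest."""
    best = max(range(len(alleles)), key=lambda i: instance_scores[i])
    return alleles[best]

# Hypothetical per-allele presentation probabilities for one peptide,
# standing in for the outputs of some per-instance predictor.
alleles = ["allele_A", "allele_B", "allele_C"]
scores = [0.1, 0.8, 0.3]

pooled = bag_probability(scores)
assigned = most_likely_allele(alleles, scores)
loss = bag_loss(scores, label=1)
```

Under this assumption only the bag-level label (peptide observed or not) is needed for training, while the per-allele scores provide the assignment of each peptide to a presenting allele.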