
BepiPred‐3.0: Improved B‐cell epitope prediction using protein language models

B‐cell epitope prediction tools are of great medical and commercial interest due to their practical applications in vaccine development and disease diagnostics. The introduction of protein language models (LMs), trained on unprecedentedly large datasets of protein sequences and structures, taps into a...


Bibliographic Details
Main authors: Clifford, Joakim Nøddeskov; Høie, Magnus Haraldson; Deleuran, Sebastian; Peters, Bjoern; Nielsen, Morten; Marcatili, Paolo
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679979/
https://www.ncbi.nlm.nih.gov/pubmed/36366745
http://dx.doi.org/10.1002/pro.4497
author Clifford, Joakim Nøddeskov
Høie, Magnus Haraldson
Deleuran, Sebastian
Peters, Bjoern
Nielsen, Morten
Marcatili, Paolo
collection PubMed
description B‐cell epitope prediction tools are of great medical and commercial interest due to their practical applications in vaccine development and disease diagnostics. The introduction of protein language models (LMs), trained on unprecedentedly large datasets of protein sequences and structures, taps into a powerful numeric representation that can be exploited to accurately predict local and global protein structural features from amino acid sequences alone. In this paper, we present BepiPred‐3.0, a sequence‐based epitope prediction tool that, by exploiting LM embeddings, greatly improves the prediction accuracy for both linear and conformational epitope prediction on several independent test sets. Furthermore, by carefully selecting additional input variables and the epitope residue annotation strategy, performance was further improved, thus achieving unprecedented predictive power. Our tool can predict epitopes across hundreds of sequences in minutes. It is freely available as a web server and a standalone package at https://services.healthtech.dtu.dk/service.php?BepiPred-3.0 with a user‐friendly interface to navigate the results.
format Online
Article
Text
id pubmed-9679979
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley & Sons, Inc.
record_format MEDLINE/PubMed
spelling pubmed-9679979 2022-12-01 BepiPred‐3.0: Improved B‐cell epitope prediction using protein language models. Clifford, Joakim Nøddeskov; Høie, Magnus Haraldson; Deleuran, Sebastian; Peters, Bjoern; Nielsen, Morten; Marcatili, Paolo. Protein Sci, Tools for Protein Science. John Wiley & Sons, Inc. 2022-12. /pmc/articles/PMC9679979/ /pubmed/36366745 http://dx.doi.org/10.1002/pro.4497 Text, en. © 2022 The Authors. Protein Science published by Wiley Periodicals LLC on behalf of The Protein Society. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title BepiPred‐3.0: Improved B‐cell epitope prediction using protein language models
topic Tools for Protein Science