Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes

Bibliographic Details
Main Authors:	Zhao, Weilong; Sher, Xinwei
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2018
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6224037/ https://www.ncbi.nlm.nih.gov/pubmed/30408041 http://dx.doi.org/10.1371/journal.pcbi.1006457

author	Zhao, Weilong; Sher, Xinwei
collection	PubMed
description	A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite recent advances in generating high-quality MHC-eluted, naturally processed ligandomes, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking of an extensive set of MHC-binding predictors using newly available, untested data on both synthetic and naturally processed epitopes. The blind test set includes 32 human leukocyte antigen (HLA) class I and 24 HLA class II alleles. Artificial neural network (ANN)-based approaches performed better than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, the ANN-based mhcflurry and nn_align performed best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (area under the curve, AUC = 0.911). NetMHCpan4 demonstrated comparable predictive power. Our customization of mhcflurry into a pan-HLA predictor achieved accuracy similar to that of NetMHCpan. The overall accuracy of these methods is comparable between 9-mer and 10-mer test data. However, the top methods deliver low correlations between the predicted and experimental affinities for strong MHC binders. When applied to naturally processed MHC ligands, tools trained on elution data (NetMHCpan4 and MixMHCpred) show better accuracy than pure binding affinity predictors. The false prediction rate varies considerably among HLA types and datasets. Finally, the structure-based predictor Rosetta FlexPepDock performs less well than the machine learning approaches. By benchmarking MHC-binding and MHC-elution predictors with a comprehensive set of metrics, we present an unbiased view for establishing best practice in T-cell epitope prediction, facilitating future development of methods in immunogenomics.
format	Online Article Text
id	pubmed-6224037
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6224037 2018-11-19 Zhao, Weilong; Sher, Xinwei. PLoS Comput Biol. Research Article. Public Library of Science 2018-11-08 /pmc/articles/PMC6224037/ /pubmed/30408041 http://dx.doi.org/10.1371/journal.pcbi.1006457 Text en © 2018 Zhao, Sher http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes
