
PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions

BACKGROUND: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction...


Bibliographic Details
Main Authors: Hertz, Tomer; Yanover, Chen
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1810314/
https://www.ncbi.nlm.nih.gov/pubmed/16723006
http://dx.doi.org/10.1186/1471-2105-7-S1-S3
_version_ 1782132576222183424
author Hertz, Tomer
Yanover, Chen
author_facet Hertz, Tomer
Yanover, Chen
author_sort Hertz, Tomer
collection PubMed
description BACKGROUND: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction is therefore an important task with applications to vaccine and drug design. METHODS: Previous learning approaches address the binding prediction problem using traditional margin-based binary classifiers. In this paper we propose PepDist: a novel approach for predicting binding affinity. Our approach is based on learning peptide-peptide distance functions. Moreover, we suggest learning a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1,2], which is a semi-supervised distance learning algorithm. RESULTS: We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our novel approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, our method's performance gain, when compared to other computational methods, is even more pronounced. We have recently uploaded the PepDist webserver, which provides binding prediction of peptides to 35 different MHC class I alleles. The webserver is powered by a prediction engine which was trained using the framework presented in this paper. CONCLUSION: The results obtained suggest that learning a single distance function over an entire family of proteins achieves higher prediction accuracy than learning a set of binary classifiers for each of the proteins separately. We also show the importance of obtaining information on experimentally determined non-binders. Learning with real non-binders generalizes better than learning with randomly generated peptides that are assumed to be non-binders. This suggests that information about non-binding peptides should also be published and made publicly available.
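The idea summarized in the abstract, scoring a novel peptide by its distance to peptides already known to bind a protein, and training from pairwise equivalence constraints, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `hamming_distance` is a simple stand-in for a learned distance function (it is not DistBoost), the median aggregation in `affinity_score` is one plausible choice rather than the authors' confirmed procedure, and the peptide strings are toy sequences, not experimental data.

```python
from itertools import combinations
from statistics import median


def hamming_distance(p, q):
    """Stand-in distance (assumption): count of mismatched positions.

    A learned peptide-peptide distance function would replace this.
    """
    if len(p) != len(q):
        raise ValueError("peptides must have equal length")
    return sum(a != b for a, b in zip(p, q))


def equivalence_constraints(binders, non_binders):
    """Build (peptide, peptide, label) training pairs for one protein.

    +1: both peptides bind the protein (should end up close).
    -1: one binder and one non-binder (should end up far apart).
    """
    positive = [(p, q, +1) for p, q in combinations(binders, 2)]
    negative = [(p, q, -1) for p in binders for q in non_binders]
    return positive + negative


def affinity_score(peptide, known_binders, distance=hamming_distance):
    """Higher score means the peptide lies closer to the known binders.

    Aggregating with the median is an assumption of this sketch.
    """
    return -median(distance(peptide, b) for b in known_binders)


# Toy, hypothetical sequences for a single protein.
binders = ["SLYNTVATL", "GLYNTVATL", "SLYNTIATL"]
non_binders = ["AAAAAAAAA"]

constraints = equivalence_constraints(binders, non_binders)
close_score = affinity_score("SLYNTVATL", binders)
far_score = affinity_score("AAAAAAAAA", binders)
print(len(constraints), close_score, far_score)
```

Because the distance function is passed in as a parameter, the same scoring code could be reused for every protein in a family by changing only the list of known binders, which mirrors the single-distance-function idea described above.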
format Text
id pubmed-1810314
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1810314 2007-03-14 PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions Hertz, Tomer Yanover, Chen BMC Bioinformatics Proceedings BioMed Central 2006-03-20 /pmc/articles/PMC1810314/ /pubmed/16723006 http://dx.doi.org/10.1186/1471-2105-7-S1-S3 Text en
spellingShingle Proceedings
Hertz, Tomer
Yanover, Chen
PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions
title PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1810314/
https://www.ncbi.nlm.nih.gov/pubmed/16723006
http://dx.doi.org/10.1186/1471-2105-7-S1-S3