POPISK: T-cell reactivity prediction using support vector machines and string kernels

BACKGROUND: Accurate prediction of peptide immunogenicity and characterization of the relationship between peptide sequence and immunogenicity would greatly aid vaccine design and the understanding of the immune system. In contrast to the prediction of antigen processing and presentation pat...

Bibliographic Details
Main Authors: Tung, Chun-Wei; Ziehm, Matthias; Kämper, Andreas; Kohlbacher, Oliver; Ho, Shinn-Ying
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228774/
https://www.ncbi.nlm.nih.gov/pubmed/22085524
http://dx.doi.org/10.1186/1471-2105-12-446
_version_ 1782217868694257664
author Tung, Chun-Wei
Ziehm, Matthias
Kämper, Andreas
Kohlbacher, Oliver
Ho, Shinn-Ying
author_facet Tung, Chun-Wei
Ziehm, Matthias
Kämper, Andreas
Kohlbacher, Oliver
Ho, Shinn-Ying
author_sort Tung, Chun-Wei
collection PubMed
description BACKGROUND: Accurate prediction of peptide immunogenicity and characterization of the relationship between peptide sequence and immunogenicity would greatly aid vaccine design and the understanding of the immune system. In contrast to predicting the antigen processing and presentation pathway, predicting subsequent T-cell reactivity is a much harder problem. Previous studies identifying T-cell receptor (TCR) recognition positions were based on small-scale analyses of only a few peptides and reached differing conclusions about recognition positions, such as positions 4, 6 and 8 of peptides of length 9. Large-scale analyses are necessary to better characterize the effect of peptide sequence variations on T-cell reactivity and to design predictors of a peptide's T-cell reactivity (and thus immunogenicity). The identification and characterization of important positions influencing T-cell reactivity will provide insights into the underlying mechanism of immunogenicity. RESULTS: This work establishes a large dataset by collecting immunogenicity data from three major immunology databases. To account for the effect of MHC restriction, peptides are classified by their associated MHC alleles. Subsequently, a computational method (named POPISK) using a support vector machine with a weighted degree string kernel is proposed to predict T-cell reactivity and identify important recognition positions. POPISK yields a mean 10-fold cross-validation accuracy of 68% in predicting T-cell reactivity of HLA-A2-binding peptides. POPISK predicts immunogenicity with scores that also correctly predict the change in T-cell reactivity related to point mutations in epitopes reported in previous studies using crystal structures. Thorough analyses of the prediction results identify positions 4, 6, 8 and 9 as important and yield insights into the molecular basis of TCR recognition. Finally, we relate this finding to physicochemical properties and structural features of the MHC-peptide-TCR interaction. CONCLUSIONS: A computational method, POPISK, is proposed to predict immunogenicity with scores that are useful for predicting immunogenicity changes caused by single-residue modifications. The POPISK web server is freely available at http://iclab.life.nctu.edu.tw/POPISK.
format Online
Article
Text
id pubmed-3228774
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-32287742011-12-12 POPISK: T-cell reactivity prediction using support vector machines and string kernels Tung, Chun-Wei Ziehm, Matthias Kämper, Andreas Kohlbacher, Oliver Ho, Shinn-Ying BMC Bioinformatics Research Article BACKGROUND: Accurate prediction of peptide immunogenicity and characterization of relation between peptide sequences and peptide immunogenicity will be greatly helpful for vaccine designs and understanding of the immune system. In contrast to the prediction of antigen processing and presentation pathway, the prediction of subsequent T-cell reactivity is a much harder topic. Previous studies of identifying T-cell receptor (TCR) recognition positions were based on small-scale analyses using only a few peptides and concluded different recognition positions such as positions 4, 6 and 8 of peptides with length 9. Large-scale analyses are necessary to better characterize the effect of peptide sequence variations on T-cell reactivity and design predictors of a peptide's T-cell reactivity (and thus immunogenicity). The identification and characterization of important positions influencing T-cell reactivity will provide insights into the underlying mechanism of immunogenicity. RESULTS: This work establishes a large dataset by collecting immunogenicity data from three major immunology databases. In order to consider the effect of MHC restriction, peptides are classified by their associated MHC alleles. Subsequently, a computational method (named POPISK) using support vector machine with a weighted degree string kernel is proposed to predict T-cell reactivity and identify important recognition positions. POPISK yields a mean 10-fold cross-validation accuracy of 68% in predicting T-cell reactivity of HLA-A2-binding peptides. POPISK is capable of predicting immunogenicity with scores that can also correctly predict the change in T-cell reactivity related to point mutations in epitopes reported in previous studies using crystal structures. 
Thorough analyses of the prediction results identify the important positions 4, 6, 8 and 9, and yield insights into the molecular basis for TCR recognition. Finally, we relate this finding to physicochemical properties and structural features of the MHC-peptide-TCR interaction. CONCLUSIONS: A computational method POPISK is proposed to predict immunogenicity with scores which are useful for predicting immunogenicity changes made by single-residue modifications. The web server of POPISK is freely available at http://iclab.life.nctu.edu.tw/POPISK. BioMed Central 2011-11-15 /pmc/articles/PMC3228774/ /pubmed/22085524 http://dx.doi.org/10.1186/1471-2105-12-446 Text en Copyright ©2011 Tung et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Tung, Chun-Wei
Ziehm, Matthias
Kämper, Andreas
Kohlbacher, Oliver
Ho, Shinn-Ying
POPISK: T-cell reactivity prediction using support vector machines and string kernels
title POPISK: T-cell reactivity prediction using support vector machines and string kernels
title_full POPISK: T-cell reactivity prediction using support vector machines and string kernels
title_fullStr POPISK: T-cell reactivity prediction using support vector machines and string kernels
title_full_unstemmed POPISK: T-cell reactivity prediction using support vector machines and string kernels
title_short POPISK: T-cell reactivity prediction using support vector machines and string kernels
title_sort popisk: t-cell reactivity prediction using support vector machines and string kernels
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228774/
https://www.ncbi.nlm.nih.gov/pubmed/22085524
http://dx.doi.org/10.1186/1471-2105-12-446
work_keys_str_mv AT tungchunwei popisktcellreactivitypredictionusingsupportvectormachinesandstringkernels
AT ziehmmatthias popisktcellreactivitypredictionusingsupportvectormachinesandstringkernels
AT kamperandreas popisktcellreactivitypredictionusingsupportvectormachinesandstringkernels
AT kohlbacheroliver popisktcellreactivitypredictionusingsupportvectormachinesandstringkernels
AT hoshinnying popisktcellreactivitypredictionusingsupportvectormachinesandstringkernels
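
Illustrative note: the abstract in this record names a support vector machine with a weighted degree string kernel but gives no implementation details. The sketch below shows one common textbook formulation of that kernel for equal-length strings, with weights beta_d = 2(D - d + 1) / (D(D + 1)). It is not the authors' code. The function names, the default degree of 3, and any peptide strings passed to it are assumptions made for illustration only.

```python
from typing import List, Sequence


def wd_kernel(x: str, y: str, degree: int = 3) -> float:
    """Weighted degree kernel between two equal-length strings.

    For each substring length d = 1..degree, count the positions where the
    length-d substrings of x and y are identical, then sum the counts with
    weights beta_d = 2 * (degree - d + 1) / (degree * (degree + 1)).
    The default degree is an assumption, not a value from the record.
    """
    if len(x) != len(y):
        raise ValueError("weighted degree kernel needs equal-length strings")
    if degree < 1:
        raise ValueError("degree must be at least 1")

    norm = degree * (degree + 1)
    total = 0.0
    for d in range(1, degree + 1):
        beta = 2.0 * (degree - d + 1) / norm
        # Position-specific comparison: substrings must match at the same index.
        matches = sum(
            1 for i in range(len(x) - d + 1) if x[i:i + d] == y[i:i + d]
        )
        total += beta * matches
    return total


def gram_matrix(peptides: Sequence[str], degree: int = 3) -> List[List[float]]:
    """Symmetric kernel matrix over a list of equal-length peptide strings."""
    n = len(peptides)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = wd_kernel(peptides[i], peptides[j], degree)
            gram[i][j] = value
            gram[j][i] = value
    return gram
```

Because the comparison is position-specific, a mismatch at one residue lowers only the terms whose substrings cover that position, which is what makes per-position analysis possible in principle. A matrix built this way could be handed to any SVM implementation that accepts precomputed kernels.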