
Determinants of antigenicity and specificity in immune response for protein sequences

BACKGROUND: Target-specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the prop...


Bibliographic Details
Main Authors: Wang, Yulong, Wu, Wenjun, Negre, Nicolas N, White, Kevin P, Li, Cheng, Shah, Parantu K
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3133554/
https://www.ncbi.nlm.nih.gov/pubmed/21693021
http://dx.doi.org/10.1186/1471-2105-12-251
author Wang, Yulong
Wu, Wenjun
Negre, Nicolas N
White, Kevin P
Li, Cheng
Shah, Parantu K
author_sort Wang, Yulong
collection PubMed
description BACKGROUND: Target-specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, proteomics studies for cancer biomarker discovery, identification of protein-DNA and other interactions, and small- and large-scale biochemical assays. It is therefore important to understand which properties of protein sequences determine antigenicity, and to identify the small peptide epitopes and large regions in the linear sequence of proteins whose use results in specific antibodies. RESULTS: Our analysis of protein properties suggested that sequence composition, combined with evolutionary information, predicted secondary structure and solvent accessibility, is sufficient to predict successful peptide epitopes. Antigenicity and specificity in immune response were also found to depend on epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, to identify continuous B-cell epitopes using these protein properties as learning features. BEOracle achieved an F1-measure of 81.37% on a large validation set. It outperformed both the classical propensity-based methods and more sophisticated methods such as BCPred and Bepipred for B-cell epitope prediction. BEOracle also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle scores for peptides showed some correlation with antibody intensity in immunofluorescence studies done on fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle), was trained with the BEOracle scores as features to predict the performance of antibodies generated with large protein regions. The BROracle classifier achieved accuracies of 75.26-63.88% on a validation set with immunofluorescence, immunohistochemistry, protein array and western blot results from the Protein Atlas database.
CONCLUSIONS: Together, our results suggest that antigenicity is a local property of protein sequences and that the sequence properties of composition, secondary structure, solvent accessibility and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without knowledge of the protein tertiary structure or the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/.
format Online
Article
Text
id pubmed-3133554
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3133554 2011-07-12 Determinants of antigenicity and specificity in immune response for protein sequences Wang, Yulong Wu, Wenjun Negre, Nicolas N White, Kevin P Li, Cheng Shah, Parantu K BMC Bioinformatics Research Article BioMed Central 2011-06-21 /pmc/articles/PMC3133554/ /pubmed/21693021 http://dx.doi.org/10.1186/1471-2105-12-251 Text en Copyright ©2011 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Determinants of antigenicity and specificity in immune response for protein sequences
topic Research Article