Determinants of antigenicity and specificity in immune response for protein sequences
BACKGROUND: Target specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the prop...
Main authors: | Wang, Yulong; Wu, Wenjun; Negre, Nicolas N; White, Kevin P; Li, Cheng; Shah, Parantu K |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3133554/ https://www.ncbi.nlm.nih.gov/pubmed/21693021 http://dx.doi.org/10.1186/1471-2105-12-251 |
_version_ | 1782207913548316672 |
---|---|
author | Wang, Yulong; Wu, Wenjun; Negre, Nicolas N; White, Kevin P; Li, Cheng; Shah, Parantu K |
author_sort | Wang, Yulong |
collection | PubMed |
description | BACKGROUND: Target-specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the properties of protein sequences that are important for antigenicity and to identify small peptide epitopes and large regions in the linear sequence of the proteins whose use results in specific antibodies. RESULTS: Our analysis using protein properties suggested that sequence composition, combined with evolutionary information, predicted secondary structure, and solvent accessibility, is sufficient to predict successful peptide epitopes. The antigenicity and the specificity in immune response were also found to depend on the epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, for the identification of continuous B-Cell epitopes with these protein properties as learning features. The BEOracle achieved an F1-measure of 81.37% on a large validation set. The BEOracle classifier outperformed the classical methods based on propensity and sophisticated methods like BCPred and Bepipred for B-Cell epitope prediction. The BEOracle classifier also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle scores for peptides showed some correlation with antibody intensity in immunofluorescence studies of fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle), was trained with the BEOracle scores as features to predict, with high accuracy, the performance of antibodies generated with large protein regions. The BROracle classifier achieved accuracies ranging from 63.88% to 75.26% on a validation set with immunofluorescence, immunohistochemistry, protein array and western blot results from the Protein Atlas database. CONCLUSIONS: Together, our results suggest that antigenicity is a local property of protein sequences and that the sequence properties of composition, secondary structure, solvent accessibility and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without knowledge of the protein tertiary structure or the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/. |
format | Online Article Text |
id | pubmed-3133554 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3133554 2011-07-12 Determinants of antigenicity and specificity in immune response for protein sequences Wang, Yulong Wu, Wenjun Negre, Nicolas N White, Kevin P Li, Cheng Shah, Parantu K BMC Bioinformatics Research Article BioMed Central 2011-06-21 /pmc/articles/PMC3133554/ /pubmed/21693021 http://dx.doi.org/10.1186/1471-2105-12-251 Text en Copyright ©2011 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Determinants of antigenicity and specificity in immune response for protein sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3133554/ https://www.ncbi.nlm.nih.gov/pubmed/21693021 http://dx.doi.org/10.1186/1471-2105-12-251 |