Prediction of conformational B-cell epitopes from 3D structures by random forests with a distance-based feature
BACKGROUND: Antigen-antibody interactions are key events in the immune system and provide important clues to immune processes and responses. In antigen-antibody interactions, the specific sites on the antigens that are directly bound by B-cell-produced antibodies are known as B-cell epit...
Main authors: | Zhang, Wen; Xiong, Yi; Zhao, Meng; Zou, Hua; Ye, Xinghuo; Liu, Juan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228550/ https://www.ncbi.nlm.nih.gov/pubmed/21846404 http://dx.doi.org/10.1186/1471-2105-12-341 |
_version_ | 1782217831664844800 |
---|---|
author | Zhang, Wen Xiong, Yi Zhao, Meng Zou, Hua Ye, Xinghuo Liu, Juan |
author_sort | Zhang, Wen |
collection | PubMed |
description | BACKGROUND: Antigen-antibody interactions are key events in the immune system and provide important clues to immune processes and responses. In antigen-antibody interactions, the specific sites on the antigens that are directly bound by B-cell-produced antibodies are known as B-cell epitopes. The identification of epitopes is an active topic in bioinformatics because of their potential use in epitope-based drug design. Although most B-cell epitopes are discontinuous (or conformational), insufficient effort has been put into conformational epitope prediction, and the performance of existing methods is far from satisfactory. RESULTS: To develop a high-accuracy model, we focus on several aspects that may affect prediction performance: the impact of interior residues, the different contributions of adjacent residues, and imbalanced data, which contain many more non-epitope residues than epitope residues. We adopt the following strategies to address these issues. First, the concept of a 'thick surface patch', instead of a 'surface patch', is introduced to describe the local spatial context of each surface residue, so that the impact of interior residues is taken into account. A comparison between the thick surface patch and the surface patch shows that interior residues contribute to the recognition of epitopes. Second, a statistically significant difference is observed between the distance distributions of non-epitope patches and epitope patches; an adjacent residue distance feature is therefore introduced, which reflects the unequal contributions of adjacent residues to the location of binding sites. Third, a bootstrapping and voting procedure is adopted to deal with the imbalanced dataset. Based on these ideas, we propose a new method to identify conformational B-cell epitopes from 3D structures by combining conventional features with the proposed feature, using the random forest (RF) algorithm as the classification engine. The experiments show that our method can predict conformational B-cell epitopes with high accuracy. Evaluated by leave-one-out cross-validation (LOOCV), our method achieves a mean AUC of 0.633 on the benchmark bound dataset and a mean AUC of 0.654 on the benchmark unbound dataset. When compared with state-of-the-art prediction models in an independent test, our method demonstrates comparable or better performance. CONCLUSIONS: Our method is demonstrated to be effective for the prediction of conformational epitopes. Based on this study, we developed a tool to predict conformational epitopes from 3D structures, available at http://code.google.com/p/my-project-bpredictor/downloads/list. |
format | Online Article Text |
id | pubmed-3228550 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3228550 2011-12-07 BMC Bioinformatics Research Article BioMed Central 2011-08-17 /pmc/articles/PMC3228550/ /pubmed/21846404 http://dx.doi.org/10.1186/1471-2105-12-341 Text en Copyright ©2011 Zhang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Prediction of conformational B-cell epitopes from 3D structures by random forests with a distance-based feature |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228550/ https://www.ncbi.nlm.nih.gov/pubmed/21846404 http://dx.doi.org/10.1186/1471-2105-12-341 |
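
The abstract's third strategy, a bootstrapping and voting procedure for the imbalanced dataset, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each bootstrap round keeps all epitope residues and draws an equal number of non-epitope residues with replacement, and it swaps the random forest for a one-feature threshold classifier so that the sketch needs only the Python standard library. The names `train_stump`, `balanced_bootstrap_ensemble`, and `vote`, and the toy scores, are hypothetical.

```python
import random
from statistics import mean


def train_stump(samples):
    """Fit a one-feature threshold classifier (stand-in for a random forest).

    samples: list of (score, label) pairs, label 1 = epitope, 0 = non-epitope.
    The threshold is the midpoint between the two class means.
    """
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    threshold = (mean(pos) + mean(neg)) / 2
    # Predict 1 on the side of the threshold where the positive mean lies.
    sign = 1 if mean(pos) >= mean(neg) else -1
    return lambda x: 1 if sign * (x - threshold) > 0 else 0


def balanced_bootstrap_ensemble(data, n_models=11, seed=0):
    """Train n_models classifiers, each on a class-balanced bootstrap sample."""
    rng = random.Random(seed)
    pos = [s for s in data if s[1] == 1]
    neg = [s for s in data if s[1] == 0]
    models = []
    for _ in range(n_models):
        # Assumption: keep every minority-class sample and draw, with
        # replacement, the same number of majority-class samples.
        neg_sample = [rng.choice(neg) for _ in pos]
        models.append(train_stump(pos + neg_sample))
    return models


def vote(models, x):
    """Majority vote over the ensemble's individual predictions."""
    votes = sum(m(x) for m in models)
    return 1 if votes * 2 > len(models) else 0


# Toy, made-up data: 3 "epitope" scores versus 11 "non-epitope" scores.
toy = [(1.8, 1), (2.0, 1), (2.2, 1)] + [(i * 0.1 - 0.5, 0) for i in range(11)]
models = balanced_bootstrap_ensemble(toy)
print(vote(models, 2.1), vote(models, 0.0))
```

Using an odd number of models avoids tied votes. In a fuller version, each base classifier would be a random forest trained on the per-residue feature vectors rather than a single score.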