
SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence

BACKGROUND: The identification of immunogenic regions on the surface of antigens, which are able to be recognized by antibodies and to trigger an immune response, is a major challenge for the design of new and effective vaccines. The prediction of such regions through computational immunology techni...


Bibliographic Details
Main Authors:	Dalkas, Georgios A., Rooman, Marianne
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5301386/ https://www.ncbi.nlm.nih.gov/pubmed/28183272 http://dx.doi.org/10.1186/s12859-017-1528-9

_version_	1782506354505678848
author	Dalkas, Georgios A. Rooman, Marianne
author_facet	Dalkas, Georgios A. Rooman, Marianne
author_sort	Dalkas, Georgios A.
collection	PubMed
description	BACKGROUND: The identification of immunogenic regions on the surface of antigens, which are able to be recognized by antibodies and to trigger an immune response, is a major challenge for the design of new and effective vaccines. The prediction of such regions through computational immunology techniques is a challenging goal, which will ultimately lead to a drastic limitation of the experimental tests required to validate their efficiency. However, current methods are far from being sufficiently reliable and/or applicable on a large scale. RESULTS: We developed SEPIa, a B-cell epitope predictor from the protein sequence, which is sufficiently fast to be applicable on a large scale. The originality of SEPIa lies in the combination of two classifiers, a naïve Bayesian and a random forest classifier, through a voting algorithm that exploits the advantages of both. It is based on 13 sequence-based features, whose values in a 9-residue sequence window are compiled to predict the epitope/non-epitope state of the central residue. The features are related to the type of amino acid, its conservation in homologous proteins, and its tendency of being exposed to the solvent, soluble, flexible, and disordered. The highest signal is obtained from statistical amino acid preferences, but all 13 features contribute non-negligibly in the predictor. SEPIa’s average prediction accuracy is limited, with an AUC score (area under the receiver operating characteristic curve) that reaches 0.65 both in 10-fold cross-validation and on an independent test set. It is nevertheless slightly higher than that of other methods evaluated on the same test set. CONCLUSIONS: SEPIa was applied to a test protein whose epitopes are known, human β2 adrenergic G-protein-coupled receptor, with promising results. Although the actual AUC score is rather low, many of the predicted epitopes cluster together and overlap the experimental epitope region. 
The reasons underlying the limitations of SEPIa and of all other B-cell epitope predictors are discussed. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1528-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5301386
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5301386 2017-02-15 SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence Dalkas, Georgios A. Rooman, Marianne BMC Bioinformatics Research Article BioMed Central 2017-02-10 /pmc/articles/PMC5301386/ /pubmed/28183272 http://dx.doi.org/10.1186/s12859-017-1528-9 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5301386/ https://www.ncbi.nlm.nih.gov/pubmed/28183272 http://dx.doi.org/10.1186/s12859-017-1528-9
work_keys_str_mv	AT dalkasgeorgiosa sepiaaknowledgedrivenalgorithmforpredictingconformationalbcellepitopesfromtheaminoacidsequence AT roomanmarianne sepiaaknowledgedrivenalgorithmforpredictingconformationalbcellepitopesfromtheaminoacidsequence
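
The abstract in this record mentions two mechanics that are easy to illustrate: each residue is described by a 9-residue window centred on it, and performance is reported as an AUC score. The following is a minimal sketch of both ideas, not the SEPIa implementation. The function names, the `PAD` symbol used at the chain ends, and the example data are assumptions made for illustration; the record does not say how SEPIa handles terminal residues, how its 13 features are computed, or how its voting step works, so none of that is reproduced here.

```python
from typing import List, Sequence

WINDOW = 9   # window length stated in the abstract
PAD = "-"    # hypothetical padding symbol for chain ends (an assumption)


def residue_windows(sequence: str, size: int = WINDOW) -> List[str]:
    """Return one window per residue, centred on that residue."""
    if size % 2 == 0:
        raise ValueError("window size must be odd so it has a central residue")
    half = size // 2
    # Pad both ends so that the first and last residues also get a full window.
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + size] for i in range(len(sequence))]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    toy_sequence = "ACDEFGHIKL"  # made-up sequence, illustration only
    for residue, window in zip(toy_sequence, residue_windows(toy_sequence)):
        print(residue, window)
    # Made-up labels and scores, illustration only.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the 0.65 reported in the abstract in context. The pairwise form is quadratic in the number of residues and is chosen here for readability rather than speed.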
