Positive-unlabeled learning for the prediction of conformational B-cell epitopes

BACKGROUND: The incomplete ground truth of training data of B-cell epitopes is a demanding issue in computational epitope prediction. The challenge is that only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues...

Bibliographic Details
Main Authors: Ren, Jing; Liu, Qian; Ellis, John; Li, Jinyan
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682424/
https://www.ncbi.nlm.nih.gov/pubmed/26681157
http://dx.doi.org/10.1186/1471-2105-16-S18-S12
author Ren, Jing
Liu, Qian
Ellis, John
Li, Jinyan
collection PubMed
description BACKGROUND: The incomplete ground truth of training data of B-cell epitopes is a demanding issue in computational epitope prediction. The challenge is that only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues are unlabeled. Because some of these unlabeled residues may group to form novel but currently unknown epitopes, it is misguided to uniformly classify all unlabeled residues as negative training data, as the traditional supervised learning scheme does. RESULTS: We propose a positive-unlabeled learning algorithm to address this problem. The key idea is to distinguish between epitope-likely residues and reliable negative residues in the unlabeled data. The method has two steps: (1) identify reliable negative residues using a weighted SVM with a high recall; and (2) construct a classification model on the positive residues and the reliable negative residues. Complex-based 10-fold cross-validation showed that this method outperforms the commonly used predictors DiscoTope 2.0, ElliPro and SEPPA 2.0 in every aspect. We conducted four case studies, in which the approach was tested on antigens of West Nile virus, dihydrofolate reductase, beta-lactamase, and two Ebola antigens whose epitopes are currently unknown. All results were assessed on a newly established data set of antigen structures not bound by antibodies, rather than on antibody-bound antigen structures. Bound structures may contain unfair binding information, such as bound-state B-factors and protrusion index, which could exaggerate epitope prediction performance. Source code is available on request.
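The two-step scheme in the description can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (their source code is available only on request). The residue feature vectors are made-up toy values, and the function names `fit_weighted_hinge`, `pu_two_step` and `predict` are hypothetical. A plain subgradient-descent linear classifier stands in for the weighted SVM. The positive cost of 10 and the cutoff rule (an unlabeled residue counts as a reliable negative only if it scores below every labeled positive) are illustrative choices for reaching high recall.

```python
def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_weighted_hinge(X, y, cost, steps=300, lr=0.01, lam=0.01):
    """Linear classifier with per-sample costs, fitted by full-batch
    subgradient descent on the cost-weighted hinge loss plus an L2 penalty.
    Assumption: a simple stand-in for a weighted SVM solver."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi, ci in zip(X, y, cost):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated
                grad_w = [g - ci * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= ci * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def pu_two_step(positives, unlabeled, pos_cost=10.0):
    # Step 1: treat every unlabeled residue as negative, but make errors on
    # labeled positives far more costly, so the model favors recall.
    X = positives + unlabeled
    y = [1] * len(positives) + [-1] * len(unlabeled)
    cost = [pos_cost] * len(positives) + [1.0] * len(unlabeled)
    w, b = fit_weighted_hinge(X, y, cost)
    # Assumed cutoff: keep recall on labeled positives at 100% by calling an
    # unlabeled residue a reliable negative only if it scores below all of them.
    cutoff = min(dot(w, p) + b for p in positives)
    reliable_neg = [u for u in unlabeled if dot(w, u) + b < cutoff]
    # Step 2: train the final model on positives vs. reliable negatives only.
    X2 = positives + reliable_neg
    y2 = [1] * len(positives) + [-1] * len(reliable_neg)
    model = fit_weighted_hinge(X2, y2, [1.0] * len(X2))
    return model, reliable_neg


def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b > 0 else -1


# Toy data (hypothetical 2-D residue features, not from the paper).
positives = [[2.0, 2.0], [2.4, 1.8], [1.8, 2.4], [2.2, 2.2]]
hidden_positives = [[2.1, 2.0], [2.0, 2.1]]  # epitope-like but unlabeled
true_negatives = [[-2.0, -2.0], [-2.4, -1.8], [-1.8, -2.4],
                  [-2.2, -2.2], [-2.1, -2.1]]
unlabeled = hidden_positives + true_negatives

model, reliable_neg = pu_two_step(positives, unlabeled)
print("reliable negatives:", reliable_neg)
print("hidden positives predicted as:",
      [predict(model, h) for h in hidden_positives])
```

On this toy data the hidden positives sit among the labeled positives, so step 1 does not select them as reliable negatives and step 2 is trained only on clearly separated residues. With real epitope features, a kernel SVM and a tuned class weight would replace the toy solver and the fixed cost used here.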
format Online
Article
Text
id pubmed-4682424
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4682424 2015-12-21 Positive-unlabeled learning for the prediction of conformational B-cell epitopes Ren, Jing Liu, Qian Ellis, John Li, Jinyan BMC Bioinformatics Research
BioMed Central 2015-12-09 /pmc/articles/PMC4682424/ /pubmed/26681157 http://dx.doi.org/10.1186/1471-2105-16-S18-S12 Text en Copyright © 2015 Ren et al. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Positive-unlabeled learning for the prediction of conformational B-cell epitopes
topic Research