Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning

In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences wer...

Bibliographic Details
Main authors: Munteanu, Cristian R., Gestal, Marcos, Martínez-Acevedo, Yunuen G., Pedreira, Nieves, Pazos, Alejandro, Dorado, Julián
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6770149/
https://www.ncbi.nlm.nih.gov/pubmed/31491969
http://dx.doi.org/10.3390/ijms20184362
_version_ 1783455403906957312
author Munteanu, Cristian R.
Gestal, Marcos
Martínez-Acevedo, Yunuen G.
Pedreira, Nieves
Pazos, Alejandro
Dorado, Julián
author_sort Munteanu, Cristian R.
collection PubMed
description In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven Machine Learning methods. The best model was obtained with random forest classifiers with an Area Under the Receiver Operating Characteristics (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository.
format Online
Article
Text
id pubmed-6770149
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6770149 2019-10-30 Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning Munteanu, Cristian R. Gestal, Marcos Martínez-Acevedo, Yunuen G. Pedreira, Nieves Pazos, Alejandro Dorado, Julián Int J Mol Sci Article In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven Machine Learning methods. The best model was obtained with random forest classifiers with an Area Under the Receiver Operating Characteristics (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository. MDPI 2019-09-05 /pmc/articles/PMC6770149/ /pubmed/31491969 http://dx.doi.org/10.3390/ijms20184362 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6770149/
https://www.ncbi.nlm.nih.gov/pubmed/31491969
http://dx.doi.org/10.3390/ijms20184362