
Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction

Zika virus (ZIKV), the causative agent of Zika fever in humans, is an RNA virus that belongs to the genus Flavivirus. Currently, there is no approved vaccine for clinical use to combat the ZIKV infection and contain the epidemic. Epitope-based peptide vaccines have a large untapped potential for boo...


Bibliographic Details
Main authors: Bukhari, Syed Nisar Hussain, Jain, Amit, Haq, Ehtishamul, Khder, Moaiad Ahmad, Neware, Rahul, Bhola, Jyoti, Lari Najafi, Moslem
Format: Online Article Text
Language: English
Published: Hindawi 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8500748/
https://www.ncbi.nlm.nih.gov/pubmed/34631001
http://dx.doi.org/10.1155/2021/9591670
_version_ 1784580510756896768
author Bukhari, Syed Nisar Hussain
Jain, Amit
Haq, Ehtishamul
Khder, Moaiad Ahmad
Neware, Rahul
Bhola, Jyoti
Lari Najafi, Moslem
author_facet Bukhari, Syed Nisar Hussain
Jain, Amit
Haq, Ehtishamul
Khder, Moaiad Ahmad
Neware, Rahul
Bhola, Jyoti
Lari Najafi, Moslem
author_sort Bukhari, Syed Nisar Hussain
collection PubMed
description Zika virus (ZIKV), the causative agent of Zika fever in humans, is an RNA virus that belongs to the genus Flavivirus. Currently, there is no approved vaccine for clinical use to combat ZIKV infection and contain the epidemic. Epitope-based peptide vaccines have a large untapped potential for boosting vaccination safety, cross-reactivity, and immunogenicity. Though many attempts have been made to develop vaccines for ZIKV, none has proved successful. Epitope-based peptide vaccines can act as powerful alternatives to conventional vaccines owing to their low production cost and reduced reactogenic and allergenic responses. To design an effective and viable epitope-based peptide vaccine against this deadly virus, it is essential to select antigenic T-cell epitopes, since epitope-based vaccines are considered safe. An in silico machine-learning-based approach to ZIKV T-cell epitope prediction would save considerable experimental time and effort and speed up vaccine development compared with in vivo approaches. We trained a machine-learning-based computational model to predict novel ZIKV T-cell epitopes from the physicochemical properties of amino acids. The proposed ensemble model is based on a voting mechanism that blends the predictions for each class (epitope or nonepitope) from each base classifier. The predictions of the individual classifiers are tallied for each class, and the class with the majority of votes becomes the final prediction. An odd number of classifiers was used to avoid ties in the voting. A data set of experimentally determined ZIKV peptide sequences was collected from the Immune Epitope Database and Analysis Resource (IEDB) repository. The data set consists of 3,519 sequences, of which 1,762 are epitopes and 1,757 are nonepitopes. The length of the sequences ranges from 6 to 30 mer. For each sequence, we extracted 13 physicochemical features. The proposed ensemble model achieved sensitivity, specificity, Gini coefficient, AUC, precision, F-score, and accuracy of 0.976, 0.959, 0.993, 0.994, 0.989, 0.985, and 97.13%, respectively. To check the consistency of the model, we carried out five-fold cross-validation, which gave an average accuracy of 96.072%. Finally, a comparative analysis of the proposed model against existing methods was carried out on a separate validation data set, and it suggests that the proposed ensemble model performs better. The proposed ensemble model will help predict novel ZIKV vaccine candidates to save lives globally and prevent future epidemic-scale outbreaks.
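The voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not name the base classifiers or the 13 features, so the three toy classifiers, the label strings, and the function name `majority_vote` below are hypothetical placeholders. The sketch only shows how tallying one vote per classifier and using an odd number of voters yields a tie-free majority decision between two classes.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical label strings; the abstract names only the two classes.
EPITOPE, NONEPITOPE = "epitope", "nonepitope"

# A base classifier maps a feature vector to one of the two labels.
Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: Sequence[Classifier],
                  features: Sequence[float]) -> str:
    """Return the label chosen by most base classifiers (hard voting)."""
    if len(classifiers) % 2 == 0:
        # With two classes, an odd number of voters cannot tie.
        raise ValueError("use an odd number of classifiers to avoid ties")
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy stand-ins for trained models (assumptions, not the paper's models).
def clf_a(x: Sequence[float]) -> str:
    return EPITOPE if x[0] > 0.5 else NONEPITOPE


def clf_b(x: Sequence[float]) -> str:
    return EPITOPE if x[1] > 0.5 else NONEPITOPE


def clf_c(x: Sequence[float]) -> str:
    return EPITOPE if sum(x) / len(x) > 0.5 else NONEPITOPE


# Two of the three toy classifiers vote "epitope" for this vector.
prediction = majority_vote([clf_a, clf_b, clf_c], [0.9, 0.2, 0.7])
print(prediction)
```

In practice the base classifiers would be models trained on the extracted physicochemical features; the plain functions here only keep the example self-contained.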
format Online
Article
Text
id pubmed-8500748
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8500748 2021-10-09 Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction Bukhari, Syed Nisar Hussain Jain, Amit Haq, Ehtishamul Khder, Moaiad Ahmad Neware, Rahul Bhola, Jyoti Lari Najafi, Moslem J Healthc Eng Research Article Hindawi 2021-10-01 /pmc/articles/PMC8500748/ /pubmed/34631001 http://dx.doi.org/10.1155/2021/9591670 Text en Copyright © 2021 Syed Nisar Hussain Bukhari et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Bukhari, Syed Nisar Hussain
Jain, Amit
Haq, Ehtishamul
Khder, Moaiad Ahmad
Neware, Rahul
Bhola, Jyoti
Lari Najafi, Moslem
Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title_full Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title_fullStr Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title_full_unstemmed Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title_short Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction
title_sort machine learning-based ensemble model for zika virus t-cell epitope prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8500748/
https://www.ncbi.nlm.nih.gov/pubmed/34631001
http://dx.doi.org/10.1155/2021/9591670