Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates

Zika fever is an infectious disease caused by the Zika virus (ZIKV). The disease is claiming millions of lives worldwide, primarily in developing countries. In addition to vector control strategies, the most effective way to prevent the spread of ZIKV infection is vaccination. There is no clinically...

Bibliographic Details
Main authors: Bukhari, Syed Nisar Hussain, Webber, Julian, Mehbodniya, Abolfazl
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9096330/
https://www.ncbi.nlm.nih.gov/pubmed/35552469
http://dx.doi.org/10.1038/s41598-022-11731-6
author Bukhari, Syed Nisar Hussain
Webber, Julian
Mehbodniya, Abolfazl
collection PubMed
description Zika fever is an infectious disease caused by the Zika virus (ZIKV). The disease is claiming millions of lives worldwide, primarily in developing countries. In addition to vector control strategies, the most effective way to prevent the spread of ZIKV infection is vaccination. There is no clinically approved vaccine to combat ZIKV infection and curb the pandemic. An epitope-based peptide vaccine (EBPV) is seen as a powerful alternative to conventional vaccines because of its low production cost and short production time. Nonetheless, EBPVs have received less attention, even though they have significant untapped potential for enhancing vaccine safety, immunogenicity, and cross-reactivity. This vaccine technology is based on selected antigenic peptides of the target pathogen, called T-cell epitopes (TCEs), which are synthesized chemically from their amino acid sequences. Identifying TCEs with wet-lab experimental approaches is challenging, expensive, and time-consuming. In this study, we therefore present a computational model for the prediction of ZIKV TCEs. The proposed model is an ensemble of decision trees that uses the physicochemical properties of amino acids. Such a model could save considerable time and effort in vaccine development. The peptide sequence dataset for model training was retrieved from the Virus Pathogen Database and Analysis Resource (ViPR). The dataset consists of experimentally verified TCEs and non-TCEs. The model showed promising results when evaluated on the test dataset: accuracy, AUC, sensitivity, specificity, Gini, and Matthews correlation coefficient (MCC) were 0.9789, 0.984, 0.981, 0.987, 0.974 and 0.948, respectively. The consistency and reliability of the model were assessed with five-fold cross-validation, which gave a mean accuracy of 0.97864. Finally, the model was compared with standard machine learning (ML) algorithms and outperformed all of them. The proposed model will aid in predicting novel and immunodominant TCEs of ZIKV. The predicted TCEs may have a high possibility of acting as prospective vaccine targets, subject to in-vivo and in-vitro scientific assessments, thereby saving lives worldwide, preventing future epidemic-scale outbreaks, and lowering the possibility of mutation escape.
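As a reading aid for the metrics named in the description, the following is a minimal sketch of how accuracy, sensitivity, specificity and MCC are conventionally derived from confusion-matrix counts. The function name `binary_metrics` and the counts used below are hypothetical illustrations; they are not taken from the study, and the authors' actual evaluation code is not part of this record.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's data):
# 100 epitopes and 100 non-epitopes in a test set.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, so it tends to be more informative than accuracy alone when the epitope and non-epitope classes differ in size.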
format Online
Article
Text
id pubmed-9096330
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9096330 2022-05-12 Sci Rep Article Nature Publishing Group UK 2022-05-12 /pmc/articles/PMC9096330/ /pubmed/35552469 http://dx.doi.org/10.1038/s41598-022-11731-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9096330/
https://www.ncbi.nlm.nih.gov/pubmed/35552469
http://dx.doi.org/10.1038/s41598-022-11731-6