Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates
Zika fever is an infectious disease caused by the Zika virus (ZIKV). The disease is claiming millions of lives worldwide, primarily in developing countries. In addition to vector control strategies, the most effective way to prevent the spread of ZIKV infection is vaccination. There is no clinically...
Main authors: | Bukhari, Syed Nisar Hussain; Webber, Julian; Mehbodniya, Abolfazl |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9096330/ https://www.ncbi.nlm.nih.gov/pubmed/35552469 http://dx.doi.org/10.1038/s41598-022-11731-6 |
_version_ | 1784705952726908928 |
---|---|
author | Bukhari, Syed Nisar Hussain Webber, Julian Mehbodniya, Abolfazl |
author_facet | Bukhari, Syed Nisar Hussain Webber, Julian Mehbodniya, Abolfazl |
author_sort | Bukhari, Syed Nisar Hussain |
collection | PubMed |
description | Zika fever is an infectious disease caused by the Zika virus (ZIKV). The disease is claiming millions of lives worldwide, primarily in developing countries. In addition to vector control strategies, the most effective way to prevent the spread of ZIKV infection is vaccination. There is no clinically approved vaccine to combat ZIKV infection and curb the pandemic. An epitope-based peptide vaccine (EBPV) is seen as a powerful alternative to conventional vaccines because of its low production cost and short production time. Nonetheless, EBPVs have received less attention, even though they have significant untapped potential for enhancing vaccine safety, immunogenicity, and cross-reactivity. This vaccine technology is based on selected antigenic peptides of the target pathogen, called T-cell epitopes (TCEs), which are synthesized chemically from their amino acid sequences. Identifying TCEs with wet-lab experiments is challenging, expensive, and time-consuming. In this study, we therefore present a computational model for the prediction of ZIKV TCEs. The proposed model is an ensemble of decision trees that uses the physicochemical properties of amino acids, which could save considerable time and effort in vaccine development. The peptide sequence dataset for model training was retrieved from the Virus Pathogen Database and Analysis Resource (ViPR). The dataset consists of experimentally verified TCEs and non-TCEs. The model showed promising results when evaluated on the test dataset: accuracy, AUC, sensitivity, specificity, Gini, and Matthews correlation coefficient (MCC) were 0.9789, 0.984, 0.981, 0.987, 0.974, and 0.948, respectively. The consistency and reliability of the model were assessed with five-fold cross-validation, which gave a mean accuracy of 0.97864. Finally, the model was compared with standard machine learning (ML) algorithms and outperformed all of them. The proposed model will aid in predicting novel and immunodominant TCEs of ZIKV. The predicted TCEs are likely candidates for vaccine targets, subject to in-vivo and in-vitro assessment, which could help save lives worldwide, prevent future epidemic-scale outbreaks, and lower the possibility of mutation escape. |
format | Online Article Text |
id | pubmed-9096330 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9096330 2022-05-12 Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates Bukhari, Syed Nisar Hussain Webber, Julian Mehbodniya, Abolfazl Sci Rep Article Nature Publishing Group UK 2022-05-12 /pmc/articles/PMC9096330/ /pubmed/35552469 http://dx.doi.org/10.1038/s41598-022-11731-6 Text en © The Author(s) 2022 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Bukhari, Syed Nisar Hussain Webber, Julian Mehbodniya, Abolfazl Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title | Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title_full | Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title_fullStr | Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title_full_unstemmed | Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title_short | Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates |
title_sort | decision tree based ensemble machine learning model for the prediction of zika virus t-cell epitopes as potential vaccine candidates |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9096330/ https://www.ncbi.nlm.nih.gov/pubmed/35552469 http://dx.doi.org/10.1038/s41598-022-11731-6 |
work_keys_str_mv | AT bukharisyednisarhussain decisiontreebasedensemblemachinelearningmodelforthepredictionofzikavirustcellepitopesaspotentialvaccinecandidates AT webberjulian decisiontreebasedensemblemachinelearningmodelforthepredictionofzikavirustcellepitopesaspotentialvaccinecandidates AT mehbodniyaabolfazl decisiontreebasedensemblemachinelearningmodelforthepredictionofzikavirustcellepitopesaspotentialvaccinecandidates |