Inferring linear-B cell epitopes using 2-step metaheuristic variant-feature selection using genetic algorithm

Linear B-cell epitopes (LBCE) play a vital role in vaccine design, so detecting them efficiently from protein sequences is of primary importance. These epitopes consist of amino acids arranged in continuous or discontinuous patterns. Vaccines employ attenuated viruses and purified antigens. LBCE...

Bibliographic Details
Main authors: Angaitkar, Pratik; Aljrees, Turki; Kumar Pandey, Saroj; Kumar, Ankit; Janghel, Rekh Ram; Sahu, Tirath Prasad; Singh, Kamred Udham; Singh, Teekam
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10480427/
https://www.ncbi.nlm.nih.gov/pubmed/37670007
http://dx.doi.org/10.1038/s41598-023-41179-1
author Angaitkar, Pratik
Aljrees, Turki
Kumar Pandey, Saroj
Kumar, Ankit
Janghel, Rekh Ram
Sahu, Tirath Prasad
Singh, Kamred Udham
Singh, Teekam
author_sort Angaitkar, Pratik
collection PubMed
description Linear B-cell epitopes (LBCE) play a vital role in vaccine design, so detecting them efficiently from protein sequences is of primary importance. These epitopes consist of amino acids arranged in continuous or discontinuous patterns. Vaccines employ attenuated viruses and purified antigens. LBCE stimulate humoral immunity in the body, where B and T cells target circulating infections. To predict LBCE, the underlying protein sequences undergo feature extraction, feature selection, and classification. Various system models have been proposed for this purpose, but their classification accuracy is only moderate. To enhance the accuracy of LBCE classification, this paper presents a novel 2-step metaheuristic variant-feature selection method that combines a linear support vector classifier (LSVC) with a Modified Genetic Algorithm (MGA). The feature selection model employs mono-peptide, dipeptide, and tripeptide features, focusing on the most diverse ones. The selected features are fed into a machine learning (ML)-based parallel ensemble classifier, which combines correctly classified instances from several classifiers, including k-Nearest Neighbor (kNN), random forest (RF), logistic regression (LR), and support vector machine (SVM). The ensemble classifier achieved an accuracy of 99.3%. This accuracy is superior to that of recent state-of-the-art models for linear B-cell epitope classification. Consequently, the entire system model can be utilised effectively in real-time clinical settings.
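The description names mono-peptide, dipeptide, and tripeptide features as the input to feature selection. The following is a minimal sketch of one common way to compute such features, as relative k-mer frequencies over the 20 standard amino acids. It is not the authors' implementation: the function names `kmer_composition` and `peptide_features`, the frequency normalisation, and the handling of non-standard residues are all assumptions made for illustration. The selection step (LSVC with MGA) and the ensemble classifier are not shown.

```python
from collections import Counter
from itertools import product
from typing import Dict, List

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence: str, k: int) -> Dict[str, float]:
    """Relative frequency of every possible k-mer over the standard residues.

    Assumption: windows containing non-standard letters (e.g. 'X') count
    toward the total number of windows but get no feature of their own.
    """
    sequence = sequence.upper()
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    composition = {}
    for letters in product(AMINO_ACIDS, repeat=k):
        kmer = "".join(letters)
        composition[kmer] = counts[kmer] / total if total else 0.0
    return composition


def peptide_features(sequence: str) -> List[float]:
    """Concatenate mono-, di-, and tripeptide compositions into one vector."""
    features: List[float] = []
    for k in (1, 2, 3):
        features.extend(kmer_composition(sequence, k).values())
    return features


if __name__ == "__main__":
    vector = peptide_features("ACDA")  # hypothetical toy peptide
    print(len(vector))
```

Under these assumptions each sequence maps to a fixed-length vector of 20 + 400 + 8000 values, which is the kind of high-dimensional input that motivates a feature selection step before classification.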
format Online
Article
Text
id pubmed-10480427
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10480427 2023-09-07 Sci Rep Article
Nature Publishing Group UK 2023-09-05 /pmc/articles/PMC10480427/ /pubmed/37670007 http://dx.doi.org/10.1038/s41598-023-41179-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Inferring linear-B cell epitopes using 2-step metaheuristic variant-feature selection using genetic algorithm
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10480427/
https://www.ncbi.nlm.nih.gov/pubmed/37670007
http://dx.doi.org/10.1038/s41598-023-41179-1