
Recent development of machine learning-based methods for the prediction of defensin family and subfamily

Nearly all living species possess host defense peptides called defensins, which are crucial for innate immunity. These peptides work by activating the immune system, which kills microbes directly or indirectly and thus protects the host. Thus far, numerous preclinical and clinical...


Bibliographic Details
Main authors: Charoenkwan, Phasit, Schaduangrat, Nalini, Mahmud, S. M. Hasan, Thinnukool, Orawit, Shoombuatong, Watshara
Format: Online Article Text
Language: English
Published: Leibniz Research Centre for Working Environment and Human Factors 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9360473/
https://www.ncbi.nlm.nih.gov/pubmed/35949489
http://dx.doi.org/10.17179/excli2022-4913
author Charoenkwan, Phasit
Schaduangrat, Nalini
Mahmud, S. M. Hasan
Thinnukool, Orawit
Shoombuatong, Watshara
collection PubMed
description Nearly all living species possess host defense peptides called defensins, which are crucial for innate immunity. These peptides work by activating the immune system, which kills microbes directly or indirectly and thus protects the host. Numerous peptide-based drugs are currently being evaluated in preclinical and clinical trials. Although experimental methods can precisely identify the defensin peptide family and subfamily, they are often time-consuming and costly. By contrast, machine learning (ML) methods can make effective use of protein sequence information without knowledge of a protein's three-dimensional structure, which makes them well suited to large-scale identification. To date, several ML methods have been developed for the in silico identification of the defensin peptide family and subfamily. A summary of the advantages and disadvantages of the existing methods is therefore urgently needed to guide the development and improvement of new computational models for this task. With this goal in mind, we first provide a comprehensive survey of six state-of-the-art computational approaches for predicting the defensin peptide family and subfamily. We cover several important aspects, including dataset quality, feature encoding methods, feature selection schemes, ML algorithms, cross-validation methods, and web server availability and usability. We then discuss the limitations of existing methods and future perspectives for improving prediction performance and model interpretability. The insights and suggestions from this review are expected to serve as valuable guidance for researchers developing more robust and useful predictors.
format Online
Article
Text
id pubmed-9360473
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Leibniz Research Centre for Working Environment and Human Factors
record_format MEDLINE/PubMed
spelling pubmed-9360473 2022-08-09 Recent development of machine learning-based methods for the prediction of defensin family and subfamily. Charoenkwan, Phasit; Schaduangrat, Nalini; Mahmud, S. M. Hasan; Thinnukool, Orawit; Shoombuatong, Watshara. EXCLI J. Review Article. Leibniz Research Centre for Working Environment and Human Factors 2022-05-05. /pmc/articles/PMC9360473/ /pubmed/35949489 http://dx.doi.org/10.17179/excli2022-4913 Text en. Copyright © 2022 Charoenkwan et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (https://creativecommons.org/licenses/by/4.0/). You are free to copy, distribute and transmit the work, provided the original author and source are credited.
title Recent development of machine learning-based methods for the prediction of defensin family and subfamily
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9360473/
https://www.ncbi.nlm.nih.gov/pubmed/35949489
http://dx.doi.org/10.17179/excli2022-4913