Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins

Thermophilic proteins (TPPs) are critical for basic research and in the food industry due to their ability to maintain a thermodynamically stable fold at extremely high temperatures. Thus, the expeditious identification of novel TPPs through computational models from protein sequences is very desira...

Bibliographic Details
Main authors:	Charoenkwan, Phasit; Schaduangrat, Nalini; Hasan, Md Mehedi; Moni, Mohammad Ali; Lió, Pietro; Shoombuatong, Watshara
Format:	Online Article Text
Language:	English
Published:	Leibniz Research Centre for Working Environment and Human Factors 2022
Subjects:	Review Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9150013/ https://www.ncbi.nlm.nih.gov/pubmed/35651661 http://dx.doi.org/10.17179/excli2022-4723

author	Charoenkwan, Phasit Schaduangrat, Nalini Hasan, Md Mehedi Moni, Mohammad Ali Lió, Pietro Shoombuatong, Watshara
collection	PubMed
description	Thermophilic proteins (TPPs) are critical for basic research and in the food industry due to their ability to maintain a thermodynamically stable fold at extremely high temperatures. Thus, the rapid identification of novel TPPs from protein sequences using computational models is highly desirable. Over the last few decades, a number of computational methods, especially machine learning (ML)-based methods, have been developed for in silico prediction of TPPs. It is therefore worthwhile to revisit these methods and summarize their advantages and disadvantages in order to guide the development of more accurate approaches for TPP prediction. With this goal in mind, we comprehensively investigate fourteen state-of-the-art TPP predictors in terms of their dataset size, feature encoding schemes, feature selection strategies, ML algorithms, evaluation strategies and web server/software usability. To the best of our knowledge, this article represents the first comprehensive review on the development of ML-based methods for in silico prediction of TPPs. These TPP predictors can be classified into two groups according to the interpretability of the ML algorithms employed (i.e., computational black-box methods and computational white-box methods). We also conducted a comparative study of several currently available TPP predictors on two benchmark datasets. Finally, we provide future perspectives for the design and development of new computational models for TPP prediction. We hope that this comprehensive review will help researchers select the TPP predictor most suitable for their purposes and provide useful perspectives for the development of more effective and accurate TPP predictors.
format	Online Article Text
id	pubmed-9150013
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Leibniz Research Centre for Working Environment and Human Factors
record_format	MEDLINE/PubMed
spelling	pubmed-9150013 2022-05-31 EXCLI J Review Article Leibniz Research Centre for Working Environment and Human Factors 2022-03-02 /pmc/articles/PMC9150013/ /pubmed/35651661 http://dx.doi.org/10.17179/excli2022-4723 Text en Copyright © 2022 Charoenkwan et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (https://creativecommons.org/licenses/by/4.0/). You are free to copy, distribute and transmit the work, provided the original author and source are credited.
title	Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins
topic	Review Article
