
iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy

Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. The identification method of thermophilic proteins based on bioch...

Bibliographic Details
Main Authors: Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Microbiology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8902591/
https://www.ncbi.nlm.nih.gov/pubmed/35273581
http://dx.doi.org/10.3389/fmicb.2022.790063
author Ahmed, Zahoor
Zulfiqar, Hasan
Khan, Abdullah Aman
Gul, Ijaz
Dao, Fu-Ying
Zhang, Zhao-Yue
Yu, Xiao-Long
Tang, Lixia
collection PubMed
description Thermophilic proteins have important application value in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for the application of these proteins in engineering. Biochemical methods for identifying thermophilic proteins are laborious, time-consuming, and costly. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. To address this need, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On an independent dataset, the proposed model achieved an accuracy of 96.26%, which suggests that the model has good prospects for practical application. To make the model convenient to use, a user-friendly software package called iThermo was developed and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicality of the developed software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins.
format Online
Article
Text
id pubmed-8902591
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8902591 2022-03-09 iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy Ahmed, Zahoor; Zulfiqar, Hasan; Khan, Abdullah Aman; Gul, Ijaz; Dao, Fu-Ying; Zhang, Zhao-Yue; Yu, Xiao-Long; Tang, Lixia Front Microbiol Microbiology Frontiers Media S.A. 2022-02-22 /pmc/articles/PMC8902591/ /pubmed/35273581 http://dx.doi.org/10.3389/fmicb.2022.790063 Text en Copyright © 2022 Ahmed, Zulfiqar, Khan, Gul, Dao, Zhang, Yu and Tang. https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title iThermo: A Sequence-Based Model for Identifying Thermophilic Proteins Using a Multi-Feature Fusion Strategy
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8902591/
https://www.ncbi.nlm.nih.gov/pubmed/35273581
http://dx.doi.org/10.3389/fmicb.2022.790063